Bioconductor / MatrixGenerics

S4 Generic Summary Statistic Functions that Operate on Matrix-Like Objects
https://bioconductor.org/packages/MatrixGenerics
12 stars 1 forks source link

Heads up: New `useNames` argument coming #23

Closed HenrikBengtsson closed 3 years ago

HenrikBengtsson commented 3 years ago

Hi. We're gonna introduce a new argument useNames=NA/FALSE/TRUE in several matrixStats functions. We just added a 100%-backward compatible useNames=NA placeholder and rerun 347 revdep checks.

It turns out MatrixGenerics has package tests (e.g. https://github.com/Bioconductor/MatrixGenerics/blob/9a5b3671e67dc2164666e7266d096004b9d78d1d/tests/testthat/test-api_compatibility.R#L1240-L1242) that trigger errors on this. We get https://github.com/HenrikBengtsson/matrixStats/blob/feature/useNames%3DNA/revdep/problems.md#matrixgenerics, e.g.

 ERROR
Running the tests in ‘tests/testthat.R’ failed.
Last 50 lines of output:
  Length mismatch: comparison on first 6 components
  ── Failure (test-api_compatibility.R:1242:2): rowProds works  ──────────────────
  `matrixStats_formals` not identical to `MatrixGenerics_default_method_formals`.
  Length mismatch: comparison on first 6 components

Not sure how you want to deal with this. I guess you could just wait until matrixStats is updated on CRAN and fix it in Bioc release and devel when it breaks, or update these tests to be agile to the new useNames argument. FWIW, the plan is to roll a version with useNames=NA (100% backward compatible), and then further down the road make the new default useNames=TRUE.

PeteHaitch commented 3 years ago

Thanks for the heads up, Henrik. Later in the week I'll have a think about how to handle the migration in BioC release (BioC devel can either use the same approach or just be broken for a day or 2 around the window matrixStats is updated on CRAN)

HenrikBengtsson commented 3 years ago

UPDATE: matrixStats 0.60.0 is now on CRAN with a new useNames=NA argument for many function.

PeteHaitch commented 3 years ago

Thanks, @HenrikBengtsson. I was on leave last week but I have now addressed this in the devel and release branches by bumping the minimum matrixStats version to v0.60.0 and making the necessary changes to the generics and methods.

Could you please clarify why matrixStats::colQuantiles()/matrixStats::rowQuantiles() have useNames = TRUE while all other functions that had useNames added have useNames = NA?

HenrikBengtsson commented 3 years ago

Could you please clarify why matrixStats::colQuantiles()/matrixStats::rowQuantiles() have useNames = TRUE while all other functions that had useNames added have useNames = NA?

That's a mistake; it was meant to be useNames = NA like everything else. We're working on implementing useNames support in native code - in the current release it was implemented using R. This mistake will be fixed in the next release (https://github.com/HenrikBengtsson/matrixStats/issues/208).

PeteHaitch commented 3 years ago

No worries, thanks for the clarification. I think I'll keep the mistake and change it when the corresponding update is made to matrixStats.

I need to familiarise myself with the work you are doing regarding useNames - are there particular branches I should be following?

const-ae commented 3 years ago

Hi Pete,

was it a deliberate choice to fix useNames = NA when MatrixGenerics passes on the call to matrixStats instead of useNames = useNames?

https://github.com/Bioconductor/MatrixGenerics/blob/16ebd0e0197d4985f98a449563e212ee9bfbc17f/R/rowSums2.R#L60-L62

PeteHaitch commented 3 years ago

Hi @const-ae,

No it wasn't a deliberate choice, I think I just did it in haste :/ Perhaps I should change that. Is that what you think?

PeteHaitch commented 3 years ago

I've also now spotted a few places where in my haste I didn't pass on useNames at all, e.g., https://github.com/Bioconductor/MatrixGenerics/blob/2953ce705476b5135cda21c4616acab7de6c3edc/R/rowAnys.R#L40-L42

const-ae commented 3 years ago

Yes, I think that should be changed. But I don't think it's urgent, I just noticed it while trying to add the useNames parameter to sparseMatrixStats.

PeteHaitch commented 3 years ago

I'm going through all these .matrixStats_foo() functions to check that useNames is properly passed on.

const-ae commented 3 years ago

Yeah, maybe this could actually be a good motivation to add some unit tests that automatically compare if the signature of the .matrixStats_[XXX] functions are compatible with the generics.

If I get the sparseMatrixStats stuff done quickly, I could give this a stab :)

PeteHaitch commented 3 years ago

That's a good idea. I don't have time for it right now but I'll push the code changes within the hour.

PeteHaitch commented 3 years ago

Although it's not actually the signature of the .matrixStats_[XXX] functions but the internals where the problem was ...

PeteHaitch commented 3 years ago

Double checked and fixed these in devel and release. Should be available in a day or two. Thanks again, @const-ae!

const-ae commented 3 years ago

FYI, in https://github.com/const-ae/sparseMatrixStats/commit/eaab67410f5c630ea9cec8900c3f62287a0d630d I added support for useNames to sparseMatrixStats