const-ae / sparseMatrixStats

Implementation of the matrixStats API for sparse matrices
Other
54 stars 3 forks source link

colProds and rowProds should accept method= argument #7

Open LTLA opened 4 years ago

LTLA commented 4 years ago

Even if it doesn't do anything with them. Just add a ... to both method definitions so that they don't throw when someone passes in method= in a generic.

LTLA commented 4 years ago

Same for type= for rowQuantiles and colQuantiles, though in this case, it seems like it would be best to respond to that choice; you might just copy the loop over rows/columns in matrixStats.

LTLA commented 4 years ago

Similarly, colRanks has preserve.shape= instead of preserveShape=.

Why not put these arguments in the generic so that they are guaranteed to be consistent across methods?

const-ae commented 4 years ago

I added the ... to [col|row]Prods() and also fixed the preserveShape argument name in colRanks.

Why not put these arguments in the generic so that they are guaranteed to be consistent across methods?

matrixStats supports some method arguments

that seemed like low-level optimization details specific for matrixStats.

I am aware that this can be surprising, because it means that sparseMatrixStats is not a perfect drop-in replacement.

If I had to rank: I am fairly certain that dim. is something specific for matrixStats and should not appear in the generic. I think the case for method is similar. I am less sure about preserveShape and the more I think about drop, I actually believe this should be in the generic.

Maybe we should rather discuss this in the MatrixGenerics repo, so that Pete can chime in as well :)

Same for type= for rowQuantiles and colQuantiles, though in this case, it seems like it would be best to respond to that choice; you might just copy the loop over rows/columns in matrixStats

I am undecided what to do about this one, because currently I am relying on my own quantile_sparse function. I could of course expand every column to a dense vector and call the quantile(type=...) function, however I would really like to avoid this. The alternative, is to actually understand how the type arguments differ and add that functionality to the quantile_sparse() function. Maybe for the mean time, I should just stop() if type != 7.

const-ae commented 4 years ago

Okay, I decided that I am fine for now with densifying each column if colQuantiles() is called with a non-default type argument.

I opened a PR at https://github.com/Bioconductor/MatrixGenerics/pull/14 to include drop and type in the signature of the generic method.

const-ae commented 4 years ago

I also realized why preserveShape cannot be in the generic signature of colRanks:

If the generic would look like this:

setGeneric("colRanks", function(x, rows = NULL, cols = NULL, ties.method = c("max", "average"), preserveShape = FALSE ...) standardGeneric("colRanks"),
           signature = "x"
)

The code fails because the .default_colRanks function here would be called with dim. = preserveShape.

If the preserveShape argument is moved behind the ..., I get a mismatch between the matrixStats and the MatrixGenerics method signature.

However, I could also be missing something here, so if there is a possibility to include preserveShape without including dim., I would be up to changing the method signature :)

LTLA commented 4 years ago

An expedient solution might be to swap the order of preserveShape and dim. in .default_colRanks.

const-ae commented 4 years ago

Good point, but I worry that this would be even more surprising.

LTLA commented 4 years ago

Seems like the choice is between that of developer-level surprise, where we have to make sure that the call inside .default_colRanks swaps the arguments; or user-level surprise, where people might start wondering where preserveShape= went and if it is supported in all methods for this generic. Given that preserveShape= actually affects the shape of the output, avoiding the latter appears to be more important.