Open HenrikBengtsson opened 3 years ago
Hi Henrik, thanks for the heads-up.
Could you say a bit more what is the advantage of using ties.method = "average"
as the default and from now on forcing people to explicitly specify ties.method
? I often use the ranks
function without explicitly setting ties.method
, because I don't really care what happens with the ties or know that I don't have any. And only if I realize that the ties are important, I specifically choose theties.method
that matches my needs.
And I assume, that I will also need to duplicate the efforts to warn people about not specifying ties.method
to stay aligned with matrixStats
and avoid suddenly breaking code when you do make the switch in step 3, is that correct?
Could you say a bit more what is the advantage of using ties.method = "average" ...
Only to align with base::rank()
. The existing discrepancy in matrixStats may result in incorrect results due to assumptions about the default.
... from now on forcing people to explicitly specify ties.method?
Only until we get to Step 3, when ties.method = "average"
(= formals(base::rank)$ties.method
) becomes the new default. After that, we'll drop the requirement of having to specify ties.method
.
A more abrupt approach would be to just switch the default, but I'm too conservative for such a silent change.
And I assume, that I will also need to duplicate the efforts to warn people about not specifying ties.method to stay aligned with matrixStats and avoid suddenly breaking code when you do make the switch in step 3, is that correct?
Yes, I think so. You could keep your current default, if you'd like, or take the same deprecate-defunct-change_default approach as I do above.
Note that this is an extremely rare event, but I think it's worth fixing it, especially for future generations. FWIW, in hindsight, we should have made ties.method = "average"
the default when we started out many years ago and just have given an error "ties.method = "average' is not yet supported". Then we wouldn't have to do this complicated update process.
Clear as mud?
Background
matrixStats uses
ties.method = "max"
as the default forcolRanks()
androwRanks()
for legacy reasons, but we want eventually update toties.method = "average"
to align it withbase::rank()
, cf. https://github.com/HenrikBengtsson/matrixStats/issues/142.The process for this migration with be:
ties.method
is not explicitly specified (long time; several releases)ties.method
is not explicitly specified (long time; several releases)ties.method = "average"
This will have to take a long time in order to make sure end-users out there will notice this and update their code. I hope this will minimize the risk for existing code all of a sudden start producing different results.
Issue
sparseMatrixStats gives an ERROR when I revdep check asserting
!isTRUE(missing(ties.method))
, cf. https://github.com/HenrikBengtsson/matrixStats/blob/feature/default-rank-ties.method/revdep/R_MATRIXSTATS_TIES_METHOD_MISSING%3Ddefunct/problems.md#sparsematrixstats