In particular, note that `arrange(df, x)` sorts `x` using the C locale if it is a character vector, but `arrange(df, -desc(x))` (i.e. negating the `desc()` call, which in theory gives back the original order) sorts `x` using the user's locale.
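A minimal sketch of the discrepancy, assuming dplyr 1.1.0 or later with the default `.locale` setting. The data frame and its values are made up for illustration, and whether the two results actually differ depends on the session's `LC_COLLATE` setting:

```r
library(dplyr)

# Hypothetical data: mixed case makes C-locale and locale-aware orderings differ.
df <- data.frame(x = c("a", "B", "c"))

# Bare column: sorted in the C locale, where uppercase sorts before lowercase.
by_col <- arrange(df, x)$x

# The `-` forces desc() to be evaluated, so `x` is ranked by xtfrm(),
# which collates using the session's locale.
by_expr <- arrange(df, -desc(x))$x

by_col
by_expr
```

In a C-locale session both results are the same. In a locale that orders letters case-insensitively, such as many `en_US` setups, they will likely differ.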
Normally a call like `desc(x)` is recognized and we never actually call `desc()` under the hood; we translate it to a `"desc"` value for the `directions` argument of `vec_order_radix()`. In this case, though, the `-` interferes and we actually evaluate the call.
That ends up calling `desc()`, which does `-xtfrm(x)`, and `xtfrm()` ends up using `base::order(x)`, which uses the user's locale.
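The locale dependence can be seen in base R alone. This is a rough illustration with a made-up vector: it ranks the same values once under C collation and once under whatever collation the session started with.

```r
x <- c("a", "B", "c")

# Temporarily switch collation to C, rank, then restore the original setting.
old_collate <- Sys.getlocale("LC_COLLATE")
Sys.setlocale("LC_COLLATE", "C")
rank_c <- xtfrm(x)        # C collation: "B" < "a" < "c"
Sys.setlocale("LC_COLLATE", old_collate)

rank_session <- xtfrm(x)  # ranks under the session's own collation

rank_c
rank_session
```

If the session already runs in the C locale, the two rank vectors are identical; otherwise they may disagree, which is the inconsistency that `-desc(x)` exposes.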
I don't think we should remove the use of `xtfrm()` in `desc()`, since it is a generic that people have probably written S3 methods for. But maybe we can add special behavior for unclassed character vectors, where it uses `vec_rank()` instead (which uses the C locale)? It would not be a perfect fix, but it may be good enough.
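One possible shape for that special case, as a sketch only. `desc_sketch()` is a hypothetical name, not dplyr's implementation, and it assumes a vctrs version that provides `vec_rank()`. The `ties` and `incomplete` arguments are chosen here to mirror min-rank ties and `NA`-preserving behavior:

```r
# Hypothetical variant of desc(): unclassed character vectors are ranked
# with vctrs::vec_rank() (C locale); everything else still goes through
# xtfrm(), so existing S3 methods keep working.
desc_sketch <- function(x) {
  if (is.character(x) && !is.object(x)) {
    # Min-rank ties; missing values stay NA rather than receiving a rank.
    -vctrs::vec_rank(x, ties = "min", incomplete = "na")
  } else {
    -xtfrm(x)
  }
}

desc_sketch(c("a", "B", "c"))
desc_sketch(c(1, 3, 2))
```

Classed character vectors would still take the `xtfrm()` path and the user's locale, which is one reason this is only a partial fix.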
See https://github.com/tidyverse/dplyr/issues/7044