r-gregmisc / gtools

Functions to assist in R programming

mixedsort with e/E digit #7

Closed chunhungChou closed 3 years ago

chunhungChou commented 3 years ago

For character strings containing E or e, mixedsort treats the digits around the letter as a number in scientific notation, so the sorted result is not what I want. For example, with the input vector

> tmp1 <- c("AA1CD23-01A1", "AA1CD23-02G7", "AA1CD23-03G2", "AA1CD23-04F5", "AA1CD23-05F0",
+           "AA1CD23-06E3", "AA1CD23-07D6", "AA1CD23-08D1", "AA1CD23-09C4", "AA1CD23-10D1",
+           "AA1CD23-11C4", "AA1CD23-12B7", "AA1CD23-13B2", "AA1CD23-14A5", "AA1CD23-15A0",
+           "AA1CD23-16G6", "AA1CD23-17G1", "AA1CD23-18F4", "AA1CD23-19E7", "AA1CD23-20F4",
+           "AA1CD23-21E7", "AA1CD23-22E2", "AA1CD23-23D5", "AA1CD23-24D0", "AA1CD23-25C3")

mixedsort returns:

> mixedsort(tmp1)
 [1] "AA1CD23-21E7" "AA1CD23-19E7" "AA1CD23-06E3" "AA1CD23-22E2" "AA1CD23-25C3"
 [6] "AA1CD23-24D0" "AA1CD23-23D5" "AA1CD23-20F4" "AA1CD23-18F4" "AA1CD23-17G1"
[11] "AA1CD23-16G6" "AA1CD23-15A0" "AA1CD23-14A5" "AA1CD23-13B2" "AA1CD23-12B7"
[16] "AA1CD23-11C4" "AA1CD23-10D1" "AA1CD23-09C4" "AA1CD23-08D1" "AA1CD23-07D6"
[21] "AA1CD23-05F0" "AA1CD23-04F5" "AA1CD23-03G2" "AA1CD23-02G7" "AA1CD23-01A1"

Any suggestions for working around this?

warnes commented 3 years ago

I added the parameter scientific to mixedsort to control whether exponential notation is recognized for numeric values.

> ## Control scientific notation for number matching: 
> vals <- c("3E1","2E3", "4e0")
> 
> mixedsort(vals) # With scientific notation
[1] "4e0" "3E1" "2E3"
> mixedsort(vals, scientific=FALSE) # Without scientific notation
[1] "2E3" "3E1" "4e0"
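
For IDs shaped like the ones in the original report, a base R workaround that does not depend on mixedsort is to sort on the numeric field explicitly. The following is only a sketch under stated assumptions: the field of interest is the run of digits right after the last hyphen, and the variable names (ids, num, suffix, sorted_ids) are made up for illustration. It also shows why the E matters, since R itself reads digits-E-digits as a number.

```r
# A few IDs taken from the report above, deliberately out of order
ids <- c("AA1CD23-10D1", "AA1CD23-02G7", "AA1CD23-06E3", "AA1CD23-01A1")

# Why the E causes trouble: "06E3" is a valid number in scientific notation
as.numeric("06E3")   # 6000

# Assumption: the sort key is the run of digits right after the last hyphen
num <- as.integer(sub("^.*-([0-9]+).*$", "\\1", ids))

# Whatever follows those digits (e.g. "E3") is kept as plain text for tie-breaking
suffix <- sub("^.*-[0-9]+", "", ids)

# Order by the numeric field first, then by the text suffix
sorted_ids <- ids[order(num, suffix)]
print(sorted_ids)
```

Because the suffix is compared as text, an E or e in it is never interpreted as an exponent.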