Open osevill opened 1 year ago
An option is to rename the columns if you know the column position, then sort by the name you choose.
See https://miller.readthedocs.io/en/latest/csv-with-and-without-headers/
cat unknown_col.csv
abc, xxx_like, yyy_unlike
10, 1, z
11, 2, y
12, 3, x
Processing
tail -n +2 unknown_col.csv | mlr --csv --implicit-csv-header label a,xxx,yyy then sort -f yyy,xxx
a,xxx,yyy
12, 3, x
11, 2, y
10, 1, z
Thanks for adding this in v6.11!
Test file: reorder_regex_test_2.csv
I'm testing regex support for the reorder verb, and noticing unexpected behavior.
For the attached file, why does this give the expected results:
mlr --c2p reorder -f 'aaa_aaa','ccc_aaa','bbb_aaa' ./reorder_regex_test_2.csv
but this doesn't: (changing the -f to -r)
mlr --c2p reorder -r 'aaa_aaa','ccc_aaa','bbb_aaa' ./reorder_regex_test_2.csv
With the second command, the column order of the results is 'bbb_aaa', 'aaa_aaa', 'ccc_aaa'.
I tried this first and also had unexpected results:
mlr --c2p reorder -r '^aaa.*$','^ccc.*$','^bbb.*$' ./reorder_regex_test_2.csv
This gave results in a similar column order, i.e. the fields matching '^bbb.*$', '^aaa.*$', '^ccc.*$', in that order.
Providing just one regex expression seems to work fine however:
mlr --c2p reorder -r '^aaa.*$' ./reorder_regex_test_2.csv
Am I using incorrect syntax to combine the regex fields? Apologies if I'm missing something obvious.
Would it be possible to allow regex matching when reordering column headers of a csv file? The documentation describes reorder as requiring the specific field names, e.g., "i" and "b" in mlr --opprint reorder -f i,b data/small
My use case is that I don't necessarily know the exact field names, but I know that some will start with the prefix XXX and others with YYY. I would like to reorder them so that any fields (possibly none) starting with YYY come first, followed by any (possibly none) that start with XXX.
Thanks!
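For reference, the prefix-based ordering described above can be sketched outside Miller with plain awk. This is not how Miller's reorder verb works internally, just a minimal illustration of the requested behavior. The function name reorder_by_prefix is hypothetical, and the sketch assumes simple CSV: no quoted fields containing commas, and no spaces after the commas (so a header like " xxx_like" with a leading space would not match the prefix xxx).

```shell
#!/bin/sh
# Sketch only: reorder CSV columns by header prefix using awk.
# reorder_by_prefix is a hypothetical helper, not part of Miller.
# Assumes simple CSV (no quoted commas, no padding spaces).
reorder_by_prefix() {
  # $1 = comma-separated prefixes in the desired order, e.g. "yyy,xxx"
  # Reads CSV on stdin, writes the reordered CSV on stdout.
  awk -F, -v OFS=, -v prefixes="$1" '
    NR == 1 {
      np = split(prefixes, p, ",")
      n = 0
      # For each prefix in turn, collect the columns whose header starts with it.
      for (k = 1; k <= np; k++)
        for (i = 1; i <= NF; i++)
          if (!(i in used) && index($i, p[k]) == 1) { order[++n] = i; used[i] = 1 }
      # Append every column that matched no prefix, in original order.
      for (i = 1; i <= NF; i++)
        if (!(i in used)) order[++n] = i
    }
    {
      # Emit the header and each data row in the computed column order.
      line = $(order[1])
      for (j = 2; j <= n; j++) line = line OFS $(order[j])
      print line
    }
  '
}
```

Usage, with a cleaned-up version of the sample data from this thread:

```shell
printf 'abc,xxx_like,yyy_unlike\n10,1,z\n11,2,y\n' | reorder_by_prefix yyy,xxx
```

Columns matching zero prefixes simply keep their original relative order at the end, which covers the "any (or 0)" case.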