sourmash-bio / sourmash

Quickly search, compare, and analyze genomic and metagenomic data sets.
http://sourmash.readthedocs.io/en/latest/

rationalize column headers across CSV output files #1555

Open ctb opened 3 years ago

ctb commented 3 years ago

prefetch and gather use different CSV column headers for their output, and I'm sure search and other commands do too.

Would be good to fix this in v5.

ctb commented 2 years ago

might want to include duplicate columns for backwards compat... everything in genome-grist and charcoal will break if we swizzle the names 😆

specific example: gather output has name, prefetch output has match_name. Grr.
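
A minimal sketch of the duplicate-column idea, using only the standard library. The alias table and helper names are hypothetical and not part of sourmash; only the `name` / `match_name` pair comes from the example above, and `intersect_bp` stands in for any other column.

```python
import csv
import io

# Hypothetical alias table: existing header -> extra header emitted alongside it.
# Only the name/match_name pair is taken from this thread.
ALIASES = {"name": "match_name"}


def add_alias_columns(row, aliases=ALIASES):
    """Return a copy of `row` with each aliased column duplicated under its second name."""
    out = dict(row)
    for existing, alias in aliases.items():
        if existing in out and alias not in out:
            out[alias] = out[existing]
    return out


def write_rows(rows, fp):
    """Write rows as CSV, keeping the original columns and appending the aliases."""
    rows = [add_alias_columns(r) for r in rows]
    if not rows:
        return
    writer = csv.DictWriter(fp, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


buf = io.StringIO()
write_rows([{"name": "genomeA", "intersect_bp": "1000"}], buf)
header = buf.getvalue().splitlines()[0]
```

Downstream tools that read `name` keep working, while new code can read `match_name` from the same file.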

ctb commented 2 years ago

I was thinking that a ~clean way to do this would be to add two options: one to use the current column names (the default in v4.x) and one to use the new column names. Then swizzle the defaults for v5 and deprecate the current column names there, and remove the old column names in v6.
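
Roughly, the opt-in switch could look like the sketch below. This is an illustration only: the `use_new_names` flag, the mapping, and the function names are all assumptions, and which of `name` / `match_name` ends up as the "new" header has not been decided here.

```python
import csv
import io

# Hypothetical mapping from current (legacy) headers to new headers.
LEGACY_TO_NEW = {"name": "match_name"}


def output_fieldnames(legacy_fieldnames, use_new_names=False):
    """Pick the header row; the default keeps the current names (the v4.x behavior)."""
    if not use_new_names:
        return list(legacy_fieldnames)
    return [LEGACY_TO_NEW.get(f, f) for f in legacy_fieldnames]


def write_csv(rows, legacy_fieldnames, fp, use_new_names=False):
    """Write rows keyed by legacy names under whichever header set was requested."""
    writer = csv.writer(fp)
    writer.writerow(output_fieldnames(legacy_fieldnames, use_new_names))
    for row in rows:
        writer.writerow([row[f] for f in legacy_fieldnames])


rows = [{"name": "genomeA", "intersect_bp": 1000}]
fields = ["name", "intersect_bp"]

old_style = io.StringIO()
write_csv(rows, fields, old_style)

new_style = io.StringIO()
write_csv(rows, fields, new_style, use_new_names=True)
```

Under this sketch, the v5 change would amount to flipping the default of `use_new_names` to `True` and warning when the old names are requested, and v6 would drop the legacy branch.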

ctb commented 2 years ago

starting to work on this: https://hackmd.io/EdrdrkByQ7a2GuyiN5oyGQ and https://github.com/sourmash-bio/sourmash/pull/2274

ctb commented 1 year ago

note, docs requested for ANI columns 😓 https://github.com/sourmash-bio/sourmash/issues/2367#issuecomment-1319142438

bluegenes commented 7 months ago

note that we should now also update these in the rust layer, e.g. https://github.com/sourmash-bio/sourmash/blob/latest/src/core/src/index/mod.rs#L57-L64