sourmash-bio / sourmash

Quickly search, compare, and analyze genomic and metagenomic data sets.
http://sourmash.readthedocs.io/en/latest/

`--from-file` vs pathlists - can we unify the way they work? #1665

Open ctb opened 3 years ago

ctb commented 3 years ago

We have a few different issues around adding --from-file to new commands, e.g. https://github.com/sourmash-bio/sourmash/issues/1631.

As I was implementing #1657 for a different reason, though, I realized that pathlists and --from-file seem completely redundant. If true, it's kind of embarrassing that I didn't realize it before :).

However, even if they end up doing the same thing, the user experience differs a lot between the two. `--from-file` appends every line of the specified file to the argument list, while a pathlist is loaded as a single MultiIndex database containing all of the different indexes.
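To make the difference concrete, here is a rough sketch of the two behaviors. It is not sourmash code: `expand_from_file` and `load_pathlist` are hypothetical names, and the dict standing in for a MultiIndex is only a placeholder.

```python
def _read_lines(filename):
    """Return the non-empty, stripped lines of a text file."""
    with open(filename) as fp:
        return [line.strip() for line in fp if line.strip()]


def expand_from_file(args, from_file=None):
    """--from-file style (hypothetical helper): each listed path becomes
    its own entry on the argument list, after the explicit arguments."""
    expanded = list(args)
    if from_file:
        expanded.extend(_read_lines(from_file))
    return expanded


def load_pathlist(pathlist):
    """pathlist style (hypothetical helper): the whole file is loaded as
    ONE combined object. The dict is a stand-in for a MultiIndex."""
    return {"source": pathlist, "members": _read_lines(pathlist)}
```

With a file listing `a.sig` and `b.sig`, the first call hands the command two separate inputs, while the second hands it one database that happens to contain two members.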

I'm wondering if there's a way to modify `_load_database` to provide the same UX. Maybe we could modify the loading functions to yield multiple databases rather than just one, and have the pathlist loader yield multiple databases?
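One possible shape for that idea, as a sketch only: a generator-based loader that yields one database per pathlist entry. `iter_databases` is a hypothetical name, the `.txt` check is an assumed stand-in for however pathlists are actually detected, and the `db:` strings are placeholders for real database objects.

```python
def iter_databases(path):
    """Yield one database per input (hypothetical sketch).

    A pathlist yields one database for each path it lists; any other
    input yields exactly one database.
    """
    # Assumption: pathlists are recognized by a .txt suffix. The real
    # detection logic is not shown here and likely differs.
    if path.endswith(".txt"):
        with open(path) as fp:
            for line in fp:
                entry = line.strip()
                if entry:
                    yield f"db:{entry}"  # placeholder for a loaded database
    else:
        yield f"db:{path}"
```

Callers would then iterate over the results the same way regardless of whether the paths came from the command line or from a file, which is the UX that `--from-file` gives today.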

ctb commented 2 years ago

related: https://github.com/sourmash-bio/sourmash/issues/1878 and https://github.com/sourmash-bio/sourmash/issues/1877

ctb commented 2 years ago

consider removing pathlists altogether per https://github.com/sourmash-bio/sourmash/issues/1414#issuecomment-1203801586