sourmash-bio / sourmash

Quickly search, compare, and analyze genomic and metagenomic data sets.
http://sourmash.readthedocs.io/en/latest/

update Storage class to provide special methods for manifest loading and signature listing #1757

Open ctb opened 3 years ago

ctb commented 3 years ago

per #1598,

@ctb says:

In ZipFileLinearIndex, we have to break Storage encapsulation and go directly to the zipfile to get a list of all files when there is no manifest. (Listing all files is not supported by the current Storage interface.) There are two options for fixing this:

  1. provide a way to list all (relevant) files in a Storage. It's not clear to me that this is a good idea, because a Storage may contain many files and many signatures.
  2. require manifests when using zipfile collections. This seems fine to me, but would (I think) require a 5.0 release.

Ultimately I think we will probably want to implement both, but not today :)
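
To make the two options concrete, here is a rough sketch of what they might look like on a zip-backed storage. Everything in it is an assumption for illustration: `ZipStorageSketch`, its method names, and the `require_manifest` flag are hypothetical and not the existing Storage API, and the manifest filename and the `internal_location` column are assumed rather than taken from this issue.

```python
import csv
import io
import zipfile

# Assumed manifest location inside the zip; illustrative only.
MANIFEST_NAME = "SOURMASH-MANIFEST.csv"


class ZipStorageSketch:
    """Hypothetical zip-backed storage; not sourmash's actual Storage class."""

    def __init__(self, path_or_file):
        self._zf = zipfile.ZipFile(path_or_file, "r")

    def load(self, name):
        """Return the raw bytes stored under `name`."""
        return self._zf.read(name)

    def list_signature_files(self, suffixes=(".sig", ".sig.gz")):
        """Option 1: yield the names of (relevant) files one at a time."""
        for info in self._zf.infolist():
            if not info.is_dir() and info.filename.endswith(suffixes):
                yield info.filename

    def load_manifest_rows(self):
        """Return manifest rows as dicts, or None if there is no manifest."""
        try:
            raw = self._zf.read(MANIFEST_NAME)
        except KeyError:  # ZipFile.read raises KeyError for a missing member
            return None
        # Skip '#' comment lines (assumed version header) before parsing CSV.
        lines = [
            line
            for line in raw.decode("utf-8").splitlines()
            if not line.startswith("#")
        ]
        return list(csv.DictReader(lines))

    def signature_locations(self, require_manifest=False):
        """Prefer the manifest; fall back to listing unless it is required."""
        rows = self.load_manifest_rows()
        if rows is not None:
            return [row["internal_location"] for row in rows]
        if require_manifest:
            # Option 2: refuse zipfile collections that lack a manifest.
            raise ValueError("zipfile collection has no manifest")
        return list(self.list_signature_files())
```

With something along these lines, ZipFileLinearIndex could call `signature_locations()` instead of reaching into the zipfile itself, and flipping `require_manifest` to `True` would correspond to the stricter behavior in option 2.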

See also the discussion in https://github.com/sourmash-bio/sourmash/issues/1441, and the issue to require manifests in zipfiles, https://github.com/sourmash-bio/sourmash/issues/1755.

ctb commented 3 years ago

Also see the notebook in https://github.com/sourmash-bio/sourmash/pull/1758, and the comments therein.