sourmash-bio / sourmash_plugin_branchwater

fast, multithreaded sourmash operations: search, compare, and gather.
GNU Affero General Public License v3.0

support standalone manifests containing zip files #266

Open ctb opened 4 months ago

ctb commented 4 months ago

Build a standalone manifest that points at zip files, using `sourmash sig collect`:

sourmash sig collect podar-ref/1.fa.sig.zip -o manifest-with-zipfiles.csv -F csv

Then try to run `manysearch` with that manifest as both query and database:

sourmash scripts manysearch manifest-with-zipfiles.csv manifest-with-zipfiles.csv -o xxx.csv

The query record fails to load, and the command panics:

Reading query(s) from: 'manifest-with-zipfiles.csv'
Loaded 1 query signature(s)
thread '<unnamed>' panicked at src/manysearch.rs:32:90:
called `Result::unwrap()` on an `Err` value: Error: Failed to load query record: CP001941.1 Aciduliprofundum boonei T469, complete genome
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "/Users/t/miniforge3/envs/py311/bin/sourmash", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/t/dev/sourmash/src/sourmash/__main__.py", line 20, in main
    retval = mainmethod(args)
             ^^^^^^^^^^^^^^^^
  File "/Users/t/dev/pyo3_branchwater/src/python/sourmash_plugin_branchwater/__init__.py", line 71, in main
    status = sourmash_plugin_branchwater.do_manysearch(args.query_paths,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: Error: Failed to load query record: CP001941.1 Aciduliprofundum boonei T469, complete genome
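
To see which manifest rows would hit this failure, it can help to list the entries whose location is a zip file. The following is a minimal Python sketch, assuming the manifest CSV begins with a `#` version comment line followed by a header row that includes an `internal_location` column. `zip_locations` is a hypothetical helper for illustration, not part of sourmash or the plugin.

```python
import csv


def zip_locations(lines):
    """Return the internal_location values that end in '.zip'.

    `lines` is any iterable of text lines from a manifest CSV.
    Assumption: comment lines start with '#' and the header row
    contains an 'internal_location' column.
    """
    # Drop comment lines (e.g. the manifest version line) before parsing.
    rows = (line for line in lines if not line.startswith("#"))
    reader = csv.DictReader(rows)
    return [
        row["internal_location"]
        for row in reader
        if row["internal_location"].endswith(".zip")
    ]
```

Passing an open file handle for `manifest-with-zipfiles.csv` to `zip_locations` should return the zip paths recorded by `sig collect`.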

Also note that files listed in a manifest should be loaded as paths relative to the directory containing the manifest, not relative to the current working directory, per https://github.com/sourmash-bio/sourmash/pull/3054 and https://github.com/sourmash-bio/sourmash/issues/3008#issuecomment-1975174211.
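
The relative-path rule described above can be sketched as follows. This is only an illustration in Python, not the plugin's Rust code. `resolve_internal_location` is a hypothetical name, and leaving absolute paths unchanged is an assumption rather than something stated in the linked discussions.

```python
from pathlib import Path


def resolve_internal_location(manifest_path, internal_location):
    """Resolve a manifest entry's location against the manifest's directory.

    Hypothetical helper: a relative entry is joined to the directory that
    contains the manifest; an absolute entry is returned unchanged
    (an assumption made for this sketch).
    """
    location = Path(internal_location)
    if location.is_absolute():
        return location
    return Path(manifest_path).parent / location
```

For example, with a manifest at `db/manifest.csv`, an entry `podar-ref/1.fa.sig.zip` would resolve to `db/podar-ref/1.fa.sig.zip`, regardless of where the command is run from.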

Originally noted in https://github.com/sourmash-bio/sourmash_plugin_branchwater/issues/237.

ctb commented 3 weeks ago

Note: #364 temporarily removes the text from the docs that suggests using manifest CSVs.