sourmash-bio / sourmash

Quickly search, compare, and analyze genomic and metagenomic data sets.
http://sourmash.readthedocs.io/en/latest/

Confusing diagnostic output with `--singleton` #1384

Open camillescott opened 3 years ago

camillescott commented 3 years ago

When running `sourmash sketch dna --singleton [input.fa]`, the output is somewhat confusing:

== This is sourmash version 4.0.0. ==
== Please cite Brown and Irber (2016), doi:10.21105/joss.00027. ==

computing signatures for files: /data/store/public/transcriptomes/Sac_pom_pombase_2017/sacPom.pombase.fa
Computing a total of 1 signature(s).
calculated 5138 signatures for 5138 sequences in /data/store/public/transcriptomes/Sac_pom_pombase_2017/sacPom.pombase.fa
saved signature(s) to sacPom.pombase.fa.sig. Note: signature license is CC0.

Namely, the "Computing a total of 1 signature(s)." line is confusing, since it is only followed by "calculated N signatures for ..." once the run completes, and the two counts disagree. This is a bug / confusion report by proxy :)
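
To make the mismatch concrete, here is a minimal sketch of how the two messages could be made consistent. It does not use sourmash's actual code: `startup_message` and `completion_message` are hypothetical helpers, and the `--singleton` wording is only one possible suggestion. The non-singleton wording and the completion line are copied from the output above.

```python
def startup_message(n_param_sets: int, singleton: bool) -> str:
    """Build the pre-run diagnostic line (hypothetical helper, not sourmash's API)."""
    if singleton:
        # With --singleton the final count depends on how many sequences the
        # input holds, which is not known before the file is read, so report
        # a per-sequence count instead of a total.
        return (f"Computing {n_param_sets} signature(s) per input sequence "
                "(--singleton).")
    return f"Computing a total of {n_param_sets} signature(s)."


def completion_message(n_sigs: int, n_seqs: int, filename: str) -> str:
    """Build the post-run line, matching the wording in the output above."""
    return f"calculated {n_sigs} signatures for {n_seqs} sequences in {filename}"


print(startup_message(1, singleton=True))
print(completion_message(5138, 5138, "input.fa"))
```

With wording along these lines, the first message no longer promises a total that the second message contradicts.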

ctb commented 3 years ago

Thanks Camille, I've been noticing problems too, with various edge-y cases. (Try feeding a blank FASTA file in and marvel at the output 😂.) I think I'll probably revisit this as part of the code re-org around removing `sourmash compute`: https://github.com/dib-lab/sourmash/issues/1286.