sourmash-bio / sourmash

Quickly search, compare, and analyze genomic and metagenomic data sets.
http://sourmash.readthedocs.io/en/latest/

example: using GNU parallel to sketch signatures in parallel. #1796

Open ctb opened 2 years ago

ctb commented 2 years ago

The following uses GNU parallel to compute sketches in parallel, 8 jobs at a time.

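# make sure the output directory exists before any job writes to it
mkdir -p demo
# print one sourmash command per FASTA file; parallel runs up to 8 at once
for i in *.fa
do
   echo sourmash sketch dna "$i" -o "demo/$i.sig"
done | parallel -j 8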

Note: GNU parallel can be installed with conda 🎉 (for example, conda install -c conda-forge parallel).

ref https://github.com/sourmash-bio/sourmash/issues/638.
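When a large batch is interrupted, it can help to re-run only the inputs that have no signature yet. The helper below is a minimal sketch of that idea, assuming the same layout as the loop above (inputs matching *.fa in the current directory, outputs named demo/<input>.sig). The function name pending_fasta is hypothetical and not part of sourmash or GNU parallel.

```shell
#!/bin/sh
# pending_fasta: hypothetical helper, not part of sourmash or GNU parallel.
# Prints every *.fa file in the current directory that does not yet have
# a matching <outdir>/<file>.sig, one name per line.
pending_fasta() {
    outdir=${1:-demo}
    for f in *.fa; do
        # if the glob matched nothing, "$f" is the literal pattern: skip it
        [ -e "$f" ] || continue
        # only report inputs whose signature file is missing
        [ -e "$outdir/$f.sig" ] || printf '%s\n' "$f"
    done
}

# Usage (needs GNU parallel and sourmash on PATH):
#   mkdir -p demo
#   pending_fasta demo | parallel -j 8 sourmash sketch dna {} -o demo/{}.sig
```

With no ::: arguments, parallel reads one input per line from standard input, so the pipeline sketches only the missing files. The check looks at file existence only, so a signature left incomplete by an interrupted job may need to be deleted by hand before re-running.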

hyphaltip commented 2 years ago

also in one line:

mkdir -p demo
parallel -j 8 sourmash sketch dna {} -o demo/{}.sig ::: *.fa

ctb commented 1 year ago

we now have two plugins that can do this too:

https://github.com/sourmash-bio/sourmash_plugin_sketchall (written in Python)

https://github.com/sourmash-bio/pyo3_branchwater (Rust-based; see its manysketch command)