Open mr-eyes opened 1 year ago
parallel filtration of sourmash sigs by abundance
FROM_DIR=sigs
# Exported so the single-quoted command run by parallel can see it.
export TO_DIR=sigs_abund2
mkdir -p "${TO_DIR}"
ls ${FROM_DIR}/*sig | parallel -j 16 'sig={}; newsig=$(basename $sig .sig); sourmash sig filter -k 51 --min-abundance 2 $sig -o ${TO_DIR}/${newsig}.sig'
parallel downsampling and filtration of sourmash signatures on abundance (piping sourmash commands)
ls sigs/*sig | parallel -j 100 'sig={}; newsig=$(basename $sig .sig); sourmash signature downsample -q $sig --scaled 100000 -k 51 -o - | sourmash signature filter --min-abundance 2 - -o ${newsig}.sig'
Awesome, thanks! But... why such high -j values?? Surely with I/O they merely lead to more thrashing?
I am just showing examples of how to run these commands. However, when I once tried a high number of cores (128), it worked very well (for very small sigs).
Multithreaded renaming of all sourmash signatures in a directory to their file names.
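A minimal sketch of how such a rename could look, assuming `sourmash signature rename <sig> <name> -o <out>` and GNU parallel reading one full command per input line. The names `SIG_DIR`, `OUT_DIR`, `JOBS` and the helper `rename_cmd` are hypothetical and chosen only for this illustration; this is not necessarily the command used in the original post.

```shell
# Hypothetical settings; override them from the environment if needed.
SIG_DIR=${SIG_DIR:-sigs}
OUT_DIR=${OUT_DIR:-sigs_renamed}
JOBS=${JOBS:-16}

# Print (do not run) the rename command for one signature file.
# The new signature name is the file name without the .sig extension.
rename_cmd() {
    name=$(basename "$1" .sig)
    printf 'sourmash signature rename "%s" "%s" -o "%s"\n' \
        "$1" "$name" "$OUT_DIR/$name.sig"
}

# Run only when the tools and the input directory are actually present.
if command -v parallel >/dev/null 2>&1 &&
   command -v sourmash >/dev/null 2>&1 &&
   [ -d "$SIG_DIR" ]; then
    mkdir -p "$OUT_DIR"
    for sig in "$SIG_DIR"/*.sig; do
        [ -e "$sig" ] || continue      # skip the unmatched glob pattern
        rename_cmd "$sig"
    done | parallel -j "$JOBS"         # each input line is run as one job
fi
```

Calling `rename_cmd` on a single file prints the command that would be executed, which is a cheap way to check the generated names before launching all jobs. Writing to a separate `OUT_DIR` keeps the original signatures untouched.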