ksahlin / strobealign

Aligns short reads using dynamic seed size with strobemers
MIT License

command line syntax #442

Closed jtamames closed 5 days ago

jtamames commented 5 days ago

Hello,

This is likely a silly question, but I am having problems with the command to run the program as you put it:

strobealign -t 60 01.H1.fasta H1.current_1.gz H1.current_2.gz | samtools sort -o sorted.bam

The result is:

Usage: samtools sort [options] <in.bam> <out.prefix>

Could you help me fix the syntax? Thanks a lot! Best, J

marcelm commented 5 days ago

You’re probably using a (very) old samtools version. I strongly recommend that you update, but in the meantime, I think you need to use both samtools view and samtools sort like this:

strobealign -t 60 01.H1.fasta H1.current_1.gz H1.current_2.gz | samtools view -Sb - | samtools sort -f - sorted.bam

Note that the current release of strobealign will likely not be able to use all 60 provided threads; this has only recently been fixed in the development version.

jtamames commented 5 days ago

Thanks for the answer! I thought it was using the samtools from your Conda distribution, but it is using the one on my system instead, which is probably rather old indeed. Best, J

marcelm commented 5 days ago

Good point. Samtools isn’t listed as a dependency of the strobealign Conda package, so just running conda install strobealign will not install it. I’ll update the installation instructions.
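For anyone unsure which samtools their shell actually picks up, here is a minimal sketch of a check. The helper name samtools_is_modern is made up for illustration, and the sketch assumes that samtools 1.x prints "samtools <version>" as the first line of `samtools --version`, while very old 0.1.x builds do not understand that option at all.

```shell
#!/bin/sh
# Sketch: report which samtools binary the shell resolves and whether it looks modern.

# Hypothetical helper: succeeds if the major version in "$1" (e.g. "1.17") is >= 1.
samtools_is_modern() {
    major=${1%%.*}
    case $major in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: treat as old/unknown
    esac
    [ "$major" -ge 1 ]
}

if path=$(command -v samtools); then
    echo "samtools resolves to: $path"
    # Assumption: 0.1.x builds reject --version, so the string stays empty
    # and is treated as old.
    version=$(samtools --version 2>/dev/null | awk 'NR==1 {print $2}')
    if samtools_is_modern "$version"; then
        echo "version $version: 'samtools sort -o sorted.bam' should work"
    else
        echo "old or unknown version: consider upgrading samtools"
    fi
else
    echo "samtools not found on PATH"
fi
```

If the reported path lies outside the active Conda environment, the system copy is being used; installing samtools into the environment (it is available from the bioconda channel) is one way around that.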

marcelm commented 5 days ago

Note to whoever stumbles over this issue: I just noticed the command I suggested above won’t work. It’s better to upgrade.

jtamames commented 5 days ago

Thank you Marcel. Regarding this:

Note that the current release of strobealign will likely not be able to use all 60 provided threads; this has only recently been fixed in the development version.

What is the actual limit on the number of threads that strobealign can use? Best, J

marcelm commented 5 days ago

Quoting from the changelog of the unreleased version:

#269, #418: Strobealign scales now much better to systems with many cores. Previously, decompressing gzipped-compressed input files was a bottleneck starting at about 30 threads. We now use ISA-L for decompression, which is about three times as fast as zlib, and decompression is also done in a separate thread. We tested up to 128 cores, and strobealign was still able to use all cores. Contributed by @telmin.
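Until that version is released, one option on a shared machine is to avoid requesting more threads than the current release is likely to use with gzipped input. The following is only a sketch: cap_threads is a hypothetical helper, and the default of 30 is the approximate figure from the changelog entry above, not a documented hard limit.

```shell
#!/bin/sh
# Hypothetical helper: print the requested thread count, capped at a limit.
# The default limit of 30 is an assumption based on the changelog's
# "bottleneck starting at about 30 threads".
cap_threads() {
    requested=$1
    limit=${2:-30}
    if [ "$requested" -gt "$limit" ]; then
        echo "$limit"
    else
        echo "$requested"
    fi
}

threads=$(cap_threads 60)
# Only prints the option that would be passed; it does not run strobealign.
echo "would run strobealign with: -t $threads"
```

With the development version, which was tested on up to 128 cores, such a cap should not be needed.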

@ksahlin What do you think about making a new release?

ksahlin commented 5 days ago

A new release sounds good. Thanks Marcel!