Closed: arendsee closed this issue 7 years ago
Thank you @arendsee for catching this. Are you using the most recent commit? I am currently TAing a lab, but will look into this when it is over and should have a fix pushed by the end of the day.
I am using the most recent commit from the master branch (I installed from source)
I tested this commit on fasta and fastq files and it seems to be working correctly. You no longer need to indicate if you are using a fastq file and parsing should be ~10-20X faster for fastq files. We will be implementing a test suite soon to hopefully catch these kinds of errors before pushing them. Thanks again @arendsee for catching this error and taking the time to report it.
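For anyone curious how a parser can get by without being told the format, here is a minimal sketch in Python. It is not the FAST code (which is Perl); `detect_format` is a hypothetical name, and it assumes the format can be guessed from the first non-blank line, where FASTA headers start with `>` and FASTQ headers start with `@`.

```python
def detect_format(text):
    """Guess 'fasta' or 'fastq' from the first non-blank line of the input."""
    for line in text.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        if line.startswith(">"):
            return "fasta"
        if line.startswith("@"):
            return "fastq"
        break  # first non-blank line is neither kind of header
    raise ValueError("input does not look like FASTA or FASTQ")


# Toy inputs, made up for illustration
print(detect_format(">seq1\nACGT\n"))           # fasta
print(detect_format("@read1\nACGT\n+\nIIII\n"))  # fastq
```

Checking only the first record keeps the detection cheap, at the cost of not noticing a file that mixes the two formats.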
No problem. I have a little FASTA program of my own, smof, that is pretty similar to FAST, if you are interested in taking a look. I've just been drafting a comparison of the two (in the README).
smof looks like a nice set of utilities. We do have significant overlap in functionality, along with useful unique utilities. It is great that our tools are interoperable, so we can take advantage of the unique features in both.
Currently, our goal is to speed up the FAST utilities within the limits of Perl. We are replacing most of the BioPerl code, which has provided a 2X speed increase on FASTA files and a ~10-20X speed increase on FASTQ files.
That should make the speed comparable to smof. You could also add support for indexed FASTA files.
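To make the indexing suggestion concrete, here is a rough sketch in Python of one way an index could work. It is not how smof or FAST does it; `build_index` and `fetch` are hypothetical names. It assumes the index simply maps each sequence ID (the first word of the header) to the byte offset of its header line, so that one record can be read with a seek instead of a full scan.

```python
import io


def build_index(handle):
    """Map each sequence ID to the byte offset of its header line."""
    index = {}
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break  # end of file
        if line.startswith(b">"):
            fields = line[1:].split()
            if fields:  # ignore headers with no ID
                index[fields[0].decode()] = offset
    return index


def fetch(handle, index, seq_id):
    """Seek to one record and return its sequence as a single string."""
    handle.seek(index[seq_id])
    handle.readline()  # skip the header line
    parts = []
    for line in iter(handle.readline, b""):
        if line.startswith(b">"):
            break  # reached the next record
        parts.append(line.strip())
    return b"".join(parts).decode()


# Toy in-memory file, made up for illustration
data = io.BytesIO(b">a desc\nACGT\nTT\n>b\nGGCC\n")
idx = build_index(data)
print(idx)                    # {'a': 0, 'b': 16}
print(fetch(data, idx, "b"))  # GGCC
```

The handle is read in binary mode so that `tell()` and `seek()` work on plain byte offsets; a real tool would also write the index to disk so it can be reused between runs.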
Given the FASTA file:
I get the following result:
`faswc` appears to be considering only every other entry. Perhaps it is getting mixed up between FASTA and FASTQ format?
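To illustrate the kind of mix-up I mean (this is only a guess at the cause, not taken from the FAST source), here is a small Python sketch. With one sequence line per entry, a FASTA record is two lines, while a FASTQ record is four. A counter that assumes the four-line layout would report half the entries. Both function names and the input are made up for the example.

```python
import io


def count_fasta(handle):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in handle if line.startswith(">"))


def count_as_fastq(handle):
    """Count records assuming four lines per record (the FASTQ layout)."""
    lines = [line for line in handle if line.strip()]
    return len(lines) // 4


# Toy FASTA input with four entries, one sequence line each
fasta = ">a\nACGT\n>b\nGGCC\n>c\nTTAA\n>d\nCCGG\n"

print(count_fasta(io.StringIO(fasta)))     # 4
print(count_as_fastq(io.StringIO(fasta)))  # 2
```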