Thanks for reporting this. This bug was introduced in v2.7.0. It's fixed now.
Besides, after checking the code, I think I can make it faster.
Use this, it's slightly faster.
Thanks! I have tested this improved version with the same dataset. It takes about 20 seconds, very fast.
@fsnibs10 Sorry, the previous changes introduced a bug, see #457. It occurred when more than 2 pairs of primers were given.
Hi developers,
Recently, I used seqkit amplicon to extract target sequences from a compressed FASTQ file by supplying primer sequences. I downloaded the latest version (v2.7.0) and a previous version (v2.3.0).
The download commands are shown below.
wget https://github.com/shenwei356/seqkit/releases/download/v2.3.0/seqkit_linux_amd64.tar.gz
wget https://github.com/shenwei356/seqkit/releases/download/v2.7.0/seqkit_linux_amd64.tar.gz
I found that the runtime of the amplicon module in the latest version (seqkit v2.7.0) is much longer than in seqkit v2.3.0. The sequencing data file is about 550Mb and contains 5671607 reads. With the same command on the same server, seqkit amplicon v2.3.0 runs very fast, taking 20 seconds, while v2.7.0 takes about 7 minutes. I don't know why. My command is shown below.
seqkit amplicon --threads 8 -F AAGAGTGGAG -R GTTCATCC -o sample.read1.fq read1.fq.gz
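For anyone who wants to reproduce the comparison, here is a rough sketch of how the two versions could be timed side by side. The helper name `elapsed_seconds` and the `./v2.3.0` and `./v2.7.0` directories are my own assumptions, not part of seqkit. Since both release tarballs share the same file name, the sketch assumes each one was unpacked into its own directory.

```shell
# elapsed_seconds CMD [ARGS...]
# Hypothetical helper: runs the given command, discards its stdout,
# and prints the wall-clock time it took in whole seconds.
elapsed_seconds() {
    es_start=$(date +%s)
    "$@" > /dev/null
    es_end=$(date +%s)
    echo $((es_end - es_start))
}

# Example usage (directory names are assumptions, adjust to your layout):
#   elapsed_seconds ./v2.3.0/seqkit amplicon --threads 8 -F AAGAGTGGAG -R GTTCATCC -o v2.3.0.read1.fq read1.fq.gz
#   elapsed_seconds ./v2.7.0/seqkit amplicon --threads 8 -F AAGAGTGGAG -R GTTCATCC -o v2.7.0.read1.fq read1.fq.gz
```

Running both lines against the same input on the same machine should give two numbers that can be compared directly. Whole-second resolution is coarse, but it is enough for a gap like 20 seconds versus several minutes.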