lh3 / seqtk

Toolkit for processing sequences in FASTA/Q formats
MIT License

strange output by seqtk trimfq #83

Closed zengfengbo closed 8 years ago

zengfengbo commented 8 years ago

Given a FASTQ file:

$ less WS-044-HGF_R1.fastq_1.gz|head -8
@NB501129:42:HJTWNBGXY:1:11101:3854:1054 1:N:0:AGATCG
GTGCCNCACTTATACTGCAGCTGAGGGACCATGGAAAGCAGCCCACCCTGAGTTTTATACTCANGGGNGNNNNNNC
+
AAAAA#EEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEE#EEE#A######E
@NB501129:42:HJTWNBGXY:1:11101:7808:1055 1:N:0:AGATCG
GGTGGNTGCCACTACTGCCTGGCTAATTTTTGTATTTTTATTAGAGATGGGGTTTCACCATGCNGGCNANNNNNNT
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE<#AEE#A######E

I want to trim 0 bp from the FASTQ file.

$ less WS-044-HGF_R1.fastq_1.gz|head -8|seqtk trimfq -b 0  -
@NB501129:42:HJTWNBGXY:1:11101:3854:1054 1:N:0:AGATCG
CACTTATACTGCAGCTGAGGGACCATGGAAAGCAGCCCACCCTGAGTTTTATACTCA
+
EEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEE
@NB501129:42:HJTWNBGXY:1:11101:7808:1055 1:N:0:AGATCG
TGCCACTACTGCCTGGCTAATTTTTGTATTTTTATTAGAGATGGGGTTTCACCATGC
+
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE<

The tail containing the Ns was trimmed, along with the first six bases.

Adding the -q 0.99 option:

$ less WS-044-HGF_R1.fastq_1.gz|head -8|seqtk trimfq -b 0 -q 0.99 -
@NB501129:42:HJTWNBGXY:1:11101:3854:1054 1:N:0:AGATCG
GTGCCNCACTTATACTGCAGCTGAGGGACCATGGAAAGCAGCCCACCCTGAGTTTTATACTCANGGGNGNNNNNNC
+
AAAAA#EEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEE#EEE#A######E
@NB501129:42:HJTWNBGXY:1:11101:7808:1055 1:N:0:AGATCG
GGTGGNTGCCACTACTGCCTGGCTAATTTTTGTATTTTTATTAGAGATGGGGTTTCACCATGCNGGCNANNNNNNT
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE<#AEE#A######E

Normal: no bases were trimmed.

Adding the -q 0.09 option:

$ less WS-044-HGF_R1.fastq_1.gz|head -8|seqtk trimfq -b 0 -q 0.09 -
@NB501129:42:HJTWNBGXY:1:11101:3854:1054 1:N:0:AGATCG
GTGCCNCACTTATACTGCAGCTGAGGGACCATGGAAAGCAGCCCACCCTGAGTTTTATACTCA
+
AAAAA#EEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEE
@NB501129:42:HJTWNBGXY:1:11101:7808:1055 1:N:0:AGATCG
GGTGGNTGCCACTACTGCCTGGCTAATTTTTGTATTTTTATTAGAGATGGGGTTTCACCATGC
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE<

Some bases were trimmed from the tail.

lh3 commented 8 years ago

Options -b and -q don't work together. This is a limitation of trimfq. Sorry, no easy solutions.
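
For readers wondering why the three runs differ: the outputs above are consistent with -b 0 meaning "no fixed-length trimming", which leaves quality trimming active at the -q threshold (0.05 by default, according to the trimfq usage message). Below is a minimal Python sketch of that kind of Phred-style (Mott) trimming, not seqtk's actual code. The function name `phred_trim`, the Phred+33 offset, and the floor of Q3 for very low qualities are assumptions made for illustration. The minimum-length fallback (-l) is left out.

```python
def phred_trim(qual, threshold=0.05, offset=33, floor_q=3):
    """Return (beg, end) of the best-scoring stretch of a quality string.

    Each base scores threshold - P(error); the kept region is the
    contiguous segment with the highest total score.
    offset and floor_q are assumptions, not taken from this thread.
    """
    best = score = 0.0
    beg, end, start = 0, len(qual), 0
    for i, ch in enumerate(qual):
        q = max(ord(ch) - offset, floor_q)    # clamp very low qualities
        score += threshold - 10 ** (-q / 10)  # > 0 when error prob < threshold
        if score > best:                      # new best segment ends here
            best, beg, end = score, start, i + 1
        if score < 0:                         # restart after a bad stretch
            score, start = 0.0, i + 1
    return beg, end


# Quality string of the first read in this issue.
qual = ("AAAAA#EEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEE"
        "#EEE#A######E")

for t in (0.05, 0.09, 0.99):
    b, e = phred_trim(qual, t)
    print(t, b, e, qual[b:e])
```

Under these assumptions the sketch keeps the same ranges as the three outputs reported above. At 0.05 the first `#` outweighs the five leading `A`s, so the kept segment restarts after it. At 0.09 the leading bases score just enough to survive. At 0.99 every base scores positive, so nothing is trimmed.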