Closed alexyfyf closed 1 year ago
Hi @alexyfyf
It appears that length filtering with -z
only works if you also use -l <path_to_len_filtered.fastq>
I'll make it so -l
is not required in the next release.
Cheers,
Neil
Thank you Neil. If I apply -l
I can see the filtering being done, but the filtered-out number is not correct.
The code I use is
pychopper -t $num_cores -Q 7 -z $zvalue \
-r ${base}_report.pdf -S ${base}_statistics.tsv \
-l ${base}_lengthfail.fastq \
-w ${base}_rescued.fastq \
${raw_reads} ${base}_fulllength.fastq
For example if I gave -Q 7 -z 50
I have
Finished processing file: /home/users/fastq_ont/SGNex_K562_directcDNA_replicate2_run2/SGNex_K562_directcDNA_replicate2_run2.fastq.gz
Input reads failing mean quality filter (Q < 7.0): 127132 (45.37%)
Output fragments failing length filter (length < 50): 4862
and if I gave -Q 7 -z 150
I have
Finished processing file: /home/users/fastq_ont/SGNex_K562_directcDNA_replicate2_run2/SGNex_K562_directcDNA_replicate2_run2.fastq.gz
Input reads failing mean quality filter (Q < 7.0): 222838 (79.53%)
Output fragments failing length filter (length < 150): 3428
These were run on exactly the same input. I also manually calculated the quality scores with seqkit
and found that indeed 222838 sequences have Q < 7.
seqkit seq --max-qual 7 $raw_reads | seqkit fx2tab | wc -l
222838
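As an independent cross-check of the seqkit count, here is a minimal Python sketch that counts FASTQ reads whose mean base quality falls below a cutoff. It assumes Phred+33 encoding and a plain arithmetic mean of per-base scores; note that some tools instead average error probabilities, which yields lower mean-quality values, so small discrepancies between counters can come from the definition of "mean quality" itself.

```python
# Sketch: count FASTQ reads with mean Phred quality below a cutoff.
# Assumption: Phred+33 encoding and an arithmetic mean of per-base scores
# (pychopper/seqkit may define "mean quality" differently).
import gzip


def mean_phred(qual_str):
    """Arithmetic mean of Phred+33-encoded base qualities."""
    scores = [ord(c) - 33 for c in qual_str]
    return sum(scores) / len(scores)


def count_low_quality(path, cutoff=7.0):
    """Count FASTQ records whose mean quality is below `cutoff`."""
    opener = gzip.open if path.endswith(".gz") else open
    n_fail = 0
    with opener(path, "rt") as fh:
        for i, line in enumerate(fh):
            if i % 4 == 3:  # the 4th line of each record is the quality string
                if mean_phred(line.rstrip("\n")) < cutoff:
                    n_fail += 1
    return n_fail
```

Running this on the raw reads with `cutoff=7.0` should land close to the 222838 reported by seqkit if both use an arithmetic mean.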
But I'm not sure how the first result came out.
Could you have a look?
Thank you so much.
Cheers, Alex
@alexyfyf
A bug was recently introduced that calculated the mean quality scores of the input reads incorrectly. This has now been rectified in V2.7.7.
Please get in touch again if you have any more problems.
Thanks,
Neil
Hi team,
I found the
-z
length filtering is not working as expected. I have run some data using the command, but the resulting
fulllength.fastq
still contains reads shorter than 50 bp, some even 1 bp. An example read from the
fulllength.fastq
file is this. I see the pychopper log also reporting it in the statistics tsv file
and the stdout log.
Both seem to suggest the filter is not working as expected.
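A quick way to quantify the problem is to count how many reads in the output FASTQ fall below the length cutoff. Here is a minimal Python sketch (the file name and cutoff are illustrative; if the -z 50 filter worked, the count for fulllength.fastq should be zero):

```python
# Sketch: count FASTQ reads shorter than a length cutoff, to verify
# that an upstream length filter (e.g. pychopper -z 50) actually applied.
import gzip


def count_short_reads(path, min_len=50):
    """Count FASTQ records whose sequence is shorter than `min_len`."""
    opener = gzip.open if path.endswith(".gz") else open
    n_short = 0
    with opener(path, "rt") as fh:
        for i, line in enumerate(fh):
            if i % 4 == 1:  # the 2nd line of each record is the sequence
                if len(line.rstrip("\n")) < min_len:
                    n_short += 1
    return n_short
```

A nonzero result on the filtered output, as seen here with 1 bp reads surviving, confirms the filter is not being applied.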
Could you have a look at this issue?
Cheers, Alex