soedinglab / plass

sensitive and precise assembly of short sequencing reads
https://plass.mmseqs.com
GNU General Public License v3.0

Quality trimming reads? #44

Open timghaly opened 5 months ago

timghaly commented 5 months ago

Dear Plass team,

I am very interested in using this tool for protein assembly of soil metagenomes. I am curious whether you would recommend first quality filtering and trimming the reads, e.g., using fastp. Will this improve the precision of Plass, or will the potential reduction in read length from trimming come at too great a cost in sensitivity? What would you recommend?

Best, Tim

FlyinTeller commented 3 months ago

Quality trimming is a good idea for the assembly process. We haven't done comparative tests of assembly with and without quality trimming, but without trimming you would get a much lower sequence identity in the overlaps between reads. The only way to counteract that would be to lower the sequence identity threshold, which could have a negative impact on precision. There might be a small reduction in sensitivity when the overlap between reads becomes shorter than the length of a k-mer, but this effect should be small, and the loss in precision from not quality trimming probably outweighs it.
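
To make the trade-off concrete, here is a toy Python sketch. It is only an illustration under stated assumptions: it is not fastp's trimming algorithm and not anything Plass does internally. The function names (`trim_3prime`, `identity`), the Phred+33 encoding, the Q20 cutoff, and the nucleotide-level comparison are all hypothetical choices made for the example. It shows how removing a low-quality 3' tail shortens a read but raises the identity of what remains.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into integer scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq: str, quality: str, min_q: int = 20) -> tuple[str, str]:
    """Drop bases from the 3' end until a base with score >= min_q is reached."""
    scores = phred_scores(quality)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], quality[:end]


def identity(a: str, b: str) -> float:
    """Fraction of matching positions over the shorter length (ungapped)."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


# Toy data (made up): the last four bases of the read are sequencing errors
# and carry a very low quality score ('I' = Q40, '#' = Q2).
true_seq = "ACGT" * 5
read_seq = "ACGT" * 4 + "TGCA"
read_qual = "I" * 16 + "#" * 4

trimmed_seq, trimmed_qual = trim_3prime(read_seq, read_qual)

print(len(read_seq), identity(read_seq, true_seq))        # 20 0.8
print(len(trimmed_seq), identity(trimmed_seq, true_seq))  # 16 1.0
```

In this made-up case the untrimmed read would only pass an identity threshold of 0.8 or lower, while the trimmed read is four bases shorter but matches perfectly, which is the precision-versus-length trade-off described above.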

timghaly commented 3 months ago

Great, thanks for your advice!