OpenGene / fastp

An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)
MIT License

Trim Nextera mate-pair adapters #82

Open sjackman opened 6 years ago

sjackman commented 6 years ago

Is fastp able to identify and trim Nextera mate-pair adapters that occur in the middle of a read? I currently use https://github.com/sequencing/NxTrim

sfchen commented 6 years ago

Surely fastp can do this.

Could you upload some sample data here (1,000 lines is enough) so that I can test it?

sjackman commented 6 years ago

Here's an assembly of some public Drosophila data: https://github.com/sjackman/abyss-drosophila-melanogaster

Here are the mate-pair reads: https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR3663860

Here's the command that I used to trim the mate-pair reads: https://github.com/sjackman/abyss-drosophila-melanogaster/blob/52cfc55bbde0f435c48ca2f604180b233739aba5/Makefile#L149

Are you familiar with Nextera mate-pair reads? When the adapter occurs in the middle of a read, you effectively end up with three reads, which create a forward-reverse (paired-end-like) read pair and a reverse-forward (mate-pair-like) read pair. My preference is to keep the reverse-forward read pair and discard the forward-reverse read pair.
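
To make the "adapter in the middle of a read" case concrete, here is a minimal Python sketch that splits a single read at the junction adapter. It is not how NxTrim or fastp works internally. The function names are hypothetical, the junction sequence is an assumption (the commonly cited Nextera mate-pair adapter followed by its reverse complement, which should be checked against Illumina's documentation), and it uses exact matching only, whereas a real trimmer would tolerate mismatches and partial adapters at read ends.

```python
from typing import Optional, Tuple

# Assumption: commonly cited Nextera mate-pair adapter half. Verify before use.
ADAPTER_HALF = "CTGTCTCTTATACACATCT"

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


# Assumption: the junction is the adapter half followed by its reverse complement.
JUNCTION = ADAPTER_HALF + reverse_complement(ADAPTER_HALF)


def split_at_junction(read: str, junction: str = JUNCTION) -> Optional[Tuple[str, str]]:
    """Split a read at the first exact occurrence of the junction adapter.

    Returns (before, after), the sequence on either side of the adapter,
    or None if the adapter is not found. Exact matching only.
    """
    pos = read.find(junction)
    if pos == -1:
        return None
    return read[:pos], read[pos + len(junction):]


if __name__ == "__main__":
    # Toy read: 8 bases, then the junction adapter, then 8 more bases.
    toy_read = "ACGTACGT" + JUNCTION + "TTGGCCAA"
    print(split_at_junction(toy_read))
```

The two segments returned here, together with the other read of the pair, are the "three reads" described above. Deciding which segment pairs with the mate as reverse-forward and which as forward-reverse is left out of this sketch.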

Here's a related discussion with the author of NxTrim: https://github.com/sequencing/NxTrim/issues/50

kokyriakidis commented 5 years ago

@sjackman Do you still use NxTrim for Nextera mate-pair reads, or have you found a better alternative? I want to use some mate-pair libraries with ABySS!

sjackman commented 5 years ago

I use NxTrim.

sfchen commented 5 years ago

I plan to support Nextera mate-pair reads soon.