bonsai-team / Porechop_ABI

Adapter trimmer for Oxford Nanopore reads using ab initio method
GNU General Public License v3.0

Trimming amplicon primers? #21

Open Thomieh73 opened 5 months ago

Thomieh73 commented 5 months ago

Hi, I am trying to find a tool that can process nanopore data and identify and trim the PCR primers in my dataset.

How would I do that with porechop_abi? Is that possible?

qbonenfant commented 5 months ago

Hi, Porechop_ABI should be able to retrieve and trim PCR adapters. In fact, several PCR adapters already exist in the static database imported from the standard Porechop.

There are two ways you may approach this task:

The other options are up to you, but the standard parameters should do the trick in most cases.

I would suggest a "guess-only" (-go) run before proceeding, as a quality control and sanity check. It is a lot faster than a full trimming run and will only display the potential adapters found in the dataset.

Thomieh73 commented 5 months ago

Hey, thanks for the quick answer.

So I used 16S rRNA primers and they have degenerate bases, like this: AGRGTTTGATYHTGGCT.

Do you think Porechop_abi is able to handle that if I use them as custom adapters, or should I rather write out all versions of the primers?

qbonenfant commented 5 months ago

As far as I know, IUPAC sequences are not supported by Porechop's trimming algorithm, which is the one we use. Enumerating all possible variations may slow things down a bit during the adapter selection phase, but should result in a cleaner trim. You may even discard the static database (-ddb) if you want to speed things up a bit.
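Enumerating the variants by hand is error-prone, so here is a minimal sketch of how it could be done in Python. It assumes the standard IUPAC nucleotide codes; the function name `expand_degenerate` is hypothetical and is not part of Porechop_ABI. How the resulting sequences are then supplied as custom adapters is not shown here.

```python
from itertools import product

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(seq):
    """Return every concrete sequence encoded by a degenerate primer.

    Hypothetical helper for illustration, not a Porechop_ABI function.
    """
    # One pool of candidate bases per position of the primer.
    pools = [IUPAC[base] for base in seq.upper()]
    # The Cartesian product of the pools yields all combinations.
    return ["".join(combo) for combo in product(*pools)]


# Primer quoted in this thread: R (A/G), Y (C/T) and H (A/C/T)
# give 2 * 2 * 3 = 12 variants.
variants = expand_degenerate("AGRGTTTGATYHTGGCT")
print(len(variants))
for variant in variants:
    print(variant)
```

The number of variants grows multiplicatively with each degenerate position, so a primer with many `N` bases would produce a much longer list than this one.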

I do not have experience with degenerate bases on ONT sequencers, but I think it may be useful to perform a "--guess-only" run anyway. If the sequencer has any kind of bias, or behaves strangely on such bases, the ABI algorithm will build several consensus sequences that should match the forms actually present in your dataset.

Thomieh73 commented 5 months ago

Okay, thanks for the feedback. That is really helpful. I will try it out on my dataset.