artic-network / fieldbioinformatics

The ARTIC field bioinformatics pipeline
MIT License

Fix align_trim:find_primer #118

Open boospooky opened 1 year ago

boospooky commented 1 year ago

This PR fixes find_primer so that it searches in the appropriate direction for left or right primers. In particular, the fix helps when using the Rapid Barcoding Kit: more reads are assigned the correct primer pair and group number, so more reads are used in consensus sequence generation.

The function find_primer searches in both the 5' and 3' directions for the primer nearest to a position on a given strand. This causes problems with the Nanopore Rapid Barcoding Kit, which produces reads that begin in the middle of amplicons. Such reads are assigned the wrong primer pairs and are then filtered out.

The correction adds one more condition to the list comprehension: and (p['start'] - pos <= 0). This keeps only primers whose start is at or before pos.
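To illustrate the idea, here is a minimal sketch of a direction-aware search. It is not the actual diff: the function name find_primer_directional, the toy primer scheme, and the mirrored condition for right primers (p['end'] - pos >= 0) are assumptions made for illustration. Only the left-primer condition is taken from the description above.

```python
from operator import itemgetter

# Toy primer scheme (hypothetical coordinates): two overlapping amplicons.
bed = [
    {"name": "amp1_LEFT", "start": 0, "end": 20, "direction": "+"},
    {"name": "amp1_RIGHT", "start": 380, "end": 400, "direction": "-"},
    {"name": "amp2_LEFT", "start": 300, "end": 320, "direction": "+"},
    {"name": "amp2_RIGHT", "start": 680, "end": 700, "direction": "-"},
]


def find_primer_directional(bed, pos, direction):
    """Return (distance, signed offset, primer) for the nearest primer,
    searching only upstream for left primers and only downstream for
    right primers. Returns None if no primer satisfies the constraint."""
    if direction == "+":
        # Condition from the PR description: primer start at or before pos.
        candidates = [
            (abs(p["start"] - pos), p["start"] - pos, p)
            for p in bed
            if p["direction"] == "+" and (p["start"] - pos <= 0)
        ]
    else:
        # Assumed mirror image for right primers: primer end at or after pos.
        candidates = [
            (abs(p["end"] - pos), p["end"] - pos, p)
            for p in bed
            if p["direction"] == "-" and (p["end"] - pos >= 0)
        ]
    if not candidates:
        return None
    # Compare on distance only, so the primer dicts are never compared.
    return min(candidates, key=itemgetter(0))


# A fragmented read covering 250..350 lies inside amplicon 1 (0..400).
left = find_primer_directional(bed, 250, "+")
right = find_primer_directional(bed, 350, "-")
print(left[2]["name"], right[2]["name"])
```

In this toy scheme an undirected nearest-primer search from position 250 would pick amp2_LEFT (50 bases away) over amp1_LEFT (250 bases away), pairing the read with primers from two different amplicons. With the directional constraint, both ends resolve to amplicon 1.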

BioWilko commented 10 months ago

We currently expressly don't support rapid barcoding. This fix would indeed force align_trim to assign fragmented reads to the correct amplicons, but it would also break normalisation for fragmented data, which makes it an incomplete fix.