polio-nanopore / piranha

GNU General Public License v3.0
Similar database sequences leading to ambiguous mappings #219

Closed AShaw1802 closed 1 month ago

AShaw1802 commented 8 months ago

I think we may be seeing a case in Pakistan where wt1 reads map to reference database sequences that are too similar to one another, so the reads are being assigned to ambiguous mapping (I'm trying to get the raw data now to share). Is there a measure of how similar the reference sequences can be before it becomes an issue? We can screen the current database, but people may add their own sequences in the future. Is there a way that Piranha could cope with similar references?
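
As an illustration of what screening the database could look like, here is a minimal sketch that flags pairs of reference sequences whose identity is at or above a cutoff. It is not Piranha code: the function names `pairwise_identity` and `flag_similar_pairs`, the toy sequences, and the threshold value are all hypothetical, and it assumes the references have already been aligned to the same length.

```python
from itertools import combinations


def pairwise_identity(a, b):
    """Fraction of identical bases over columns where both sequences have A/C/G/T."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        # Skip gaps and ambiguous bases so they count neither for nor against.
        if x in "ACGT" and y in "ACGT":
            compared += 1
            matches += x == y
    return matches / compared if compared else 0.0


def flag_similar_pairs(refs, threshold=0.99):
    """Return (name_a, name_b, identity) for every pair at or above the threshold.

    The default threshold is an arbitrary placeholder, not a recommended value.
    """
    flagged = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(refs.items(), 2):
        identity = pairwise_identity(seq_a, seq_b)
        if identity >= threshold:
            flagged.append((name_a, name_b, identity))
    return flagged


# Toy aligned references (hypothetical, not real poliovirus sequences).
refs = {
    "ref_A": "ACGTACGTAC",
    "ref_B": "ACGTACGTAT",
    "ref_C": "TTGTACCTAA",
}
print(flag_similar_pairs(refs, threshold=0.9))
```

A check like this only reports which pairs are close; what cutoff actually causes ambiguous mappings would still need to be determined from real data.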

aineniamh commented 7 months ago

Notes:

Current approach:

Plan

We propose to change the mapping parsing steps as follows:

aineniamh commented 7 months ago

https://github.com/polio-nanopore/piranha/pull/222

Dev work on this issue continues. More permissive PAF parsing now raises a new issue: regions that do not have good coverage because of mapping failure will need to be masked.
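
To make the masking idea concrete, here is a rough sketch, assuming the standard 12-column PAF layout (target name in column 6, target start and end in columns 8 and 9). It is not the implementation in the pull request: `depth_from_paf`, `mask_low_coverage`, and the `min_depth` value are hypothetical names and settings chosen for illustration.

```python
def depth_from_paf(paf_lines, target, target_len):
    """Count how many alignments cover each position of one target sequence."""
    depth = [0] * target_len
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        # PAF has 12 mandatory columns; column 6 (index 5) is the target name.
        if len(fields) < 12 or fields[5] != target:
            continue
        # Target start is 0-based and target end is exclusive.
        start, end = int(fields[7]), int(fields[8])
        for i in range(start, min(end, target_len)):
            depth[i] += 1
    return depth


def mask_low_coverage(consensus, depth, min_depth=2):
    """Replace bases with N wherever depth falls below min_depth (placeholder value)."""
    return "".join(
        base if d >= min_depth else "N" for base, d in zip(consensus, depth)
    )


# Toy PAF records (hypothetical reads mapped to a 10-base target named "ref").
paf_lines = [
    "read1\t8\t0\t8\t+\tref\t10\t0\t8\t8\t8\t60",
    "read2\t6\t0\t6\t+\tref\t10\t2\t8\t6\t6\t60",
]
depth = depth_from_paf(paf_lines, "ref", 10)
print(depth)
print(mask_low_coverage("ACGTACGTAC", depth, min_depth=2))
```

In this toy case only positions 2 to 7 are covered by both reads, so the flanking bases are masked with N.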

aineniamh commented 1 month ago

This is now resolved on main.