gui11aume closed this issue 5 years ago
Initial tests revealed that choosing the seeding method before seeding is less efficient than mapping first and trying again if the quality is too low. The reasons are as follows:
The empirical workflow that was retained is the following:
The reason for not re-mapping the read when the quality is less than 20 is that the target is probably a strongly repeated sequence and changing the seeding method will not improve the result (the mapping quality will remain low). In Drosophila, approximately 5% of the reads are mapped twice, so the time penalty is low. On the other hand, the benefit is also low.
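The map-first, retry-on-low-quality logic described in this comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `Hit`, `map_read`, `map_mem`, `map_skip` and `target_quality` are hypothetical names, the upper threshold is left as a parameter because its value is not given here, and keeping the better of the two hits after a retry is also an assumption. Only the cutoff of 20 comes from the text.

```python
from typing import Callable, NamedTuple


class Hit(NamedTuple):
    """Hypothetical mapping result: a position and a mapping quality."""
    position: int
    mapq: float


# Cutoff taken from the text: below this quality the read is not re-mapped,
# because the target is probably a strongly repeated sequence.
REPEAT_CUTOFF = 20


def map_read(
    read: str,
    map_mem: Callable[[str], Hit],   # hypothetical mapper using MEM seeds
    map_skip: Callable[[str], Hit],  # hypothetical mapper using skip seeds
    target_quality: float,           # assumed parameter; value not stated in the issue
) -> Hit:
    """Map with MEM seeds first; retry with skip seeds only when it may help."""
    hit = map_mem(read)
    if hit.mapq >= target_quality:
        # Good enough on the first pass: no second mapping.
        return hit
    if hit.mapq < REPEAT_CUTOFF:
        # Likely a repeat: changing the seeding method would not raise the quality.
        return hit
    # Intermediate quality: try the more sensitive seeds once.
    retry = map_skip(read)
    # Assumption: keep whichever hit has the higher mapping quality.
    return retry if retry.mapq > hit.mapq else hit
```

With this structure only the reads of intermediate quality pay for a second mapping, which is the fraction reported above as approximately 5% in Drosophila.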
We can estimate the mapping quality before mapping, so we can switch from the default MEM seeds to the more sensitive skip seeds to maintain the mapping quality above a certain level.
This can be achieved by running the `quality()` function on the read before anything else happens. We may have to change the logic of the function accordingly.