DaehwanKimLab / hisat2

Graph-based alignment (Hierarchical Graph FM index)
GNU General Public License v3.0

`--score-min` does not work correctly for hisat2-3n #351

Closed y9c closed 2 years ago

y9c commented 2 years ago

I changed the hisat2-3n settings to `--score-min L,0.0,-0.08 --mp 6,4 --base-change C,T --no-spliced-alignment` and tested the alignment with a short read:

@read1
GCAGGGAAAAAGAGGAGG 
+
AA/AAE///</E/A/<EE 

The read should not be aligned to the reference, because its alignment score (-48) is below the cutoff. However, hisat2-3n still reports this read in the output.

GCAGGGAAAAAGAGGAGG
 ||   .|||| || |
ACAAAAGAAAAAAGAAAA
read1       0       lambda  39136   60      18M     *       0       0       GCAGGGAAAAAGAGGAGG      AA/AAE///</E/A/<EE  AS:i:-48        NH:i:1  XM:i:8  NM:i:8  MD:Z:0A2A0A0A0G4A2A1A0A0        YZ:A:-      Yf:i:1  Zf:i:1  XN:i:0  XO:i:0  XG:i:0
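
For reference, the cutoff implied by `--score-min L,0.0,-0.08` can be checked by hand. The sketch below assumes the function syntax described in the HISAT2 manual, where `L,a,b` means f(x) = a + b * x with x the read length, and that an alignment is valid when its score is at least f(x). The helper name `linear_min_score` is made up for illustration and is not part of HISAT2.

```python
def linear_min_score(spec: str, read_len: int) -> float:
    """Evaluate a linear --score-min style spec ("L,a,b") for one read length."""
    kind, const, coeff = spec.split(",")
    if kind != "L":
        # Assumption: only the linear form is handled in this sketch.
        raise ValueError("this sketch only handles the linear form L,a,b")
    return float(const) + float(coeff) * read_len


read = "GCAGGGAAAAAGAGGAGG"          # 18-nt read from the report
cutoff = linear_min_score("L,0.0,-0.08", len(read))
alignment_score = -48                # AS:i value from the SAM record above

print(f"cutoff = {cutoff:.2f}, AS = {alignment_score}")
print("passes" if alignment_score >= cutoff else "should be filtered")
```

For an 18-nt read the cutoff works out to 0.0 + (-0.08 * 18) = -1.44, so a score of -48 is far below it.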

In this test, I chose lambda DNA (https://www.ncbi.nlm.nih.gov/nuccore/J02459.1) as the reference. Could you check this for me?

Thank you for your help.

y9c commented 2 years ago

I realized this bug might come from the design of hisat2-3n. The alignment parameters are applied before the base change is taken into account, and hisat2-3n cannot distinguish asymmetric conversions (C->T and T->C are different; the BSMAP tool can distinguish them). So some mismatches cannot be filtered out...
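
The asymmetry mentioned here can be illustrated on the alignment from the first comment. This is only a sketch: it assumes that for a read reported on the minus strand (`YZ:A:-`) the `C,T` base change shows up as reference G read as A, and the function name `classify_mismatches` is hypothetical rather than anything in the HISAT-3N code.

```python
def classify_mismatches(read: str, ref: str, ref_base: str, read_base: str):
    """Count differences that match the allowed conversion versus all others."""
    conversions = 0
    mismatches = 0
    for read_nt, ref_nt in zip(read, ref):
        if read_nt == ref_nt:
            continue
        if ref_nt == ref_base and read_nt == read_base:
            conversions += 1   # allowed direction, e.g. reference G read as A
        else:
            mismatches += 1    # anything else, including the reverse direction
    return conversions, mismatches


read = "GCAGGGAAAAAGAGGAGG"
ref = "ACAAAAGAAAAAAGAAAA"   # reference segment shown in the first comment

conv, mm = classify_mismatches(read, ref, ref_base="G", read_base="A")
print(conv, mm)  # 1 8
```

These counts agree with the `Yf:i:1` and `XM:i:8` tags in the reported record: one difference is a G-to-A conversion, while the eight A-to-G differences run in the opposite direction and count as real mismatches.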

imzhangyun commented 2 years ago

@y9c

I am sorry for this bug. I just changed the HISAT-3N code. Now HISAT-3N will filter out alignments with low scores. Please check that on the hisat-3n_ScoreBugFixing branch. If everything is OK, I will merge it into the hisat-3n branch tomorrow.

y9c commented 2 years ago

Hi Leo,

I read the code you patched just now, and it seems that hisat2-3n will recalculate the score after mapping. Am I correct? Will the tag also be updated?

Chang

imzhangyun commented 2 years ago

@y9c

Yes, the AS tag will be updated after the 3n-mapping. Actually, the original hisat-3n also updated the AS score, but it did not filter out alignments with low scores. That is why hisat-3n output the alignment with AS:i:-48. I am sorry about this mistake.

y9c commented 2 years ago

Got it. If so, will this filtered read be recorded in the `--un` file?

Chang

imzhangyun commented 2 years ago

Yes, if the alignment has a very low score, hisat-3n will output the read as unaligned. It should be recorded in the `--un` file.
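
The behavior described in this thread amounts to a post-mapping check on the recalculated score. A rough sketch of that logic over SAM text follows. It is not the HISAT-3N implementation: `split_by_score` and `as_tag` are hypothetical helpers, the cutoff is assumed to be the linear form a + b * read length, and the sample record keeps only a few of the tags from the original report.

```python
def as_tag(fields):
    """Return the integer AS:i value from SAM optional fields, or None."""
    for field in fields[11:]:
        if field.startswith("AS:i:"):
            return int(field[len("AS:i:"):])
    return None


def split_by_score(sam_lines, const=0.0, coeff=-0.08):
    """Split read names into (kept, unaligned) using a linear score cutoff."""
    kept, unaligned = [], []
    for line in sam_lines:
        if line.startswith("@"):
            continue                       # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        name, seq = fields[0], fields[9]
        score = as_tag(fields)
        cutoff = const + coeff * len(seq)
        if score is not None and score >= cutoff:
            kept.append(name)
        else:
            unaligned.append(name)         # would end up with the unaligned reads
    return kept, unaligned


record = "\t".join([
    "read1", "0", "lambda", "39136", "60", "18M", "*", "0", "0",
    "GCAGGGAAAAAGAGGAGG", "AA/AAE///</E/A/<EE", "AS:i:-48", "NH:i:1",
])
kept, unaligned = split_by_score([record])
print(kept, unaligned)  # [] ['read1']
```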

y9c commented 2 years ago

The hisat-3n_ScoreBugFixing branch works. Thank you!