bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

HLA regions - alignment issues #1552

Closed deber1980 closed 8 years ago

deber1980 commented 8 years ago

Hello,

I'm running an analysis where some of the regions are HLA regions that have multiple haplotypes of chr6. The reference used is hg19. Unfortunately, in these regions I get an alignment with MQ=0. AFAIK the best approach would be to move to hg38 and use bwa-kit as implemented here https://github.com/chapmanb/bcbio-nextgen/blob/master/bcbio/hla/bwakit.py
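To put a number on the MQ=0 problem described above, here is a minimal sketch that counts how many reads starting in a region have mapping quality 0. It reads plain SAM text lines and uses only the standard library. The function name `mapq_zero_counts` and the region arguments are hypothetical and not part of bcbio.

```python
from typing import Iterable, Tuple


def mapq_zero_counts(
    sam_lines: Iterable[str], chrom: str, start: int, end: int
) -> Tuple[int, int]:
    """Return (reads with MAPQ 0, total reads) for reads whose
    leftmost position falls in chrom:start-end (1-based, inclusive)."""
    zero = 0
    total = 0
    for line in sam_lines:
        # Skip blank lines and SAM header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A valid SAM record has at least 11 mandatory fields.
        if len(fields) < 11:
            continue
        # SAM columns: 3 = RNAME, 4 = POS, 5 = MAPQ.
        rname, pos, mapq = fields[2], int(fields[3]), int(fields[4])
        if rname == chrom and start <= pos <= end:
            total += 1
            if mapq == 0:
                zero += 1
    return zero, total
```

In practice the lines could come from `samtools view` run on the region of interest, for example by iterating over the command's standard output. A high share of MAPQ 0 reads in the HLA targets would match the behavior reported here.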

Is there a workaround for using hg19? One thought I had was to use a no-alt hg19 reference until we switch to hg38. Is there an out-of-the-box option to do so? Are there other alternatives I could use to align only to the main haplotype of chr6?

Thank you for your help

chapmanb commented 8 years ago

Thanks for the HLA questions. This is fully supported in hg38 and you can specify hlacaller: optitype in the configuration to get correct alignments to all of the alternatives plus downstream calls of the A, B and C types:

http://bcbio-nextgen.readthedocs.io/en/latest/contents/configuration.html#hla-typing
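As a rough illustration of where that setting goes, the sketch below renders a minimal sample configuration with `hlacaller` under the `algorithm` section of an hg38 sample. The helper `render_sample_config`, the sample name, and the FASTQ file names are hypothetical. The surrounding keys follow the general layout of a bcbio sample YAML file and should be checked against the linked documentation.

```python
from typing import List


def render_sample_config(
    description: str, fastqs: List[str], hlacaller: str = "optitype"
) -> str:
    """Build a minimal bcbio-style sample YAML as text.

    Only hlacaller and the hg38 build come from the discussion;
    the other keys are an assumed, typical variant-calling setup.
    """
    files = ", ".join(fastqs)
    return (
        "details:\n"
        f"  - files: [{files}]\n"
        f"    description: {description}\n"
        "    analysis: variant2\n"
        "    genome_build: hg38\n"
        "    algorithm:\n"
        "      aligner: bwa\n"
        f"      hlacaller: {hlacaller}\n"
    )


if __name__ == "__main__":
    # Hypothetical sample name and input files, for illustration only.
    print(render_sample_config("sample1", ["sample1_R1.fastq.gz", "sample1_R2.fastq.gz"]))
```

The printed text can be saved as the sample configuration file passed to a bcbio run. HLA typing is tied to `genome_build: hg38`, since hg19 has no supported equivalent here.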

I don't know of a way to do this cleanly on hg19. Removing the alts won't help much, since you need them to correctly resolve alignments in these regions, which are highly polymorphic. If you can't run hg38 to get the HLA types, the best bet is to try using OptiType (https://github.com/FRED-2/OptiType) outside of bcbio by feeding it the full BAM file. Sorry to not have a built-in solution for hg19 but hope this helps.