griffithlab / pVACtools

http://www.pvactools.org
BSD 3-Clause Clear License

Feature: Auto-generate MHC class II combinations #528

Closed by chrisamiller 2 years ago

chrisamiller commented 4 years ago

Many of the supported class II alleles are combinations of A and B alleles. Example:

DQA1*01:01-DQB1*02:01

Typical HLA typing protocols only return a list of individual alleles, not the combos:

DQA1*01:01
DQA1*01:03
DQB1*05:01
DQB1*06:03

It would be nice if pvacseq had an option to automatically combine the A and B alleles, producing the list:

DQA1*01:01
DQA1*01:03
DQB1*05:01
DQB1*06:03
DQA1*01:01-DQB1*05:01
DQA1*01:01-DQB1*06:03
DQA1*01:03-DQB1*05:01
DQA1*01:03-DQB1*06:03
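
As a rough illustration of the requested behavior, the expansion above is a Cartesian product of the alpha and beta alleles appended to the original list. This is only a minimal sketch: `combine_alpha_beta` is a hypothetical helper name, not an existing pVACtools function or option, and it assumes the caller has already separated the alpha alleles from the beta alleles.

```python
from itertools import product


def combine_alpha_beta(alpha_alleles, beta_alleles):
    """Return the individual alleles followed by every alpha-beta pairing.

    Hypothetical helper for illustration; not part of pVACtools.
    """
    # Pair every alpha allele with every beta allele, e.g. DQA1*01:01-DQB1*05:01.
    combos = [f"{a}-{b}" for a, b in product(alpha_alleles, beta_alleles)]
    # Keep the single alleles first, then append the combinations.
    return list(alpha_alleles) + list(beta_alleles) + combos


alleles = combine_alpha_beta(
    ["DQA1*01:01", "DQA1*01:03"],
    ["DQB1*05:01", "DQB1*06:03"],
)
print("\n".join(alleles))
```

Run on the four typed alleles from the example, this prints the same eight entries, in the same order, as the list above.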

malachig commented 4 years ago

Yeah this would be nice.

Handy summary of class II pairings:

[Image: class II pairings]

CJBgon commented 4 years ago

I've written a quick Python script that can be called from Bash to get those unique combinations:

https://github.com/CJBgon/MHC_comb

As of now it only takes a particular .json input. I'll make it more versatile in the coming week.

malachig commented 4 years ago

Nice, thanks for sharing.

boyangzhao commented 3 years ago

Just came across this issue; I'm sharing a script in case it's helpful. I've actually adopted your hla_consensus.cwl workflow for our use, but have since modified it to accommodate multiple HLA callers (not just OptiType) and to generate class II combos.

I've pushed it here: https://github.com/boyangzhao/hla_consensus

susannasiebert commented 2 years ago

Handy summary of class II pairings:

class II pairings

A couple of questions @malachig:

malachig commented 2 years ago

@susannasiebert, trying to address these questions. On the seemingly missing HLA-DRB2: it does exist, but it's a pseudogene, so it does not get used (there are also DRB6, 7, 8 and 9, all of which are also ignored here): https://www.nature.com/articles/jhg20085

The HLA-DR haplotypes consist of a number of copies of coding and non-coding HLA-DR genes. The expressed DRB sequences have been assigned to four different loci, DRB1, 3, 4 and 5. The highly polymorphic DRB1 alleles (Table 4) are present in all haplotypes, whereas DRB3, 4 and 5 are present only in some haplotypes, as are the HLA-DRB2 and HLA-DRB6 to -DRB9 pseudogenes. The HLA-DRB2 pseudogene lacks exon 2 and contains a 20-nt deletion in exon 3, which has interrupted the correct translational reading frame.

From Wikipedia:

HLA-DRA encodes the alpha subunit of HLA-DR. Unlike the alpha chains of other Human MHC class II molecules, the alpha subunit is practically invariable. However it can pair with, in any individual, the beta chain from 3 different DR beta loci, DRB1, and two of any DRB3, DRB4, or DRB5 alleles. Thus there is the potential that any given individual can form 4 different HLA-DR isoforms (2 alleles of DRB1 and two alleles from DRB3, DRB4 or DRB5).

malachig commented 2 years ago

On the question of HLA-DQ pairing/dimerization: yes, I think that is probably the conventional wisdom.

That is, we are primarily interested in the combinations of allele pairs of: HLA-DQA1 - HLA-DQB1

If a patient is heterozygous for DQA1 and DQB1, both alleles of DQA1 can pair with both alleles of DQB1.

The importance of HLA-DQA2 - HLA-DQB2 is much less clear. And I think it is generally accepted that you wouldn't consider pairing DQA1/DQB1 alleles with DQA2/DQB2 alleles.
Though I did find a paper that discussed them: https://www.jimmunol.org/content/188/8/3903
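
To make the pairing rule concrete, here is a minimal sketch that groups a mixed allele list by gene and pairs only DQA1 with DQB1, leaving out DQA2/DQB2 and any cross-pairing of 1s with 2s. The names `PAIRED_GENES` and `pair_by_gene` are hypothetical, and the DQA2/DQB2 allele names in the input are illustrative placeholders, not real typing results.

```python
from itertools import product

# Assumed pairing table: DQA1 pairs only with DQB1.
# DQA2/DQB2 are deliberately absent, so they are never combined.
PAIRED_GENES = {"DQA1": "DQB1"}


def pair_by_gene(alleles):
    """Return alpha-beta combinations for gene pairs listed in PAIRED_GENES."""
    by_gene = {}
    for allele in alleles:
        # The gene name is everything before the '*', e.g. 'DQA1'.
        gene = allele.split("*", 1)[0]
        # Skip duplicates so homozygous calls do not yield repeated combos.
        if allele not in by_gene.setdefault(gene, []):
            by_gene[gene].append(allele)
    combos = []
    for alpha, beta in PAIRED_GENES.items():
        alphas = by_gene.get(alpha, [])
        betas = by_gene.get(beta, [])
        combos += [f"{a}-{b}" for a, b in product(alphas, betas)]
    return combos


# Illustrative input; the DQA2/DQB2 entries are placeholders.
typed = [
    "DQA1*01:01", "DQA1*01:03",
    "DQB1*05:01", "DQB1*06:03",
    "DQA2*01:01", "DQB2*01:01",
]
dq_pairs = pair_by_gene(typed)
print("\n".join(dq_pairs))
```

With a heterozygous DQA1 and DQB1 this yields four DQA1-DQB1 combinations. Other gene pairs could be added to the table if they turn out to be relevant.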

Note that IEDB's online MHC binding resource does NOT appear to support any DQA2/DQB2 alleles for any prediction method.

Good overall reference for DQA2: https://www.jimmunol.org/content/jimmunol/158/5/2116.full.pdf