mourisl / T1K

T1K is a versatile method for genotyping highly polymorphic genes (e.g. KIR, HLA) from bulk or single-cell RNA-seq, WGS, or WES data.
MIT License

WGS vs RNA-seq #36

Open naqvia opened 1 month ago

naqvia commented 1 month ago

We are using T1K for HLA subtyping and have a few questions. For some of our samples we have both WGS and RNA-seq data; in that case, is there a preference for which one we should use? At first, we assumed the outputs would be identical or close to it, but that was not the case. When we ran the tool on WGS and RNA-seq data for the same samples, we observed discrepancies in the number of alleles and in the quality/abundance scores, and the RNA-seq run reported two extra HLA genes (HLA-DRB2 and HLA-DRB7). Which run should we trust or prioritize? Any guidance would be greatly appreciated!

mourisl commented 1 month ago

In my experience, for the classical HLA genes, like -A, -B, -C, -DQB1, and -DRB1, the RNA-seq data is more accurate. This could be because these genes are highly expressed and RNA-seq reads contain no introns, so the genotyping is easier. For the pseudogenes, I'm not sure which one works better. In theory, WGS is more robust for them, as pseudogenes may have low expression in the RNA-seq data.
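As a rough illustration only, the rule of thumb above could be applied per gene when both runs exist for a sample. The sketch below is not part of T1K: `pick_calls`, `CLASSICAL_GENES`, and the dictionary layout (gene name mapped to a tuple of allele names) are hypothetical, and it assumes you have already parsed each run's genotype output into that form. The allele names in the usage example are placeholders, not real results.

```python
# Hypothetical helper, not part of T1K. It assumes each run's genotype
# calls were already parsed into a dict: gene name -> tuple of allele names.

# Genes named in the comment above as better typed from RNA-seq (assumed set).
CLASSICAL_GENES = {"HLA-A", "HLA-B", "HLA-C", "HLA-DQB1", "HLA-DRB1"}


def pick_calls(rna_calls, wgs_calls, classical=CLASSICAL_GENES):
    """Merge per-gene calls: prefer RNA-seq for classical genes, WGS otherwise.

    If the preferred run has no call for a gene, fall back to the other run.
    Returns a dict: gene name -> (source label, alleles).
    """
    merged = {}
    for gene in sorted(set(rna_calls) | set(wgs_calls)):
        if gene in classical:
            order = (("RNA-seq", rna_calls), ("WGS", wgs_calls))
        else:
            order = (("WGS", wgs_calls), ("RNA-seq", rna_calls))
        for source, calls in order:
            if gene in calls:
                merged[gene] = (source, calls[gene])
                break
    return merged


# Usage with placeholder calls for one sample.
rna = {
    "HLA-A": ("HLA-A*01:01", "HLA-A*02:01"),
    "HLA-DRB2": ("HLA-DRB2*01:01",),
    "HLA-DRB7": ("HLA-DRB7*01:01",),
}
wgs = {
    "HLA-A": ("HLA-A*01:01",),
    "HLA-DRB7": ("HLA-DRB7*01:01",),
}
for gene, (source, alleles) in pick_calls(rna, wgs).items():
    print(gene, source, ", ".join(alleles))
```

A gene reported by only one run (such as HLA-DRB2 in the placeholder data) is kept from that run and labeled with its source, so low-confidence calls can still be reviewed by hand using the quality scores.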

Hope this helps.