mourisl / T1K

T1K is a versatile method to genotype highly polymorphic genes (e.g. KIR, HLA) with bulk or single-cell RNA-seq, WGS, or WES data.
MIT License

Homozygous alleles vs missing alleles #15

Closed ck1236 closed 6 months ago

ck1236 commented 11 months ago

Thank you so much for this tool; it was very easy to get up and running!

I understand that when an individual is homozygous, T1K will return only one allele. However, the documentation also says "In the case of missing allele, the triple (allele, abundance, quality) will be ". 0 -1". We recommend to ignore alleles with quality less or equal to 0."

So, for example, I have many individuals where the number-of-alleles column for a particular locus is 1 and one allele is listed (e.g. A0201), but the columns for the second allele contain ". 0 -1". Would this individual then be treated as homozygous, or as missing?

Thank you again!

mourisl commented 11 months ago

I guess this is the HLA-A gene, which is expected to be on every chromosome. Therefore, it is homozygous in this case. I will rephrase the readme in future versions. Thank you for pointing this out.

ck1236 commented 11 months ago

Thank you Li, though I do also observe this for the other classical HLA-I (A/B/C) and HLA-II (DRB1/DQB1/DQA1/DPB1/DPA1) loci.

If I am interpreting the documentation correctly, if the number of alleles at a locus is 1, then the second allele will always be ". 0 -1", and the user should set the second allele equal to the first.

However, if the number of alleles at a locus is 2 (so T1K reports both alleles) and one of the alleles has a quality score of 0, then that allele should be excluded from the data. Is this correct?

Thank you again Li!

mourisl commented 11 months ago

Right. This number only counts the columns that contain results. It is not the copy number.

We don't expect to see the loss of these classical genes, so they should all be homozygous.
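
For anyone post-processing the output, the interpretation agreed on in this thread could be sketched roughly as below. This is only an illustration, not code from T1K: the function name `resolve_genotype` and the example rows are made up, and the column order (gene, number of alleles, then an allele/abundance/quality triple for each of the two alleles) is an assumption that should be checked against the README of the installed version. The homozygous rule is applied only because loss of the classical HLA genes is not expected; for genes that can be absent, the ". 0 -1" placeholder may need a different reading.

```python
from typing import Optional, Sequence, Tuple

MISSING_ALLELE = "."  # placeholder shown in this thread for an empty allele column


def _usable(allele: str, quality: float) -> bool:
    """An allele is kept only if it is not the placeholder and its quality is > 0."""
    return allele != MISSING_ALLELE and quality > 0


def resolve_genotype(fields: Sequence[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return the two allele calls for one genotype row (hypothetical helper).

    Assumed column order (verify against your T1K version):
    gene, num_alleles, allele_1, abundance_1, quality_1,
    allele_2, abundance_2, quality_2
    """
    _gene, _num_alleles, a1, _ab1, q1, a2, _ab2, q2 = fields[:8]

    first = a1 if _usable(a1, float(q1)) else None

    if a2 == MISSING_ALLELE:
        # Only one allele reported: for classical HLA genes this is read as
        # homozygous, so the second call is set equal to the first.
        return (first, first)

    # Two alleles reported: drop whichever one has quality <= 0.
    second = a2 if _usable(a2, float(q2)) else None
    return (first, second)


# Made-up rows for illustration only.
homozygous_row = "HLA-A\t1\tHLA-A*02:01\t153.2\t60\t.\t0\t-1".split("\t")
low_quality_row = "HLA-B\t2\tHLA-B*07:02\t80.0\t45\tHLA-B*08:01\t3.1\t0".split("\t")

print(resolve_genotype(homozygous_row))
print(resolve_genotype(low_quality_row))
```

In this sketch an allele dropped for low quality is returned as `None` rather than being replaced by the other allele, since the thread only says such alleles should be excluded, not that the locus should then be called homozygous.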