ysard closed this issue 3 years ago
During VCF parsing, I split multi-allelic records into several lines, so a record that lists several ALT alleles becomes one line per ALT allele.
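As a rough illustration of that splitting step (a sketch only, not the actual VCFReader code; the function name `split_multiallelic` and the example record are made up for this thread), one could duplicate the line once per ALT allele. It only rewrites the ALT column and does not adjust per-allele INFO values or sample genotypes:

```python
from typing import List


def split_multiallelic(line: str) -> List[str]:
    """Return one tab-separated VCF data line per ALT allele.

    Hypothetical helper: only the ALT column (index 4) is rewritten;
    INFO and sample columns are copied unchanged.
    """
    fields = line.rstrip("\n").split("\t")
    # Fixed VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO
    alts = fields[4].split(",")
    result = []
    for alt in alts:
        new_fields = list(fields)  # copy so each output line is independent
        new_fields[4] = alt
        result.append("\t".join(new_fields))
    return result


# Made-up record with two ALT alleles (T and G)
record = "chr2\t424\t.\tA\tT,G\t.\t.\t."
lines = split_multiallelic(record)
```

Here `lines` holds two records, one with ALT `T` and one with ALT `G`.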
Humans are diploid: two chromosomes of each type. So the following variant:
chr2424 A T
can be:
- A / A => genotype = 0 (wild-type homozygous)
- A / T => genotype = 1 (heterozygous)
- T / T => genotype = 2 (mutant homozygous)
In some cases, you don't have the genotype. Then it should be: -1
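To make the mapping concrete, here is a minimal sketch of how a VCF GT string could be turned into one of the four values. This is an assumption about one possible approach, not the code in VCFReader; `genotype_code` and its `alt_index` parameter are hypothetical names. It assumes diploid calls and counts how many copies of the given ALT allele are present:

```python
import re


def genotype_code(gt: str, alt_index: int = 1) -> int:
    """Map a GT string such as "0/1" or "1|1" to -1, 0, 1 or 2.

    Hypothetical helper: returns the number of copies of ALT allele
    number `alt_index`, or -1 when any allele is missing or malformed.
    """
    # Alleles are separated by "/" (unphased) or "|" (phased)
    alleles = re.split(r"[/|]", gt)
    if not all(allele.isdigit() for allele in alleles):
        return -1  # covers "./.", "." and empty strings
    return sum(1 for allele in alleles if int(allele) == alt_index)
```

With this sketch, `genotype_code("0/0")` gives 0, `genotype_code("0/1")` gives 1, `genotype_code("1|1")` gives 2 and `genotype_code("./.")` gives -1. For a split multi-allelic record, `genotype_code("1/2", alt_index=2)` gives 1.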
Phasing means you know whether your variants are on the same chromosome or not. For instance, say you have 3 heterozygous variants:
chr2 A/T
chr2 C/G
chr2 G/C
If the variants are not phased, you don't know which chromosome carries which allele: several arrangements are possible.
When variants are phased, you know:
chr2 T | A
chr2 C | G
chr2 G | C
which corresponds to these two chromosomes:
T----C----G
A----G----C
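The same idea as a small sketch, assuming diploid, biallelic variants given as hypothetical `(ref, alt, gt)` tuples (none of these names come from cutevariant). With phased genotypes, the left side of each `|` belongs to one chromosome and the right side to the other, so the two haplotypes can be rebuilt; with `/` they cannot:

```python
from typing import List, Optional, Tuple

Variant = Tuple[str, str, str]  # (ref, alt, gt), e.g. ("A", "T", "1|0")


def haplotypes(variants: List[Variant]) -> Optional[Tuple[str, str]]:
    """Rebuild both haplotypes from phased genotypes.

    Returns None as soon as one genotype is unphased, missing or not
    a simple diploid 0/1 call, since the arrangement is then unknown.
    """
    first, second = [], []
    for ref, alt, gt in variants:
        parts = gt.split("|")
        if len(parts) != 2:
            return None  # unphased ("0/1") or not diploid
        alleles = {"0": ref, "1": alt}
        left, right = parts
        if left not in alleles or right not in alleles:
            return None  # missing call or extra ALT index
        first.append(alleles[left])
        second.append(alleles[right])
    return "".join(first), "".join(second)


phased = [("A", "T", "1|0"), ("C", "G", "0|1"), ("G", "C", "0|1")]
unphased = [("A", "T", "0/1"), ("C", "G", "0/1"), ("G", "C", "0/1")]
```

`haplotypes(phased)` returns `("TCG", "AGC")`, matching the two chromosomes drawn above, while `haplotypes(unphased)` returns `None`.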
Actually... I have an idea for how to display this info to the user, but that is for a later release.
Dr Sacha Schutz
Medicine / molecular genetics, bioinformatics
dridk.me
------- Original Message -------
On Friday, 30 October 2020 05:28, ysard notifications@github.com wrote:
We expect only 4 values from gt field: -1, 0, 1, 2
According to the discussion/monologue in #182, if I read the VCF documentation correctly, this field can be composite for multiple alleles, with separators like | or / (BTW, I do not really understand the difference between phased and unphased genotypes defined by these characters :( ).
I thought it had to do with the heterozygous / homozygous definitions, but a priori that is not the case, because we only take 1 number into account and only display a default icon if several alleles are indicated?
Am I right?
Can you validate the meaning of the current icons (for future tooltips):
- -1: Unknown genotype
- 0: Reference allele (in REF field)
- 1: First allele listed in ALT
- 2: Second allele listed in ALT
Secondly, these icons are displayed both in formatters and in the variants_info plugin, and obviously the implementations are not the same...
plugin: icon = self.GENOTYPES.get(genotype, self.GENOTYPES["-1"])
formatter: icon = self.GENOTYPE_ICONS.get(int(value), self.GENOTYPE_ICONS[0])
=> 2 different default icons...
How should we solve the issues in this important feature?
OK, thank you for the clarification, all the points are clear now. I had completely missed this part of VCFReader.
It is now documented, and I fixed the default genotype icon and added tooltips and a description for this field.
I am leaving the issue open for the phasing question, but feel free to close it.
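For the record, a sketch of one way to keep a single default for both the plugin and the formatter. This is an assumption for illustration, not necessarily what was committed: `GENOTYPE_INFO`, `genotype_info` and the icon names are hypothetical. The idea is one table keyed by int, with any unexpected value (string, None, out of range) falling back to the "unknown" entry instead of two different defaults:

```python
from typing import Any, Dict, Tuple

# Hypothetical shared table: code -> (icon name, tooltip text).
# The tooltips follow the meanings given earlier in this thread.
GENOTYPE_INFO: Dict[int, Tuple[str, str]] = {
    -1: ("genotype_unknown", "Unknown genotype"),
    0: ("genotype_hom_ref", "Homozygous wild type"),
    1: ("genotype_het", "Heterozygous"),
    2: ("genotype_hom_alt", "Homozygous mutant"),
}


def genotype_info(value: Any) -> Tuple[str, str]:
    """Return (icon, tooltip) for a gt value given as int or str."""
    try:
        key = int(value)  # accepts 1 as well as "1" or "-1"
    except (TypeError, ValueError):
        key = -1  # None, "", "abc", ... are treated as unknown
    # Values outside -1..2 also fall back to the unknown entry
    return GENOTYPE_INFO.get(key, GENOTYPE_INFO[-1])
```

Both callers could then use `genotype_info(genotype)` regardless of whether they hold the value as a string or an int, so the default icon can no longer diverge.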