SACGF / variantgrid

VariantGrid public repo

Variant not in sample (VarDict VCF issue?) #1053

Closed (davmlaw closed this issue 2 weeks ago)

davmlaw commented 4 months ago

I looked at a variant (https://variantgrid.com/variantopedia/view_variant/242383859) in the 210628 VCF in an analysis.

The variant page doesn't list it as being in any samples, though it should be in that one.

The global zygosity count did run on that VCF, but it didn't seem to affect anything.

If you classify the variant and then select the sample, it says the variant is not in there.

davmlaw commented 3 months ago

Example in test:

In [5]: Variant.objects.get(pk=1286084).cohortgenotype_set.all().count()
Out[5]: 2

I checked why vsi.has_observations is False and saw a warning while initialising:

In [15]: vsi = VariantSampleInformation(user, v, GenomeBuild.grch37())
WARNING CohortGenotypeCollection 27 out of date with Cohort.version
WARNING CohortGenotypeCollection 53 out of date with Cohort.version

I think this check is there to deal with cohorts being modified, but that shouldn't apply here: the cohort comes from a VCF, so it should never change.

The question is why CohortGenotypeCollection.cohort_version is out of date.

I think it may be because the "common" one is different: it is v0, while cohort.version = 1 on the vgaws sample in the analysis given in the issue above.

I will disable the version check for now and figure out the details later.
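To make the hypothesis concrete, here is a minimal plain-Python sketch of a version check that would produce the warning above. It is not the VariantGrid code: the dataclasses, the up_to_date helper and the pk values are stand-ins invented for illustration, and it assumes the check simply compares cohort_version with Cohort.version.

```python
from dataclasses import dataclass


@dataclass
class Cohort:
    version: int


@dataclass
class CohortGenotypeCollection:
    # Hypothetical stand-in for the Django model of the same name
    pk: int
    cohort: Cohort
    cohort_version: int


def up_to_date(collections):
    """Keep only collections whose cohort_version matches Cohort.version (assumed check)."""
    kept = []
    for cgc in collections:
        if cgc.cohort_version == cgc.cohort.version:
            kept.append(cgc)
        else:
            print(f"WARNING CohortGenotypeCollection {cgc.pk} out of date with Cohort.version")
    return kept


cohort = Cohort(version=1)
collections = [
    CohortGenotypeCollection(pk=1, cohort=cohort, cohort_version=1),  # regular collection
    CohortGenotypeCollection(pk=2, cohort=cohort, cohort_version=0),  # "common" collection at v0
]
kept_pks = [cgc.pk for cgc in up_to_date(collections)]
```

Under that assumption the v0 "common" collection is always dropped, so any genotype stored only there would never be reported as an observation.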

davmlaw commented 3 months ago

In the create_cohort_genotype_collection method, when creating the CohortGenotypeCollection, we deliberately set the common one's cohort_version to 0, with the comment:

 cohort_version=0,  # so it isn't retrieved

Then I looked on the model:

 unique_together = ('cohort', 'cohort_version')

So it looks like we're doing it as a workaround to avoid the unique_together?
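If that reading is right, the effect can be reproduced outside Django with the standard-library sqlite3 module. This is only a sketch: the table and column names are hypothetical, and the UNIQUE clause stands in for the database constraint that Django creates for unique_together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE cohort_genotype_collection (
        id INTEGER PRIMARY KEY,
        cohort_id INTEGER NOT NULL,
        cohort_version INTEGER NOT NULL,
        UNIQUE (cohort_id, cohort_version)  -- mirrors unique_together
    )
    """
)

insert = "INSERT INTO cohort_genotype_collection (cohort_id, cohort_version) VALUES (?, ?)"
conn.execute(insert, (1, 1))  # main collection at the cohort's current version

try:
    # A second collection for the same cohort at the same version violates the constraint
    conn.execute(insert, (1, 1))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Storing the "common" collection at version 0 avoids the clash
conn.execute(insert, (1, 0))
row_count = conn.execute("SELECT COUNT(*) FROM cohort_genotype_collection").fetchone()[0]
```

The second insert at version 1 is rejected, while the row at version 0 is accepted, which is consistent with cohort_version=0 being a way to keep two collections per cohort under the existing constraint.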

davmlaw commented 3 months ago

Tested on vg test: samples with the variant appear now.