vgteam / sv-genotyping-paper

Truth baseline vcf #13

Open Jordi-V opened 4 years ago

Jordi-V commented 4 years ago

Hi,

My name is Jordi, and I want to use the file svpop-truth-baseline.vcf.gz from https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=vgsv2019/vcfs/

However, I don't know how it was made. Where did you download this data from? I'm asking because the file includes genotypes for each sample, but I can't find an equivalent file in the Audano paper. It points to an FTP site that lists all the detected variants, but without genotypes.

So my question is: how did you make this file with genotypes, or where did you get it?

Thanks for your help and time.

Jordi

Jordi-V commented 4 years ago

Sorry for another question. Is it correct that all variants reported in this file are heterozygous (0/1)? I had a quick look at the data, and all the genotypes are 0/1.

My apologies,

Jordi

glennhickey commented 4 years ago

The VCF is made as described at the top here: https://github.com/vgteam/sv-genotyping-paper/tree/master/human/svpop

There is only presence/absence information for these variants. We used the convention of 0/0 genotype = present, 0/1 genotype = absent. These aren't real genotypes, but they allowed us to use our comparison pipeline on the VCFs directly. You'll note in the paper that there is no genotype comparison for this dataset, only presence/absence.

glennhickey commented 4 years ago

Oops, I meant 0/0=absent 0/1=present!
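
For anyone reading the file with that convention in mind (0/0 = absent, 0/1 = present), here is a minimal Python sketch of how the GT column could be turned into per-sample presence lists. It is not the pipeline used for the paper; the function name `presence_by_sample`, the sample names, and the variant IDs are all hypothetical and only serve as illustration.

```python
def presence_by_sample(vcf_text):
    """Return {sample: [IDs of variants marked present]} from VCF text.

    Assumes the convention from this thread: 0/1 means "present",
    0/0 means "absent". These are not real genotypes.
    """
    samples = []
    present = {}
    for line in vcf_text.splitlines():
        if line.startswith("##") or not line.strip():
            continue  # skip meta-information and blank lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns start after FORMAT
            present = {s: [] for s in samples}
            continue
        var_id = fields[2]
        gt_index = fields[8].split(":").index("GT")
        for sample, call in zip(samples, fields[9:]):
            if call.split(":")[gt_index] == "0/1":
                present[sample].append(var_id)
    return present


# Hypothetical toy data (made-up samples and variants, not from the real file).
header = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
          "FORMAT", "sampleA", "sampleB"]
rows = [
    ["chr1", "1000", "sv1", "A", "<INS>", ".", "PASS", "SVTYPE=INS", "GT", "0/1", "0/0"],
    ["chr1", "5000", "sv2", "A", "<DEL>", ".", "PASS", "SVTYPE=DEL", "GT", "0/0", "0/1"],
    ["chr2", "7000", "sv3", "A", "<INS>", ".", "PASS", "SVTYPE=INS", "GT", "0/1", "0/1"],
]
EXAMPLE_VCF = "\n".join(
    ["##fileformat=VCFv4.2", "\t".join(header)] + ["\t".join(r) for r in rows]
)

print(presence_by_sample(EXAMPLE_VCF))
```

In a real run the text would come from the decompressed VCF (for example via `gzip.open(path, "rt")`), and any call other than 0/0 or 0/1 would need its own handling.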

xiaoguizz commented 2 years ago

When I run 'vcfkeepinfo EEE_SV-Pop_1.ALL.sites.20181204.vcf.gz SVTYPE | vcffixup - | bgzip > EEE_SV-Pop_1.ALL.sites.20181204.fix.vcf.gz', I can't get the corresponding answer. Is this a command