Hi,
In the recent release of 500,000 genomes, the UKB has provided SV calls, but only as bgzipped sample-level VCF files.
I've tried merging these files in groups to create a pVCF. I decompress each VCF first, as SURVIVOR doesn't seem to accept .gz files. However, the merged files grow so large that I can't merge those groups (the run ends with a "Killed" error). I also tried using bcftools to trim the VCF files down to just the genotypes in the FORMAT field, but then the merge behaved oddly: merging two files containing 9,000 people each gave an output with only 2 individuals.
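To make the steps concrete, here is a minimal Python sketch of the batch workflow described above. It is an illustration rather than my exact script: the function names, the batch size, and the SURVIVOR parameter values (1000 bp breakpoint distance, 1 supporting file, 50 bp minimum SV size) are placeholder assumptions. It decompresses each file (bgzip output can be read as ordinary gzip), splits the file list into groups, and builds the `SURVIVOR merge` command for one group without running it.

```python
import gzip
import shutil
from pathlib import Path


def decompress_vcf(gz_path, out_dir):
    """Write an uncompressed copy of a bgzipped VCF (bgzip output is gzip-readable)."""
    gz_path = Path(gz_path)
    # Assumes the file name ends in ".gz"; drop that suffix for the output name.
    out_path = Path(out_dir) / gz_path.name[:-3]
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


def batches(paths, size):
    """Split the file list into consecutive groups of at most `size` files."""
    return [paths[i:i + size] for i in range(0, len(paths), size)]


def survivor_merge_command(vcf_paths, list_file, out_vcf,
                           max_dist=1000, min_support=1, min_size=50):
    """Write the file-of-filenames that SURVIVOR reads and return the merge command."""
    Path(list_file).write_text("".join(f"{p}\n" for p in vcf_paths))
    # Positional arguments: file list, max breakpoint distance, min supporting
    # files, use SV type (1), use strand (1), estimate distance from SV size (0),
    # min SV size, output VCF. The values here are placeholders.
    return ["SURVIVOR", "merge", str(list_file), str(max_dist), str(min_support),
            "1", "1", "0", str(min_size), str(out_vcf)]
```

Each returned command could then be passed to `subprocess.run`, one group at a time, with the group outputs merged in a second pass. That second pass is the step where I hit the "Killed" error.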
Do you have any suggestions for how I could perform this analysis?
Cheers, Gareth