brentp / smoove

structural variant calling and genotyping with existing tools, but, smoothly.
Apache License 2.0

smoove paste error; files with different number of variants; how to fix? #168

Closed: robertwhbaldwin closed this issue 3 years ago

robertwhbaldwin commented 3 years ago

Hi,

Almost there!!!

I've got 25 files from the genotyping step (...smoove.genotyped.vcf), but only 22 have the same number of variants. I already reran from step 1 to try to fix the problem. Can someone suggest what to do? Thanks.

(smoove-env) root@cbd7831c8e25:/data# smoove paste --name RHF *.vcf.gz
[smoove] 2021/07/31 13:34:32 starting with version 0.2.6
[smoove] 2021/07/31 13:34:32 squaring 25 files to RHF.smoove.square.vcf.gz
[smoove] 2021/07/31 13:34:34 22 files had 123573 variants
[smoove] 2021/07/31 13:34:34 files: G0319-20-B0074-joint-smoove.genotyped.vcf.gz,RHF05338-joint-smoove.genotyped.vcf.gz had 123574 variants
[smoove] 2021/07/31 13:34:34 files: RHF05350-joint-smoove.genotyped.vcf.gz had 123575 variants
[smoove] 2021/07/31 13:34:34 please make sure that all files have the same number of variants

brentp commented 3 years ago

How many variants are in your sites file (the output file from smoove merge)?

Remove any sample files that don't have that number and re-run smoove genotype for those samples.
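
One way to find the files that need to be redone is to count the records in each genotyped VCF and compare against the sites file. The following is only a rough sketch, not part of smoove: the function names `count_variants` and `find_mismatched` are made up for illustration, and it assumes that every non-header line (one not starting with `#`) is one variant record.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of smoove) for spotting sample VCFs whose
# record count differs from the merged sites file.

# Count the variant records (non-header lines) in a VCF, gzipped or plain.
count_variants() {
    # gzip -dcf decompresses gzip/bgzip input and passes plain text through.
    # grep -c exits non-zero when the count is 0, so "|| true" keeps the
    # function usable under "set -e"; the count itself is still printed.
    gzip -dcf "$1" | grep -vc '^#' || true
}

# Print "file<TAB>count" for every sample VCF that does not match the sites file.
# $1 is the sites VCF; the remaining arguments are per-sample genotyped VCFs.
find_mismatched() {
    local sites=$1 expected n f
    shift
    expected=$(count_variants "$sites")
    for f in "$@"; do
        n=$(count_variants "$f")
        if [ "$n" -ne "$expected" ]; then
            printf '%s\t%s\n' "$f" "$n"
        fi
    done
}
```

Assuming the sites file is named merged.sites.vcf.gz (adjust to your own path), calling `find_mismatched merged.sites.vcf.gz *-smoove.genotyped.vcf.gz` would list only the files to remove and regenerate with smoove genotype.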

robertwhbaldwin commented 3 years ago

The merged.sites.vcf had 123573 records. What do you mean by "remove any sample files that don't have that number"? Are you saying that I should rerun the genotype step ONLY for the three samples that did not have 123573 records? thanks - Robert

robertwhbaldwin commented 3 years ago

Thanks. That worked. I had to rerun joint calling twice, but eventually all the files had the same number of variants.