Closed nh13 closed 7 years ago
Look at the ##contig lines in your VCF: they make no mention of chromosome 19. What was the command you used to create the VCF (presumably from dwgsim)? Did you also use ../hg19/chr19.fa as your reference?
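A quick way to check the point about the ##contig lines is to compare the contig IDs declared in the header with the contig names used in the records. The following is only a minimal sketch: the function names and the sample lines are made up for illustration, and the header parsing assumes no commas inside quoted values on a ##contig line.

```python
def contig_ids(header_lines):
    """Collect the IDs declared in ##contig=<ID=...> header lines."""
    ids = set()
    for line in header_lines:
        if line.startswith("##contig=<"):
            # Strip the "##contig=<" prefix and the trailing ">".
            body = line.strip()[len("##contig=<"):].rstrip(">")
            # Simplification: assumes no commas inside quoted values.
            for field in body.split(","):
                key, _, value = field.partition("=")
                if key == "ID":
                    ids.add(value)
    return ids


def undeclared_contigs(vcf_lines):
    """Return contig names used in records but missing from the header."""
    lines = list(vcf_lines)
    declared = contig_ids(l for l in lines if l.startswith("##"))
    used = {
        l.split("\t", 1)[0]
        for l in lines
        if l.strip() and not l.startswith("#")
    }
    return sorted(used - declared)


# Hypothetical sample: the header declares "contig1", the record uses "chr19".
sample = [
    "##fileformat=VCFv4.1\n",
    "##contig=<ID=contig1,length=1000>\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr19\t100\t.\tA\tG\t100\tPASS\t.\n",
]
print(undeclared_contigs(sample))
```

An empty result means every contig named in a record is also declared in the header. It says nothing about whether those names match the reference FASTA.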
I added the "chr19" contig to the header. Now I get the error "Error: contig not found [PASS]". Yes, I gave it only the chr19 reference, as I'm testing only against chr19 variants. I also tried it with the whole reference genome, with all the chromosomes, but I still get the same error (and again, if I take a smaller number of variants, it does work). I used a standard VCF file taken from GIAB and extracted only the chr19 variants, but I got an error similar to what's described here. So I removed the info columns and put in their place the info column supplied in the dwgsim example files, as well as the header, to make sure it wasn't caused by some wrong info. The header looks like this now:
Attached is the file exmp.vcf, which doesn't work; the same file without the last 500 variants, exmp_short.vcf, which does work; and the original file without the first ~500 variants (but with the last 500 variants), exmp_no_first_lines.vcf, which also works. exmp_no_first_lines.vcf.zip exmp_short.vcf.zip
Rather than trying to modify the VCF to make it work, can you tell me how you created the VCF in the first place? I suspect with dwgsim, using some custom reference? The contig lines in the example VCF do not match chr19. Let me know how you created the VCF.
I took it from GIAB for NA12878 and added "chr" at the beginning (the #CHROM column) using awk.
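For reference, the "chr" renaming described above can be sketched as follows. This is not the awk command that was actually run (it was not posted); it is a hypothetical Python equivalent that also renames the ##contig header IDs, which prefixing only the data lines would leave out of sync. It assumes `ID` is the first key on each ##contig line.

```python
def add_chr_prefix(vcf_lines):
    """Yield VCF lines with "chr" prepended to contig names.

    Renames both the ##contig=<ID=...> header entries and the first
    (#CHROM) column of each record; lines already prefixed are kept.
    """
    for line in vcf_lines:
        if line.startswith("##contig=<ID="):
            # Assumption: ID is the first key on the ##contig line.
            if line.startswith("##contig=<ID=chr"):
                yield line
            else:
                yield line.replace("##contig=<ID=", "##contig=<ID=chr", 1)
        elif line.startswith("#") or not line.strip() or line.startswith("chr"):
            # Other header lines, blank lines, and already-prefixed records.
            yield line
        else:
            yield "chr" + line


# Hypothetical input lines, for illustration only.
renamed = list(add_chr_prefix([
    "##contig=<ID=19>\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "19\t100\t.\tA\tG\t100\tPASS\t.\n",
]))
```

Renaming the header and the records in the same pass keeps the two consistent, which is what the contig lookup depends on.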
You definitely need the proper contig lines. Try simulating a few reads without an input VCF and look at the format of the output VCF.
Thanks, but yes, I changed it to have proper contig lines. I'm attaching again the VCF header of the exmp.vcf file:
And this is the VCF output header when running without a VCF variants input file:
They look identical, so I don't think this is the issue. I also tried to use the output VCF as an input, and again I got a similar error: "Warning: strand of the mutation not found; please use the 'pl' tag. Error: contig not found [RT]"
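Two headers that look identical can still differ in ways that are hard to see, such as spaces in place of tabs or trailing whitespace. A small comparison like the one below makes such differences visible. It is only a sketch with made-up function names and sample lines, not part of dwgsim, and it would not by itself explain the error above.

```python
def header_lines(vcf_lines):
    """Return the header lines (starting with '#') without line endings."""
    return [l.rstrip("\r\n") for l in vcf_lines if l.startswith("#")]


def header_diff(lines_a, lines_b):
    """Return (only_in_a, only_in_b) for the headers of two VCFs."""
    a = header_lines(lines_a)
    b = header_lines(lines_b)
    only_a = [l for l in a if l not in b]
    only_b = [l for l in b if l not in a]
    return only_a, only_b


# Hypothetical example: one column line is tab-separated, the other uses spaces.
only_a, only_b = header_diff(
    ["##fileformat=VCFv4.1\n", "#CHROM\tPOS\tID\n"],
    ["##fileformat=VCFv4.1\n", "#CHROM POS ID\n"],
)
# repr() shows tabs as \t, so whitespace differences become visible.
for line in only_a + only_b:
    print(repr(line))
```

If both lists come back empty, the headers match line for line (ignoring order and line endings), and the cause is more likely in the records or in how they are read.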
I think there's some memory allocation problem or something similar when supplying a variant file as an input.
Ok, I'll take a look, but it won't be for a few weeks as I am taking time off for the birth of my second kid. My apologies for the delay.
Okay, thanks for the help and the update, and congratulations :)