Closed sahasra-shankar closed 2 years ago
It could be that the contig names don't match between the VCF and the FASTA. Can you check to make sure that they do?
How would I go about checking that?
The VCF contigs should be listed in the header of the file. You can see them with this command:
bcftools view -h variants.vcf.gz | grep contig
The FASTA contigs are the sequence names, which you can pull out with this command:
grep ">" ref.fasta
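If you would rather compare the two lists directly than by eye, here is a minimal sketch that does not need bcftools. It assumes the VCF is gzip-compressed and the FASTA is not. The function name `compare_contigs` is just an illustration, not part of vg or any other tool. It reads the contig names from column 1 of the VCF records rather than from the header, so it also works when the header has no contig lines.

```shell
# Sketch: list contig names that appear in only one of the two files.
# Assumes a gzip-compressed VCF and an uncompressed FASTA.
compare_contigs() {
    vcf="$1"
    fasta="$2"
    vcf_names=$(mktemp)
    fasta_names=$(mktemp)
    # Column 1 of every non-header VCF line is the contig name.
    gzip -dc "$vcf" | grep -v '^#' | cut -f 1 | sort -u > "$vcf_names"
    # FASTA sequence name: the text after '>' up to the first whitespace.
    grep '^>' "$fasta" | sed 's/^>//' | awk '{print $1}' | sort -u > "$fasta_names"
    echo "Only in VCF:"
    comm -23 "$vcf_names" "$fasta_names"
    echo "Only in FASTA:"
    comm -13 "$vcf_names" "$fasta_names"
    rm -f "$vcf_names" "$fasta_names"
}
```

Run it as `compare_contigs variants.vcf.gz ref.fasta`. Any name listed under "Only in VCF:" is a contig that the variants refer to but the reference does not contain under that exact name.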
I was not able to check the contigs with the command you suggested because I don't seem to have bcftools. However, I took a look at the example VCF file in vg and realized that my VCF file does not look exactly like it (it is missing the list of "##contig=..." lines). Is there a way to edit my current VCF so that it matches the example? Or are there resources for getting a VCF that looks like the example for the virus I am interested in? I have attached my VCF below: SNP-2014.vcf.gz
I would strongly recommend installing bcftools if you plan to work with VCF files -- it's really an essential tool in bioinformatics.
That said, if your header doesn't have contig lines (they are optional, so it's not strictly speaking an error), then you could retrieve the contig names from the variant records with this command (assuming the variants are grouped by contig):
zcat variants.vcf.gz | grep -v "^#" | cut -f 1 | uniq
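If you do want the header to carry "##contig=..." lines like the example, one way to add them without extra tools is sketched below. It is only an illustration under a few assumptions: the VCF is gzip-compressed, its header has no contig lines yet, the FASTA uses plain Unix line endings, and the function name `add_contig_lines` is made up for this example. It builds one contig line per FASTA record and prints the VCF, uncompressed, with those lines placed just before the #CHROM line.

```shell
# Sketch: print a VCF with ##contig lines (built from a FASTA) added to its header.
# Assumes a gzip-compressed VCF whose header has no ##contig lines yet.
add_contig_lines() {
    vcf="$1"
    fasta="$2"
    contigs=$(mktemp)
    # One ##contig line per FASTA record: the name is the text after '>' up to
    # the first whitespace, the length is the total number of sequence characters.
    awk '
        /^>/ { if (name != "") printf "##contig=<ID=%s,length=%d>\n", name, len
               name = substr($1, 2); len = 0; next }
        { len += length($0) }
        END  { if (name != "") printf "##contig=<ID=%s,length=%d>\n", name, len }
    ' "$fasta" > "$contigs"
    # Emit the contig lines just before the #CHROM column-header line.
    gzip -dc "$vcf" | awk -v contigs="$contigs" '
        /^#CHROM/ { while ((getline line < contigs) > 0) print line }
        { print }
    '
    rm -f "$contigs"
}
```

For example, `add_contig_lines variants.vcf.gz ref.fasta > variants.with_contigs.vcf` writes the result to a new file and leaves the original untouched. This only helps if the names in the FASTA already match column 1 of the VCF.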
I see. I fixed the problem, but I was also wondering whether there is a way to find VCF files online that use other virus variants as the reference, since I only found a VCF that uses one specific variant as the reference. Because there are two variants I am interested in, I am hoping to find a VCF that uses the other variants as a reference as well. Are there resources that have VCF files of various viral variants, or would I have to generate my own VCF file? If I have to create my own, is there a tool I could use to do this?
I'm afraid I can't be much help there. I'm only particularly familiar with the variant data resources for humans. I'm going to close this issue since the VG error has been resolved.
I have also run into this problem. Could you tell me how to solve it? Thanks.
1. What were you trying to do? Run vg construct on the reference and the variants.
2. What did you want to happen? Successfully build the graph.vg
3. What actually happened?
4. If you got a line like
Stack trace path: /somewhere/on/your/computer/stacktrace.txt
, please copy-paste the contents of that file here:
5. What data and command can the vg dev team use to make the problem happen?
6. What does running
vg version
say?