Closed (ekg closed 3 years ago)
vg gbwt chops your nodes down to 1024bp, so your gbwt will not be id-compatible with your original graph.
I recommend:
Note that in the linked example, a translation is used so that the node-ids in the VCF refer to the original gfa (and not the chopped graph).
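To illustrate why chopping breaks id-compatibility and what a translation buys you, here is a minimal Python sketch. It is a toy model, not vg's implementation: the function name chop_graph, the sequential renumbering of nodes, and the layout of the translation table are all assumptions made for illustration. Only the 1024bp limit comes from the comment above.

```python
MAX_NODE_LENGTH = 1024  # chop length mentioned in the comment above


def chop_graph(nodes, max_len=MAX_NODE_LENGTH):
    """Split every node longer than max_len and renumber nodes from 1.

    nodes: dict mapping original node id -> sequence.
    Returns (chopped, translation), where chopped maps new id -> sequence
    and translation maps new id -> (original id, offset in original node).
    The renumbering scheme is hypothetical, chosen only to show the idea.
    """
    chopped = {}
    translation = {}
    next_id = 1
    for orig_id in sorted(nodes):
        seq = nodes[orig_id]
        # Walk the sequence in steps of max_len; each piece is a new node.
        for offset in range(0, len(seq), max_len):
            chopped[next_id] = seq[offset:offset + max_len]
            translation[next_id] = (orig_id, offset)
            next_id += 1
    return chopped, translation


# Toy graph: node 1 is 2500 bp, node 2 is 10 bp.
original = {1: "A" * 2500, 2: "C" * 10}
chopped, translation = chop_graph(original)

# Node 1 became three nodes, so original node 2 now has id 4.
print(len(chopped))
print(translation[4])
```

In this toy, any record written with chopped ids (for example node 4) only makes sense against the original graph after looking it up in the translation, which is the role the translation plays in the linked example.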
Thanks Glenn, I'll try out this workflow. I'm surprised that vg gbwt is doing the chopping!
1. What were you trying to do?
I was attempting to make phased VCFs from pggb graphs made for the HPRC.
Let's take chr22 as a test case: https://s3-us-west-2.amazonaws.com/human-pangenomics/pangenomes/scratch/2021_07_30_pggb/chroms/chr22.pan.fa.a2fb268.e820cd3.9ea71d8.smooth.gfa.gz
I used this script:
Called this way:
vcf_extract_phased.sh chr22.pan.fa.a2fb268.e820cd3.9ea71d8.smooth.gfa.gz chm13 48
2. What did you want to happen?
I wanted to obtain a fully-phased VCF representing the chromosome graph.
3. What actually happened?
4. If you got a line like Stack trace path: /somewhere/on/your/computer/stacktrace.txt, please copy-paste the contents of that file here:
I'll re-run to generate this.
5. What data and command can the vg dev team use to make the problem happen?
See above.
6. What does running vg version say?