Closed adamnovak closed 4 years ago
The genotyping output seems to have changed for the worse on one test. The test itself isn't super meaningful, but I'm not sure why it would have changed. Will first confirm if it's the de-protofication pr that's at blame. If it's just a small random perturbation in the mapping output then it's no big deal.
The tests run with the current toil-vg master and this vg. Moving to the next vg commit and updating toil-vg to this branch brings the error. Looks like a graph merging issue, as one chromosome is completely absent from the output VCF here, leading to the score drop.
Hm, That's not good at all. I'll have to see if I can replicate this
locally, and if vg combine
is somehow dropping files or if something
else is going on.
Unfortunately I'm supposed to be doing Toil things tomorrow, so I'm not sure how much time I will have for this until Thursday.
On 1/14/20, Glenn Hickey notifications@github.com wrote:
The tests run with the current toil-vg master and this vg. Moving to the next vg commit and updating toil-vg to this branch brings the error. Looks like a graph merging issue, as one chromosome is completely absent from the output VCF here, leading to the score drop.
-- You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub: https://github.com/vgteam/toil-vg/pull/786#issuecomment-574396419
I'm taking a look now. It's pretty easy to reproduce. After downloading the input files
toil-vg construct ./js ./os --vcf ./HGSVC_regions.vcf.gz --fasta hg38_chr21_22.fa --regions chr21 chr22 --out_name test --alt_paths --flat_alts --realTimeLogging --realTimeStderr --pangenome --container None --xg_index
gives
os/0.vg
os/1.vg
os/test.xg
Looking for a chr21
variant in 0.vg
looks reasonable
vg find -x os/0.vg -p chr21:5322212 -c 2 | vg view - | grep S
S 166476 ATTCCAGATTCTACAAAAAGAGTGTTGCAAAA
S 166480 GTGCTCTTTTTATTTAAAGATATTTCCTTTTC
S 166475 CACCATAGGCCTCAAAGCCCTCCAAATATCCA
S 166472 CATATATAACAAAGAAGTTTCTCAGAATGTTT
S 166474 GTTCAGTTTTTATGTGAAGATATTTCTTTTTC
S 166473 CT
But in the xg
it's all Ns
vg find -x os/test.xg -p chr21:5322212 -c 2 | vg view -
S 166464 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
S 166463 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
S 166461 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
S 166460 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
S 166462 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
The first issue seems to be that vg ids -j
is not working. os/0.vg
and os/1.vg
have the same node ids. combine
should be throwing an error for this.
In looking at this it also seems that vg stats -r
isn't working either
vg stats -r os/0.vg
node-id-range 1:0
These are pretty fundamental problems. I'm wondering if I (and toil-vg gitlab) are somehow using a vg with out-of-sync dependencies for libbdsg and libhandlegraph.
Combine doesn't have any memory, and if fed a Protobuf vg file it doesn't even parse it, just copies it to standard out.
On 1/15/20, Glenn Hickey notifications@github.com wrote:
Closed #786.
-- You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub: https://github.com/vgteam/toil-vg/pull/786#event-2951430832
Hmm, never meant to close this.
This updates toil-vg to work with current master vg and its multiple handle graph output types. It makes sure to use
vg combine
where needed.