Open peterdfields opened 1 month ago
This seems like an incompatibility between your gam and graph. You can verify using
vg validate <graph> -a <gam>
Possible fixes are to use the .gfa
file that you used for GraphAligner
for surject/pack
. Or to add --vg-algorithm
to the vg convert
command used to make that gfa in the first place.
Running vg validate
on the graph and gam file does identify many Invalid Alignment:
instances (I can send specific instances if that's helpful). On the tutorial linked to above the suggestion for preparing the graph for GraphAligner
is:
bash -c "vg convert ./hprc10/hprc10.gbz -f > ./hprc10/hprc10.gbz.gfa"
Should the correct command then to create a vg surject/pack
.gam
be:
bash -c "vg convert --vg-algorithm ./hprc10/hprc10.gbz -f > ./hprc10/hprc10.gbz.gfa"
When I tried using the .gfa
produced with the first command above and used for the GraphAligner
alignment step it did seem to be running but after waiting a few hours it hadn't produced any output. This is a low-coverage test dataset so I wouldn't have expected such a long run time but perhaps I killed the job too soon?
Yeah, I think that's a bug in the tutorial. I've fixed it here (which will be merged at some point into the master branch) to add the --vg-algorithm
. The auto node name conversion, though well intentioned, is extremely confusing.
Anyway, running GraphAligner
on the graph converted with that flag should produce a GAM that passes the vg validate -a
check with the input gbz
(please let me know if it doesn't!). On the other hand, using / not using --vg-algorithm
should have very little bearing on the GraphAligner performance, mapping rate etc.
Following you're suggestions I am able to successfully run the vg validation
, vg surject
, and vg pack
! I do still see some warnings from GraphAligner
that I need to investigate further on individual reads, such as the following:
src/GraphAlignerBitvectorCommon.h:560: Assertion 'traceEnd >= traceStart' failed. Read: fed8cb24-e303-4460-a55e-f6b1c1f653f9. Seed: 0+,0,0,0
src/GraphAlignerBitvectorCommon.h:560: Assertion 'traceEnd >= traceStart' failed. Read: 7944e6cc-ccdd-4062-8604-20675227310a. Seed: 0+,0,0,0
src/GraphAlignerBitvectorCommon.h:560: Assertion 'traceEnd >= traceStart' failed. Read: 400c9730-d168-41c8-8664-f9a060591c85. Seed: 0+,0,0,0
src/GraphAligner.h:685: Assertion 'trace.trace[i].nodeSwitch || trace.trace[i].DPposition.node != trace.trace[i+1].DPposition.node || trace.trace[i].DPposition.nodeOffset != trace.trace[i+1].DPposition.nodeOffset' failed. Read: 2a074333-1c4f-49a7-8471-145923a166b2. Seed: 0+,0,0,0
1. What were you trying to do?
I would like to use VG to work with long read data aligned to a pangenome graph generated using cactus-minigraph. I've been following the steps described here, including using
vg convert
andGraphAligner
to create the.gfa
index and align the ONT data, respectively. Now I am trying to apply VG tosurject
a gam -> bam as well as do SV variant calling using VG on the gam.2. What did you want to happen?
To create a
.bam
from the.gam
and run thevg pack
step (including the use of-Q0
) in preparation for SV calling.3. What actually happened?
In both cases I see an error or segfault.
4. If you got a line like
Stack trace path: /somewhere/on/your/computer/stacktrace.txt
, please copy-paste the contents of that file here:In the case of
vg surject
I see:In the case of
vg pack
I see5. What data and command can the vg dev team use to make the problem happen?
Depending upon what the possible error might be I can figure what data can be shared related to the graph and reads.
6. What does running
vg version
say?