Open jdamas13 opened 3 months ago
@glennhickey The Cactus-Minigraph pipeline won't leave any absurdly long nodes, right?
I'm not sure how to connect either the error from faster_cap or the error from vg::DozeuInterface::calculate_max_position to the fact that some paths in the graph might be very long.
What looks most suspicious is your indexing. You shouldn't need to make genome.gcsa for Giraffe, and genome.xg here is not going to be usable with genome.gbz, because they use different node IDs and different sets of nodes. But it's possible Giraffe is picking it up anyway: it automatically finds any inputs you don't give it based on filename patterns, so that you can just point it at a graph that has all its indexes next to it. In particular, it can use a .xg file alongside a .gbz, if one is available, to save on reconstructing some path position information.
I would try keeping all the files generated from the pruned/mod-ed graphs in a different directory. You can also run Giraffe with --progress; it should tell you what it is loading, which might help you figure out whether it is picking up anything it shouldn't be using.
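To rule out any filename-based auto-detection, one option is to copy only the GBZ-era indexes into a clean directory and pass every index flag explicitly. This is a sketch, not a verified command line for your setup: the index filenames (genome.min, genome.dist) and read files are assumptions, so check vg giraffe --help on your version for the exact flags before running.

```
# Keep only the Giraffe indexes next to the graph, away from the
# pruned/mod-ed files and the old .xg/.gcsa.
mkdir giraffe-run && cp genome.gbz genome.min genome.dist giraffe-run/
cd giraffe-run

# Pass each index explicitly and ask for progress output, so
# nothing is found by filename pattern behind your back.
vg giraffe --progress \
    -Z genome.gbz \
    -m genome.min \
    -d genome.dist \
    -f reads_1.fq.gz -f reads_2.fq.gz \
    --output-format SAM > mapped.sam
```

With --progress, the log should name every file Giraffe loads, so anything unexpected (like a stray .xg) will show up there.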
1. What were you trying to do? Mapping Illumina WGS data to a cactus-minigraph pangenome. In case it is relevant, the pangenome was constructed with 5 marsupial genomes, which have a few long chromosomes (>512Mb) that often cause crashes because of hard-coded limits for some tools.
2. What did you want to happen? Obtain a SAM file with the reads mapped to the pangenome.
3. What actually happened? vg crashed with the error posted below.
4. If you got a line like "Stack trace path: /somewhere/on/your/computer/stacktrace.txt", please copy-paste the contents of that file here:

Other warnings include:
and multiple warnings similar to this one.
5. What data and command can the vg dev team use to make the problem happen? The data are not public. The pangenome was built step-by-step with cactus-minigraph, followed by:

vg giraffe was run with:
6. What does running vg version say?

Any assistance on how to troubleshoot this issue will be much appreciated!
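Regarding the >512 Mb chromosomes mentioned in item 1: a quick way to see which sequences exceed the common 2^29-base cap (536,870,912 bases, the limit some BAM-era tools hard-code) is to filter the assembly's samtools .fai index by length. This sketch uses a toy .fai with made-up names and lengths; substitute your real index file.

```shell
# Toy .fai (name<TAB>length<...>) standing in for a real samtools
# faidx index, where column 2 is the sequence length.
printf 'chr1\t600000000\nchr2\t300000000\n' > genome.fa.fai

# Print sequences longer than 512 Mb (2^29 bases); here only
# chr1 exceeds the cap.
awk '$2 > 536870912 { print $1, $2 }' genome.fa.fai
# prints: chr1 600000000
```

Knowing exactly which chromosomes cross the limit makes it easier to tell whether a crash correlates with those sequences.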