Sunhh opened 2 years ago
It sounds like you may be asking `vg prune` to do a lot of work. I think it did the pruning away of pathologically complex regions, and that split your graph into 408,339 different pieces, and now it is trying to string them back together by filling in the gaps with material from named paths (I think).
I wouldn't be surprised if it took a day or two and a lot of memory to do this, at whole genome scale.
@glennhickey When you do `vg prune` pruning on Cactus/Minigraph graphs, how long do you usually have to wait?
408,339 different pieces really is a lot, though. How confident are you that the alignments going into this are good and reflective of evolutionary history at a consistent age, and not pathologically complicated and collapsing paralogs together?
1. What were you trying to do? I want to build a GCSA index for a VG file generated from the Minigraph-Cactus Pangenome Pipeline. Because I hit a "Size limit exceeded" error in the `vg index` run, I tried to prune the graph before indexing.
2. What did you want to happen? I wanted to simplify the variation graph for GCSA indexing.
3. What actually happened? The `vg prune` run has been going for over 11 hours since printing the message "Complement graph: 2367489 nodes, 2325013 edges in 408339 components", and it is still running.
4. If you got a line like `Stack trace path: /somewhere/on/your/computer/stacktrace.txt`, please copy-paste the contents of that file here: None.
5. What data and command can the vg dev team use to make the problem happen? After executing `cactus-graphmap-join`, I got this problem.
6. What does running `vg version` say?

Thank you!
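For context, a commonly used prune-then-index recipe looks like the sketch below. This is an illustration, not the exact commands from this report; the flags (`-r` to restore named paths after pruning, `-g` to name the GCSA output, `-t` for threads) reflect recent vg releases, so check `vg prune --help` and `vg index --help` for your version before running it:

```shell
# Prune pathologically complex regions from the graph, then restore
# the named paths (-r) so the GCSA index still covers them.
# graph.vg is a placeholder for the Minigraph-Cactus output.
vg prune -r -t 16 graph.vg > graph.pruned.vg

# Build the GCSA2 index from the pruned graph. This step can also
# take many hours and a lot of memory/temp disk at whole-genome scale.
vg index -g graph.gcsa -t 16 graph.pruned.vg
```

Note that the pruned graph is only an intermediate for GCSA construction; downstream mapping still uses the original (unpruned) graph together with the resulting `.gcsa` index.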