Hello,
We are trying to understand minigraph behavior.
We built a minigraph graph from 6 simulated assemblies (only SVs > 100 bp, no SNPs, simulated with VISOR). The resulting graph has very few, extremely large nodes, and many of the SVs we used for the simulations did not end up in the graph despite being longer than 100 bp. We ended up with fewer than 10 SVs per chromosome, whereas we used on average more than 3700 SVs per chromosome to make the simulations. We verified that the simulated data itself is not at fault, because it works well with other pangenome graph building pipelines (Minigraph-Cactus and PGGB). minigraph also works perfectly well with real-world data, so something about the simulated data is causing this and we do not know what. Could you help explain this?
It looks like others may have faced a similar problem: https://github.com/lh3/minigraph/issues/62