Closed: egoltsman closed this issue 3 years ago
Minigraph ignores letter case, so this should not be the cause. Minigraph may take a lot of memory for repeat-rich genomes.
This is, in fact, a large repeat-rich plant genome. It's assembled into complete chromosomes, one for each haplotype, so I was going to build a separate graph for each chromosome and then merge the graphs. In order to merge rGFAs, is it enough to just give the segments unique ids, or is there anything else to watch out for? For example, is it OK to keep the SN and SO tags redundant for the different chromosomes/scaffolds?
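For concreteness, here is a minimal sketch of what "giving the segments unique ids" could look like. It assumes each per-chromosome rGFA contains only tab-separated S and L lines, and it prepends a per-chromosome prefix to every segment id. The function name `prefix_rgfa_segments` and the `chr1_` prefix are hypothetical, not part of minigraph. The SN, SO and SR tags are left untouched, so the sketch does not settle whether keeping them as-is is valid.

```python
def prefix_rgfa_segments(lines, prefix):
    """Return rGFA lines with `prefix` prepended to every segment id.

    Assumption: only S (segment) and L (link) records carry segment ids;
    any other record type is passed through unchanged.
    """
    out = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S":
            # S <id> <sequence> [tags...]
            fields[1] = prefix + fields[1]
        elif fields[0] == "L":
            # L <from> <from_orient> <to> <to_orient> <overlap> [tags...]
            fields[1] = prefix + fields[1]
            fields[3] = prefix + fields[3]
        out.append("\t".join(fields))
    return out


# Toy per-chromosome graph (made-up sequences, for illustration only).
chr1 = [
    "S\ts1\tACGT\tSN:Z:chr1\tSO:i:0\tSR:i:0",
    "S\ts2\tGG\tSN:Z:chr1\tSO:i:4\tSR:i:0",
    "L\ts1\t+\ts2\t+\t0M\tSR:i:0",
]
renamed = prefix_rgfa_segments(chr1, "chr1_")
```

Concatenating the renamed per-chromosome files would then give one file without id collisions, under the assumptions above.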
It is preferred to build a graph in one go. You can use a smaller -U to reduce the memory.
Thanks, do you have a recommendation for a reasonable value to try here? I tried -U 25,100 and it again ran out of memory on a 1 TB machine.
What is the size of the genome?
Ah, my mistake. It didn't run out of memory. It actually completed! Thanks for the tip!
Great to know!
Hello Heng, My minigraph run is crashing during the construction of a 2-genome graph. In fact, it's my server that's killing it, which suggests a memory overrun. I've been able to build much larger graphs in the past (i.e., larger in terms of total sequence length), so I'm scratching my head here. The only odd thing about the sequence is that it has mixed-case letters, so I wanted to double-check that the program supports this.
Thanks!