Open ekg opened 5 years ago
Not just new users - I think I've run into this with some of the structural variant graphs I'm building.
My vote would be for the default pruning level. API-wise, it'd be nice to be able to turn it off with a single flag or modify it with a single delimited argument (like we do for the insert size distribution in map).
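To make the "single delimited arg" idea concrete, here is a minimal sketch of how such a value could be parsed. Everything in it is an assumption for illustration: the `kmer_length:max_edges` layout, the `off` keyword, and the names `PruneSpec` and `parse_prune_spec` are hypothetical and are not taken from vg.

```python
from typing import NamedTuple, Optional


class PruneSpec(NamedTuple):
    # Hypothetical fields; the real set of pruning parameters may differ.
    kmer_length: int
    max_edges: int


def parse_prune_spec(spec: str) -> Optional[PruneSpec]:
    """Parse a colon-delimited pruning spec such as "16:3".

    Returns None when the spec is "off", meaning pruning is disabled.
    Raises ValueError on a malformed spec.
    """
    if spec.strip().lower() == "off":
        return None
    fields = spec.split(":")
    if len(fields) != 2:
        raise ValueError(f"expected 'kmer_length:max_edges' or 'off', got {spec!r}")
    # int() raises ValueError itself if a field is not an integer.
    kmer_length, max_edges = (int(f) for f in fields)
    if kmer_length <= 0 or max_edges <= 0:
        raise ValueError("pruning parameters must be positive")
    return PruneSpec(kmer_length, max_edges)
```

With this shape, one option covers both cases: `off` disables pruning and something like `16:3` overrides the defaults.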
The default pruning levels will introduce a new kind of problem: over-pruning. And it will be hard for users and people evaluating the model to understand what's going on. This is hard to get right.
When building the GCSA2 index, new users are caught 100% of the time by the space explosion of kmer enumeration. vg prune is basically a requirement. At the very least, we should have a pass that checks the graph complexity and estimates good pruning parameters. Alternatively, and simpler for users, we should try to do the pruning automatically so that the indexing doesn't blow up.
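As a rough illustration of what a complexity check could measure, the sketch below counts k-node walks in a toy graph with one base per node. It is not how vg or GCSA2 enumerates kmers; the function names and the bubble-chain graph are made up for the example.

```python
from collections import defaultdict


def count_walks(successors, nodes, k):
    """Count walks of exactly k nodes, starting from any node."""
    # walks[v] = number of walks with j nodes that start at v (j = 1 initially)
    walks = {v: 1 for v in nodes}
    for _ in range(k - 1):
        walks = {v: sum(walks[w] for w in successors.get(v, ())) for v in nodes}
    return sum(walks.values())


def bubble_chain(n_bubbles):
    """Toy graph: SNP-like bubbles (two alternative nodes) joined by shared nodes."""
    successors = defaultdict(list)
    nodes = ["s0"]
    prev = "s0"
    for i in range(n_bubbles):
        alts = [f"a{i}", f"b{i}"]
        join = f"s{i + 1}"
        successors[prev].extend(alts)
        for alt in alts:
            successors[alt].append(join)
        nodes.extend(alts + [join])
        prev = join
    return dict(successors), nodes


# Each bubble doubles the number of walks that span it.
succ, nodes = bubble_chain(8)
spanning_walks = count_walks(succ, nodes, 17)  # 17 nodes span all 8 bubbles
```

A pre-indexing pass along these lines could compare the walk count against a budget and suggest stricter pruning when the count grows much faster than the graph itself.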