maickrau / GraphAligner

MIT License
Save seed index to disk #49

Open schorlton opened 2 years ago

schorlton commented 2 years ago

Thanks for the great software. Is there a way to save the seed index to disk so it doesn't need to be generated each time?

I see a hidden option for --seeds-mxm-cache-prefix (https://github.com/maickrau/GraphAligner/blob/02c8e2628bba16425dc58cdf67199319f0a7a304/src/AlignerMain.cpp#L87), but is it actually enabled? I don't see any files generated when using this option.

Alternatively, is there another way to cache my seeds to disk?

Thanks!

maickrau commented 2 years ago

That option caches the seed index, but only if a MUM/MEM seeding index is chosen. The default parameters use a minimizer seeding index, and there is currently no way to cache the minimizer index.
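For illustration, here is a minimal Python sketch of what switching to MUM seeding with a cache prefix might look like, plus a check for whether any cache files appeared. It only builds the argument list and does not run anything. The flag names `--seeds-mum-count` and `--seeds-mxm-length` are assumed from the GraphAligner README and should be confirmed against `GraphAligner --help` for your version. The helper names, file paths, and numeric values are hypothetical placeholders, not recommended settings, and the assumption that cache files share the given prefix is unverified.

```python
import glob
from typing import List


def build_mum_command(graph: str, reads: str, out: str, cache_prefix: str,
                      mum_count: int, min_seed_len: int) -> List[str]:
    """Build (but do not run) a GraphAligner argv that selects MUM seeding.

    Assumption: flag names follow the GraphAligner README; verify them
    with `GraphAligner --help` before relying on this.
    """
    return [
        "GraphAligner",
        "-g", graph,                               # input graph
        "-f", reads,                               # input reads
        "-a", out,                                 # output alignments
        "--seeds-mum-count", str(mum_count),       # choose MUM seeding
        "--seeds-mxm-length", str(min_seed_len),   # minimum MUM/MEM length
        "--seeds-mxm-cache-prefix", cache_prefix,  # where the index is cached
    ]


def cache_files(cache_prefix: str) -> List[str]:
    """List files whose path starts with cache_prefix.

    Assumption: the cached index is written to files sharing this prefix.
    """
    return sorted(glob.glob(glob.escape(cache_prefix) + "*"))


if __name__ == "__main__":
    # Placeholder paths and values, chosen only for illustration.
    cmd = build_mum_command("graph.gfa", "reads.fq", "aln.gaf",
                            "index/mxm_cache", mum_count=1000, min_seed_len=20)
    print(" ".join(cmd))
    print(cache_files("index/mxm_cache"))
```

If `cache_files` returns an empty list after a run with the default minimizer seeding, that matches the behavior described above: the cache prefix is only used when a MUM/MEM index is built.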

schorlton commented 2 years ago

Got it! Do you have recommended presets for MUM/MEM seeds for aligning long, error-prone reads (up to 15-20% error rate) to a de Bruijn graph (DBG)? Any idea whether minimizers or precached MUMs/MEMs will be faster when aligning to tens of thousands of bacterial genomes?