GaetanBenoitDev / metaMDBG

MetaMDBG: a lightweight assembler for long and accurate metagenomics reads.
MIT License

Changing parameters for metaMDBG polish #11

Closed. jadeaver closed this issue 4 months ago.

jadeaver commented 4 months ago

I was trying to use Racon but kept running into memory issues (even on my university's HPC cluster) when I came across metaMDBG and its standalone polisher. It's a great tool, and I have already run it successfully without the memory issues I had with Racon. Awesome, thank you!

My questions:

1) I am polishing an ONT long-read assembly with Illumina short reads. However, I noticed that the polisher calls minimap2 with -x map-hifi. I installed metaMDBG from source (using conda) and edited ContigPolisher.hpp to replace -x map-hifi with -x map-ont, but when I run the polisher, the log file still shows minimap2 being run with -x map-hifi. Did I edit the right file, or is there another file I need to change?

2) I also noticed that I have nearly the same number of contigs after polishing, but ~40% fewer base pairs and fewer long contigs (see below). I know this can be normal with Racon because unpolished contigs aren't returned by default, but Racon has the -u option to include them. Is there a similar flag for this polisher?

| contigs length | before | after |
| --- | --- | --- |
| contigs (>= 0 bp) | 35804 | 35798 |
| contigs (>= 1000 bp) | 35718 | 35442 |
| contigs (>= 5000 bp) | 34601 | 27176 |
| contigs (>= 10000 bp) | 25810 | 16420 |
| contigs (>= 25000 bp) | 10904 | 5982 |
| contigs (>= 50000 bp) | 4117 | 1705 |
| Total length (>= 0 bp) | 1.02E+09 | 5.67E+08 |
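For anyone who wants to produce the same kind of before/after summary for their own assembly, here is a minimal Python sketch. It is not part of metaMDBG or Racon; the function names `read_fasta_lengths` and `length_summary` are hypothetical, and the thresholds are simply copied from the numbers above.

```python
from typing import Dict, Iterable, List

# Length cutoffs copied from the summary above (an assumption for illustration).
THRESHOLDS = [0, 1000, 5000, 10000, 25000, 50000]


def read_fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the sequence length of each record in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read; None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def length_summary(lengths: List[int]) -> Dict[str, int]:
    """Count contigs at or above each cutoff and sum the total length."""
    summary = {
        f"contigs (>= {t} bp)": sum(1 for n in lengths if n >= t)
        for t in THRESHOLDS
    }
    summary["Total length (>= 0 bp)"] = sum(lengths)
    return summary
```

Running `length_summary(read_fasta_lengths(open(path)))` on the unpolished and polished FASTA files (with `path` pointing at each in turn) gives two dictionaries that can be compared row by row.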

GaetanBenoitDev commented 4 months ago

The polisher options are a bit limited, and unfortunately I won't be able to update it soon. Editing the minimap2 command should work. If you compile from source, you then need to run the executable that the build produces: ./bin/metaMDBG in the build folder.
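A quick way to check which executable a pipeline will actually run is sketched below in Python. This is only an illustration: the helper name `pick_metamdbg` is hypothetical, and the only path taken from this thread is bin/metaMDBG inside the build folder.

```python
import shutil
from pathlib import Path
from typing import Optional


def pick_metamdbg(build_dir: str) -> Optional[str]:
    """Prefer the freshly compiled binary; otherwise fall back to PATH."""
    local = Path(build_dir) / "bin" / "metaMDBG"
    if local.is_file():
        # This is the binary that contains any edits made to the source.
        return str(local)
    # May return a previously installed copy that lacks the edits, or None.
    return shutil.which("metaMDBG")
```

If the returned path is not inside the build folder, the edited source is not the code being run, which would explain a log that still shows the old minimap2 preset.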

If you polish with short reads, you should use the minimap2 option -x sr instead. Note, however, that the polisher won't handle short reads optimally, because it doesn't use their pairing information.
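To make the preset choice concrete, here is a small Python sketch that maps a read type to the minimap2 preset discussed in this thread. It is not the command the polisher builds internally; the function name `minimap2_command`, the read-type keys, and the thread count are assumptions for illustration. The presets themselves (map-hifi, map-ont, sr) and the -x and -t options are standard minimap2 options.

```python
from typing import List

# Read type -> minimap2 preset. The dictionary keys are hypothetical labels.
PRESETS = {"hifi": "map-hifi", "ont": "map-ont", "short": "sr"}


def minimap2_command(read_type: str, contigs: str, reads: str,
                     threads: int = 4) -> List[str]:
    """Build an argument list for mapping reads onto contigs with minimap2."""
    if read_type not in PRESETS:
        raise ValueError(f"unknown read type: {read_type!r}")
    # minimap2 takes the target (contigs) first, then the query (reads).
    return ["minimap2", "-x", PRESETS[read_type],
            "-t", str(threads), contigs, reads]
```

For example, `minimap2_command("short", "contigs.fasta", "reads.fastq")` yields an argument list using -x sr, which could be passed to `subprocess.run` to test the mapping step on its own.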

jadeaver commented 4 months ago

Makes sense - thank you!