bonsai-team / matam

Mapping-Assisted Targeted-Assembly for Metagenomics
GNU Affero General Public License v3.0

MATAM on conda uses biocore/sortmerna instead of ppericard/sortmerna which makes it slower #85

Open ppericard opened 4 years ago

ppericard commented 4 years ago

ppericard/sortmerna was forked from biocore/sortmerna and modified to change the OpenMP parallel schedule, which greatly reduces sortmerna running time during MATAM scaffolding.
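To illustrate why a loop schedule can matter this much, here is a minimal sketch that simulates two common OpenMP-style policies on an uneven workload. It is not the actual sortmerna patch: this issue does not say which schedule the fork switched to, and the function names and per-task costs below are hypothetical.

```python
import heapq


def static_makespan(costs, n_threads):
    """Total time when tasks are split into contiguous, equal-sized blocks,
    one block per thread (roughly what OpenMP schedule(static) does)."""
    n = len(costs)
    block = -(-n // n_threads)  # ceiling division
    return max(sum(costs[i:i + block]) for i in range(0, n, block))


def dynamic_makespan(costs, n_threads):
    """Total time when each idle thread picks up the next task
    (roughly what OpenMP schedule(dynamic, 1) does)."""
    finish = [0] * n_threads  # current finish time of every thread
    heapq.heapify(finish)
    for cost in costs:
        earliest = heapq.heappop(finish)  # thread that becomes free first
        heapq.heappush(finish, earliest + cost)
    return max(finish)


# Hypothetical workload: a few expensive tasks followed by many cheap ones.
costs = [8, 8, 8, 8] + [1] * 12

print(static_makespan(costs, 4))   # 32: one thread gets all the expensive tasks
print(dynamic_makespan(costs, 4))  # 11: expensive tasks are spread over threads
```

With a static split, the slowest thread dominates the running time whenever costs are unevenly distributed, which is one plausible reason a schedule change could speed up contig mapping.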

However, the conda recipe for MATAM uses the biocore/sortmerna version, which could make it much slower.

We should investigate whether this modification is still valid, especially since newer GCC compilers might already substantially improve the running time of sortmerna 2.1. This should also be revisited when SortMeRNA 3 or 4 becomes available on bioconda.

ppericard commented 4 years ago

On the test dataset, contig mapping takes about twice as long with biocore/sortmerna as with ppericard/sortmerna. Therefore, for now we will keep ppericard/sortmerna as a GitHub dependency until SortMeRNA 3 or 4 is available via bioconda.
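For anyone who wants to reproduce this kind of comparison, a minimal timing sketch follows. The helper names are hypothetical, and the two command lines are left as placeholders because the exact sortmerna invocation used by MATAM is not given in this issue.

```python
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Return the best wall-clock time (seconds) over `repeats` runs of cmd."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the command fails, so a crash is never timed as a success
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def slowdown(baseline_cmd, candidate_cmd, repeats=3):
    """Ratio candidate/baseline; a value near 2.0 means 'about twice the time'."""
    return time_command(candidate_cmd, repeats) / time_command(baseline_cmd, repeats)


if __name__ == "__main__":
    # Placeholder commands: replace each list with the full command line of the
    # two sortmerna builds, run on the same input with the same thread count.
    baseline = [sys.executable, "-c", "pass"]
    candidate = [sys.executable, "-c", "pass"]
    print(f"slowdown: {slowdown(baseline, candidate):.2f}x")
```

Taking the best of several runs reduces noise from disk caching and other processes; both builds should be timed on the same machine.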

As for the bioconda recipe of the current MATAM version, we could either keep the bioconda sortmerna package listed as a dependency, or compile our modified version of sortmerna and ship it as a binary inside the MATAM conda package.

ppericard commented 4 years ago

By then, we may already have replaced SortMeRNA with Minimap2 for contig mapping, as listed in issue #87.

loic-couderc commented 4 years ago

To be more precise, the conda recipes up to v1.5.3 embedded ppericard/sortmerna. Only since v1.6.0 have we been using bioconda/sortmerna.