Ecogenomics / GTDBTk

GTDB-Tk: a toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes.
https://ecogenomics.github.io/GTDBTk/
GNU General Public License v3.0

Add environment variable for mash database #607

Open fplazaonate opened 3 weeks ago

fplazaonate commented 3 weeks ago

Hi,

We have installed gtdbtk on our servers, but users do not necessarily know where the gtdbtk reference data and the mash index are located.

For the gtdbtk reference data, we can set and export the environment variable GTDBTK_DATA_PATH. However, this does not solve the problem for the mash index.

What about adding another environment variable for the mash index?
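
To make the request concrete, here is a minimal sketch of the lookup order such a variable could follow: an explicitly passed path wins, otherwise the environment variable is used. The variable name GTDBTK_MASH_DB and the helper resolve_mash_db are hypothetical and only for illustration; neither exists in gtdbtk.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name, used only for illustration.
# gtdbtk does not define it.
MASH_DB_ENV_VAR = "GTDBTK_MASH_DB"


def resolve_mash_db(
    explicit_path: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the mash index path to use, or None if none is configured.

    An explicitly passed path takes precedence over the environment
    variable, so users can still override the server-wide default.
    """
    if explicit_path:
        return explicit_path
    # Treat an unset or empty variable as "not configured".
    return environ.get(MASH_DB_ENV_VAR) or None
```

With this order, an administrator could export the variable once for all users, in the same way as GTDBTK_DATA_PATH, and individual users could still point to another index when needed.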

pchaumeil commented 2 weeks ago

Hi Florian,

FYI, we are currently in the process of removing Mash from the GTDB pipeline and migrating to skani only in the future.

This change is currently being tested, and if the results are conclusive, Mash will no longer be included in the next releases of GTDB-Tk.

Cheers, Pierre

fplazaonate commented 2 weeks ago

Hi Pierre,

I guess the issue would be the same with skani if gtdbtk uses a database of pre-sketched genomes?

pchaumeil commented 1 week ago

Hi Florian, This is correct :) We'll work on making this feature available in the next release of GTDB-Tk.