jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

Support for EggNOG version 5.0.2 #506

Open mdehollander opened 2 years ago

mdehollander commented 2 years ago

Is it possible to use the latest version of the eggNOG database, version 5.0.2, with SqueezeMeta? It looks like the version currently in use is 4.5 (https://github.com/jtamames/SqueezeMeta/issues/372#issuecomment-941170659), which was released in 2016. This is the version that comes with the SqueezeMeta download script.

If the make_databases.pl script is used to build the DIAMOND database via the make_eggnog_db.pl script, version 4.5 is also used: https://github.com/jtamames/SqueezeMeta/blob/master/lib/install_utils/make_eggnog_db.pl#L23. The input file is from 2015:

http://eggnog5.embl.de/download/eggnog_4.5/eggnog4.proteins.all.fa.gz                        18-Mar-2015 10:41      3G

The eggNOG 5.0.2 DIAMOND database contains 21841973 sequences, whereas the eggNOG database provided with SqueezeMeta contains 7749941 sequences.
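For anyone who wants to check sequence counts like these on the source FASTA files, here is a minimal sketch. It assumes the input is a plain or gzip-compressed protein FASTA file and simply counts header lines; it is not how the numbers above were obtained, and the file name in the usage comment is only an example.

```python
import gzip
import sys


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting header lines (starting with '>')."""
    # Assumption: a name ending in .gz means the file is gzip-compressed.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return sum(1 for line in handle if line.startswith(">"))


if __name__ == "__main__":
    # Example usage: python count_fasta.py eggnog4.proteins.all.fa.gz
    for fasta_path in sys.argv[1:]:
        print(fasta_path, count_fasta_records(fasta_path))
```

For an already built DIAMOND database, `diamond dbinfo` should report the sequence count directly, without needing the FASTA file.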

Can the output of the https://github.com/eggnogdb/eggnog-mapper/blob/master/download_eggnog_data.py script be used with SqueezeMeta, so that the latest eggNOG database is used?

jtamames commented 2 years ago

Sure. I had started working on that and formatted the eggNOG 5 database, but forgot to finish and upload it. I will continue working on it in the coming days. Best, J

jtamames commented 2 years ago

In any case, remember that you can easily add and use any database you want. Please check the manual for instructions on how to do so. Best, J