transcript / samsa2

SAMSA pipeline, version 2.0. An open-source metatranscriptomics pipeline for analyzing microbiome data, built around DIAMOND and customizable reference databases.
GNU General Public License v3.0

full_database_download.bash which versions? #69

Closed by agavriilidou 3 years ago

agavriilidou commented 3 years ago

Hi, could you please let me know which versions of the databases are downloaded by the full_database_download.bash script?

Thanks. MG

transcript commented 3 years ago

Hello,

You can see the database versions on the Zenodo site, here: https://zenodo.org/record/5022377/

For clarity, and for others who may find this, they are:

SEED Subsystems database in Fasta format, Sept 22, 2017, subsys_db.fa.bz2

DIAMOND version of SEED Subsystems database, Sept 22, 2017, subsys_db.fa.bz2

RefSeq bacterial sequences in Fasta format, Sept 26, 2017, RefSeq_bac.fa.bz2

DIAMOND version of RefSeq bacterial sequences, Sept 26, 2017, RefSeq_bac.dmnd.bz2
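If you want to grab one of these archives by hand rather than through the script, something like the sketch below may work. It assumes Zenodo serves each file under `<record>/files/<name>`; that URL layout is my assumption, so check the record page for the actual download links. `zenodo_url` and `fetch_db` are hypothetical helper names, not part of SAMSA2.

```shell
#!/usr/bin/env bash
# Sketch only. The "/files/<name>" URL layout is an assumption about Zenodo;
# confirm the real download links on the record page before relying on it.
ZENODO_RECORD="https://zenodo.org/record/5022377"

# Build the assumed download URL for one archive name (hypothetical helper).
zenodo_url() {
    printf '%s/files/%s\n' "${ZENODO_RECORD%/}" "$1"
}

# Download one archive and decompress it in place (hypothetical helper).
# bunzip2 replaces NAME.bz2 with NAME once decompression succeeds.
fetch_db() {
    local name="$1"
    wget -O "$name" "$(zenodo_url "$name")" && bunzip2 -f "$name"
}

# Example (not run automatically, the files are large):
# fetch_db RefSeq_bac.dmnd.bz2
```

The download is left commented out so that sourcing the file only defines the helpers.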

SEED Subsystems is, sadly, a dead database and won't be receiving further updates. You can, however, pull an updated set of RefSeq bacterial sequences from NCBI's FTP site and compile it into a new DIAMOND database for organism or specific-function annotation.
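As a rough sketch of that rebuild: the snippet below assumes you have already downloaded gzipped protein FASTA files (`*.faa.gz`) from NCBI into a local directory. The directory name, the output names, and the helper functions `merge_fasta` and `build_diamond_db` are placeholders of mine, not something SAMSA2 ships. The only DIAMOND call used is `diamond makedb --in <fasta> --db <prefix>`, which expects protein sequences and writes `<prefix>.dmnd`.

```shell
#!/usr/bin/env bash
# Sketch only. Assumes protein FASTA files were already downloaded from NCBI
# into a local directory as *.faa.gz; all file and directory names here are
# placeholders.

# Concatenate every gzipped FASTA file in a directory into one plain FASTA file.
merge_fasta() {
    local src_dir="$1" out_fasta="$2" f
    : > "$out_fasta"                    # start from an empty output file
    for f in "$src_dir"/*.faa.gz; do
        [ -e "$f" ] || continue         # skip the literal pattern if nothing matches
        gzip -dc "$f" >> "$out_fasta"
    done
}

# Build a DIAMOND protein database; DIAMOND writes <db_prefix>.dmnd.
build_diamond_db() {
    local in_fasta="$1" db_prefix="$2"
    diamond makedb --in "$in_fasta" --db "$db_prefix"
}

# Example (not run automatically):
# merge_fasta refseq_download RefSeq_bac_new.fa
# build_diamond_db RefSeq_bac_new.fa RefSeq_bac_new
```

The pipeline scripts would then need to be pointed at the new `.dmnd` file in place of the one from the Zenodo download.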

-Sam