katholt / srst2

Short Read Sequence Typing for Bacterial Pathogens

Adding utility bash scripts for automating VFDB gene database generation #23

Closed ppcherng closed 9 years ago

ppcherng commented 9 years ago

I added two bash scripts that automate the generation of the VFDB gene databases following the instructions from here: https://github.com/katholt/srst2#using-the-vfbd-virulence-factor-database-with-srst2

get_all_vfdb.sh generates the gene databases for ALL genera found in CP_VFs.ffn.
Usage: bash /srst2/database_clustering/get_all_vfdb.sh ${VFDB_ffn_file} ${OutputDir}
Example usage: bash /srst2/database_clustering/get_all_vfdb.sh ./CP_VFs.ffn ./VFDB_geneDB

get_genus_vfdb.sh generates the gene database for a single specified genus found in CP_VFs.ffn.
Usage: bash /srst2/database_clustering/get_genus_vfdb.sh ${VFDB_ffn_file} ${Genus} ${OutputDir}
Example usage: bash /srst2/database_clustering/get_genus_vfdb.sh ./CP_VFs.ffn Bacillus ./Bacillus_VF_geneDB

These two scripts need to live in the same folder as the other Python scripts in /srst2/database_clustering/, since they call those scripts from that location.

These scripts also require cd-hit to be installed and available on $PATH.
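
The two requirements above (cd-hit on $PATH, scripts sitting in the database_clustering folder) could be checked up front with something like the sketch below. This is only an illustration and not part of the submitted scripts: the function names have_command and check_requirements are hypothetical, and the list of files to look for is whatever the caller passes in.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helpers; NOT part of get_all_vfdb.sh or get_genus_vfdb.sh.

# Return 0 if the named command can be found on $PATH, 1 otherwise.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Report every missing requirement and return non-zero if any is missing.
#   $1      directory expected to hold the database_clustering scripts
#   $2...   file names expected to exist in that directory
check_requirements() {
    local script_dir="$1"
    shift
    local missing=0
    local f

    if ! have_command cd-hit; then
        echo "missing: cd-hit not found on PATH"
        missing=1
    fi

    for f in "$@"; do
        if [ ! -f "$script_dir/$f" ]; then
            echo "missing: $script_dir/$f"
            missing=1
        fi
    done

    return "$missing"
}
```

Example usage, assuming the install location used in the examples above: check_requirements /srst2/database_clustering get_all_vfdb.sh get_genus_vfdb.sh || exit 1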