PGPg_finder is a command-line tool for annotating Plant Growth-Promoting Traits (PGPT) in genomes and metagenomes. It is built on the PLant-associated BActeria web resource (PLaBAse) database and offers a range of pre-configured workflows, so users can select the tools and processes best suited to their research needs.
The workflows currently available are:
genome_wf: A genome annotation workflow that uses Prodigal for gene prediction and DIAMOND for annotation against PLaBAse.
meta_wf: A metagenome workflow with MEGAHIT assembly, Prodigal gene prediction, DIAMOND annotation, Bowtie2 read mapping, and SAMtools quantification.
metafast_wf: A metagenome workflow with PEAR paired-end read merging followed by direct DIAMOND annotation.
Each workflow expects a specific set of input files and produces its own set of output files. The tool can also run on multiple threads (the -t option) to improve performance.
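Because each workflow relies on a different set of external programs, it can help to confirm they are on your PATH before starting a run. The sketch below is not part of PGPg_finder. The `missing_tools` helper is hypothetical, and the executable names (`prodigal`, `diamond`, `megahit`, `bowtie2`, `samtools`, `pear`) are assumed to be the usual command names of the tools listed above.

```python
import shutil

# Tools each workflow relies on, taken from the workflow descriptions above.
# The executable names are assumptions (the usual command names of each tool).
WORKFLOW_TOOLS = {
    "genome_wf": ["prodigal", "diamond"],
    "meta_wf": ["megahit", "prodigal", "diamond", "bowtie2", "samtools"],
    "metafast_wf": ["pear", "diamond"],
}


def missing_tools(workflow, which=shutil.which):
    """Return the tools for `workflow` that cannot be found on PATH."""
    if workflow not in WORKFLOW_TOOLS:
        raise ValueError(f"unknown workflow: {workflow!r}")
    return [tool for tool in WORKFLOW_TOOLS[workflow] if which(tool) is None]


if __name__ == "__main__":
    for name in WORKFLOW_TOOLS:
        missing = missing_tools(name)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{name}: {status}")
```

If you installed the dependencies with conda, run the check inside the PGPg_finder environment so that PATH reflects it.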
To use the tool, you first need to clone the repository and ensure that all dependencies are installed.
git clone https://github.com/tpellegrinetti/PGPg_finder/
Next, install the dependencies with conda (recommended). The install script creates a separate conda environment called PGPg_finder.
bash install.sh
If conda does not work for you, you can install the dependencies manually, without a separate environment, and then run PGPg_finder (not recommended).
If you want to learn how to use PGPg_finder, click the link below:
You can run the PGPg_finder tool using the following command:
python PGPg_finder.py -w workflow -i input_directory -o output_directory -t threads
The -a argument is optional; use it if you want to provide your own assembly files.
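When PGPg_finder is launched from another script, building the command as an argument list avoids quoting problems with paths. The following is a minimal sketch that uses only the options shown above (-w, -i, -o, -t, -a). The `build_command` helper is hypothetical, and whether -a expects a file or a folder is not stated here, so the value is passed through unchanged.

```python
import shlex

# Workflow names as listed in this README.
KNOWN_WORKFLOWS = ("genome_wf", "meta_wf", "metafast_wf")


def build_command(workflow, input_dir, output_dir, threads=1, assembly=None):
    """Assemble the PGPg_finder.py command line as a list of arguments."""
    if workflow not in KNOWN_WORKFLOWS:
        raise ValueError(f"unknown workflow: {workflow!r}")
    cmd = [
        "python", "PGPg_finder.py",
        "-w", workflow,
        "-i", str(input_dir),
        "-o", str(output_dir),
        "-t", str(threads),
    ]
    if assembly is not None:
        # -a is optional; the expected value (file or folder) is an assumption.
        cmd += ["-a", str(assembly)]
    return cmd


if __name__ == "__main__":
    cmd = build_command("genome_wf", "/path/to/fasta/folder/",
                        "/path/to/your/desired/out/", threads=12)
    # Print the command instead of running it; to execute it, pass the list
    # to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```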
1) Running PGPg_finder with genomes:
python PGPg_finder.py -w genome_wf -i /path/to/fasta/folder/ -o /path/to/your/desired/out/ -t 12
2) Running PGPg_finder with metagenomes:
a) Fast way (less accurate)
python PGPg_finder.py -w metafast_wf -i /path/to/fasta/folder -o /path/to/your/desired/out/ -t 12
b) Slow way (more accurate)
python PGPg_finder.py -w meta_wf -i /path/to/fasta/folder -o /path/to/your/desired/out/ -t 12
With this workflow, you can provide your own metagenome assemblies using the -a option.
This tool depends on the following libraries and tools:
Please note that this pipeline was developed based on a curated database called "PLant-associated BActeria web resource (PLaBAse)". Access the PLaBAse website: https://plabase.cs.uni-tuebingen.de/
We encourage you to cite PLaBAse:
Patz S, Rauh M, Gautam A, Huson DH. mgPGPT: Metagenomic analysis of plant growth-promoting traits. (submitted 2024, preprint)
Patz S, Gautam A, Becker M, Ruppel S, Rodríguez-Palenzuela P, Huson DH. PLaBAse: A comprehensive web resource for analyzing the plant growth-promoting potential of plant-associated bacteria. (submitted 2021, preprint)
If PGPg_finder was useful for you, please cite us:
Pellegrinetti, TA; Monteiro, G; Lemos, LN; RAC, Santos; Barros, A; Mendes, L. (2024) PGPg_finder: A Comprehensive and User-friendly Pipeline for Identifying Plant Growth-Promoting Genes in Genomic and Metagenomic Data. Rhizosphere.
If you are having trouble downloading the database, you can download it manually here:
Place these files in the database folder of PGPg_finder and run the following commands:
diamond makedb --in PGPT_BASE_nr_Aug2021n_ul_1.fasta.gz --db genome
diamond makedb --in mgPGPT-db_Feb2022_ul_dwnld.fasta.gz --db metagenome
You will get two .dmnd files (genome.dmnd and metagenome.dmnd). If desired, you can then remove the .fasta.gz files.
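After building the databases manually, a quick check that both .dmnd files exist and are not empty can save a failed run. The sketch below assumes the file names that follow from the `--db genome` and `--db metagenome` arguments above (DIAMOND appends the .dmnd extension). The `missing_databases` helper is hypothetical, and `"database"` is a placeholder for the path to the database folder of PGPg_finder.

```python
from pathlib import Path

# File names follow from the `--db genome` and `--db metagenome` arguments;
# DIAMOND appends the .dmnd extension.
EXPECTED_DATABASES = ("genome.dmnd", "metagenome.dmnd")


def missing_databases(database_dir):
    """Return the expected .dmnd files that are absent or empty."""
    folder = Path(database_dir)
    missing = []
    for name in EXPECTED_DATABASES:
        path = folder / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "database" is a placeholder path; adjust it to your installation.
    print(missing_databases("database") or "both DIAMOND databases found")
```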
If you have problems downloading the PLaBAse database, or any other installation issues, please contact us.
For dedicated support with running PGPg_finder, please contact: tpellegrinetti@usp.br