srjun / panfp

GNU General Public License v3.0

PanFP is a Python pipeline to predict pangenome-based functional profiles for microbial communities.

Requirements

PanFP depends on several Python libraries, which are listed in a requirements file so they can be installed in one step. Make sure pip is installed first, then run:

pip3 --version                      # check whether pip is installed
sudo apt-get install python3-pip    # install pip if it is missing (Debian/Ubuntu)
pip3 install -r requirements.txt    # install the libraries PanFP requires

Installation & Help

Download this repository and run:

python3 setup.py install

You may need to run this command with sudo. Once installed, panfp should be available from anywhere in your terminal.

If you need to install the package in a specific directory on your system, pass the --install-lib argument followed by the directory path:

python3 setup.py install --install-lib /custom/path/

Example

Running an experiment requires the following arguments:

-d [database of reference genomes with functional annotation] [here]
-a [directory which contains functional profiles of genomes in database] [here]
-i [otu-sample table]

To see additional arguments:

bin/panfp --help
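
For scripted runs, the three required arguments can be assembled in Python before calling the tool. This is only a minimal sketch: the helper name build_panfp_command and the example paths are hypothetical, and it assumes panfp is on your PATH after installation. Only the -d, -a and -i options come from the list above.

```python
import shlex
from typing import List


def build_panfp_command(database: str, profiles_dir: str, otu_table: str,
                        executable: str = "panfp") -> List[str]:
    """Assemble an argument list from the three required options.

    Hypothetical helper: only -d, -a and -i are taken from the README;
    the executable name is an assumption (use "bin/panfp" from a checkout).
    """
    return [executable, "-d", database, "-a", profiles_dir, "-i", otu_table]


# Placeholder paths, not files shipped with the repository.
cmd = build_panfp_command("reference_db", "functional_profiles", "otu_table.tsv")
print(shlex.join(cmd))

# To actually launch the pipeline (requires PanFP to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.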

As an example, we include an example script [here] showing the full panfp workflow, and an example otu-sample table [here].

Note that the input otu-sample table should be in tab-delimited format, as follows:

#OTU ID	S1	S2	...	S10	Lineage
OTU_1	0.0	10.0	...	2.0	k__Bacteria; p__Proteobacteria; c__Betaproteobacteria; o__MND1; f__
OTU_2	4.0	430.0	...	24.0	k__Bacteria; p__Proteobacteria; c__Betaproteobacteria; o__; f__; g__; s__
...	...	...	...	...	k__Bacteria;p__Cyanobacteria;c__Oxyphotobacteria
OTU_99	1.0	5.0	...	0.0	k__Bacteria;p__Chloroflexi;c__
OTU_100	0.0	35.0	...	2.0	k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae; g__Gluconacetobacter; s__liquefaciens

where the first column contains the OTU IDs, the numeric columns contain the raw 16S rRNA frequency of each OTU in each sample, and the last column contains the lineage of each OTU.
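
Before running the pipeline, it can help to check that a table follows this layout. The sketch below is not part of PanFP. The function name read_otu_table is hypothetical, and it assumes a header row, one row per OTU, and semicolon-separated lineage ranks with prefixes such as k__ and p__.

```python
import csv
import io
from typing import Dict, List, TextIO, Tuple


def read_otu_table(handle: TextIO) -> Tuple[List[str], Dict[str, List[float]], Dict[str, List[str]]]:
    """Parse a tab-delimited otu-sample table (hypothetical helper).

    Returns the sample names, the per-OTU counts, and the per-OTU lineage
    split into ranks.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    if len(header) < 3:
        raise ValueError("expected an OTU id column, sample columns and a lineage column")
    samples = header[1:-1]  # everything between the id and the lineage
    counts: Dict[str, List[float]] = {}
    lineages: Dict[str, List[str]] = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(f"row for {row[0]!r} has {len(row)} fields, expected {len(header)}")
        otu_id = row[0]
        counts[otu_id] = [float(value) for value in row[1:-1]]
        # Ranks are separated by ';' with optional surrounding spaces.
        lineages[otu_id] = [rank.strip() for rank in row[-1].split(";") if rank.strip()]
    return samples, counts, lineages


# Made-up two-sample table in the layout described above.
example = (
    "#OTU ID\tS1\tS2\tLineage\n"
    "OTU_1\t0.0\t10.0\tk__Bacteria; p__Proteobacteria; c__Betaproteobacteria\n"
    "OTU_2\t4.0\t430.0\tk__Bacteria;p__Chloroflexi;c__\n"
)
samples, counts, lineages = read_otu_table(io.StringIO(example))
print(samples)
print(counts["OTU_1"])
print(lineages["OTU_2"])
```

A row whose field count differs from the header usually means the file is space-delimited or is missing its lineage column, so the sketch raises an error instead of guessing.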


Output Information:

The following files are generated in the following order:

Contact

This project was developed entirely by the Translational Bioinformatics group - Jun Lab.

If you run into a problem at any step of the program, use the 'Issues' page of this repository or contact Se-Ran Jun.

License

PanFP is released under the GNU General Public License. Please check LICENSE for further information.

[2020] - Se-Ran Jun - All Rights Reserved