raghavagps / transfacpred

An ensemble method for predicting transcription factors from protein sequences
http://webs.iiitd.edu.in/raghava/transfacpred
GNU General Public License v3.0

TransFacPred

A highly accurate method for predicting transcription factors from protein sequences.

Introduction

TransFacPred predicts transcription factors (TFs) from protein primary sequence information. It implements a hybrid model that combines an ET-based model with a BLAST search. TransFacPred is also available as a web server at https://webs.iiitd.edu.in/raghava/transfacpred. Please read and cite the TransFacPred publication (see Reference) for complete information, including the algorithm behind TransFacPred.
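This README does not spell out how the ET-based model and the BLAST search are combined. The sketch below is an illustration only: it assumes the classifier emits a probability in [0, 1] and that a BLAST hit shifts that score by a fixed weight of 0.5, which would be consistent with the documented threshold range of -0.5 to 1.5. The function names, the weight, and the `>` comparison are all hypothetical and are not taken from the TransFacPred source.

```python
def hybrid_score(ml_probability, blast_hit=None, blast_weight=0.5):
    """Combine a classifier probability with a BLAST-derived adjustment.

    ml_probability: model output in [0, 1] (assumed, not from the tool).
    blast_hit: "TF", "Non-TF", or None when BLAST reports no hit (assumed labels).
    blast_weight: amount added or subtracted on a hit (assumed value).
    """
    if not 0.0 <= ml_probability <= 1.0:
        raise ValueError("ml_probability must lie in [0, 1]")
    if blast_hit == "TF":
        return ml_probability + blast_weight
    if blast_hit == "Non-TF":
        return ml_probability - blast_weight
    # No BLAST hit: fall back to the classifier score alone.
    return ml_probability


def classify(score, threshold=-0.38):
    """Label a combined score; -0.38 is the tool's documented default threshold."""
    return "TF" if score > threshold else "Non-TF"
```

Under these assumptions, a sequence with a low classifier score is still labelled a TF at the default threshold unless BLAST matches it to a known non-TF, which is one way to read the negative default value.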

PIP Installation

A pip package is also available for easy installation and use of this tool. Install it with the following command:

pip install transfacpred

To see the available options for the pip package, type the following command:

transfacpred -h

Standalone

The standalone version of TransFacPred is written in Python 3, and the following libraries are required for a successful run:

Important: NCBI-BLAST version 2.2.29+ is strongly recommended for optimal compatibility with the hybrid model, as it was developed using this specific version. Utilizing a different version may lead to inconsistencies between the outputs obtained from the web server and the standalone application.

Important Note

Minimum Usage

To see the available options for the standalone version, type the following command:

python3 transfacpred.py -h

To run the example, type the following command:

python3 transfacpred.py -i protein.fa

This predicts whether each submitted sequence is a TF or a Non-TF, using the default values for all other parameters. The output is saved to "outfile.csv" in CSV (comma-separated values) format.
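Once the run finishes, the results file can be inspected with Python's standard `csv` module. The sketch below makes no assumption about the column names the tool writes, since they are not listed here; it only reports the header and the number of result rows. The helper name `summarize_results` is hypothetical.

```python
import csv


def summarize_results(path="outfile.csv"):
    """Return (column_names, row_count) for a CSV results file."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)  # one dict per predicted sequence
        # fieldnames is None for an empty file, so fall back to an empty list.
        return list(reader.fieldnames or []), len(rows)
```

Calling `summarize_results()` with no arguments reads the default "outfile.csv" from the current directory.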

Full Usage

usage: transfacpred.py [-h]
                       [-i INPUT]
                       [-o OUTPUT]
                       [-t THRESHOLD]
                       [-d {1,2}]
Please provide following arguments for successful run

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        Input: File name containing protein or peptide sequence in FASTA format.
  -o OUTPUT, --output OUTPUT
                        Output: File for saving results by default outfile.csv
  -t THRESHOLD, --threshold THRESHOLD
                        Threshold: Value between -0.5 to 1.5 by default -0.38
  -d {1,2}, --display {1,2}
                        Display: 1:Transcription Factors, 2: All Sequences, by default 1
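When the standalone script is called from another Python program, the options above can be assembled into an argument list and checked before launch. This is a minimal sketch, assuming `transfacpred.py` sits in the working directory; `build_command` is a hypothetical helper, and only the flags and defaults shown in the usage text are used.

```python
import sys


def build_command(input_path, output_path="outfile.csv", threshold=-0.38,
                  display=1, script="transfacpred.py"):
    """Build the argument list for a standalone run, validating documented ranges."""
    if not -0.5 <= threshold <= 1.5:
        raise ValueError("threshold must be between -0.5 and 1.5")
    if display not in (1, 2):
        raise ValueError("display must be 1 (TFs only) or 2 (all sequences)")
    return [
        sys.executable, script,   # run with the current Python interpreter
        "-i", input_path,
        "-o", output_path,
        "-t", str(threshold),
        "-d", str(display),
    ]

# Example launch (not executed here):
#   import subprocess
#   subprocess.run(build_command("protein.fa", display=2), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.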

Input File: Allows users to provide input sequences in FASTA format.

Output File: The program saves the results in CSV format. If the user does not provide an output file name, the results are stored in "outfile.csv".

Threshold: The user should provide a threshold between -0.5 and 1.5; the default is -0.38.
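Because the input must be in FASTA format, a quick sanity check before a run can catch malformed files early. The following is a minimal sketch of a FASTA reader written for illustration; it is not the parser used inside TransFacPred, and `read_fasta` is a hypothetical name.

```python
def read_fasta(path):
    """Parse a FASTA file into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            elif header is None:
                raise ValueError("sequence data found before the first '>' header")
            else:
                chunks.append(line)  # sequences may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

An empty list, or a `ValueError`, suggests the file is not valid FASTA and should be fixed before it is passed to `-i`.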

Reference

Patiyal S, Tiwari P, Ghai M, Dhapola A, Dhall A and Raghava GPS (2024) A hybrid approach for predicting transcription factors. Front. Bioinform. 4:1425419.