RaptorX-Property: a Standalone Package for Protein Structure Property Prediction

=========== WARNING!!!!

This package is out of date. It has been split into two packages:

1) TGT_Package https://github.com/realbigws/TGT_Package

2) Predict_Property https://github.com/realbigws/Predict_Property

The TGT_Package is used for generating the TGT file from a given sequence in FASTA format.

The Predict_Property is used for predicting protein local properties from a given TGT file or FASTA file.

Thus, the RaptorX_Property_Fast module won't be updated anymore.

=================================== RaptorX Property Standalone Package (v1.02) 2018.05.18

Title: RaptorX-Property: a Standalone Package for Protein Structure Property Prediction

Author: Sheng Wang

Email: realbigws@gmail.com

============ Publication:

[1] RaptorX-Property: a Web Server for Protein Structure Property Prediction. Sheng Wang#, Wei Li, Shiwang Liu, Jinbo Xu#. Nucleic Acids Research, 2016

[2] Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields. Sheng Wang#, Jian Peng, Jianzhu Ma, Jinbo Xu#. Scientific Reports, 2016

[3] AUCpreD: Proteome-level Protein Disorder Prediction by AUC-maximized Deep Convolutional Neural Fields. Sheng Wang#, Jianzhu Ma, Jinbo Xu#. ECCB, 2016; Bioinformatics, 2016

[4] AUC-maximized Deep Convolutional Neural Fields for Protein Sequence Labeling. Sheng Wang, Siqi Sun, Jinbo Xu#. ECML/PKDD, 2016

======== Install:

  1. download the package

git clone https://github.com/realbigws/RaptorX_Property_Fast
cd RaptorX_Property_Fast


  2. compile

cd source_code/
make
cd ../


  3. setup

./setup.pl

[note]: run the above command once for configuration before you run anything else.

========= Database:

  1. if databases/uniprot20 does not exist, create it with

mkdir -p databases/uniprot20

  2. download the UniProt20 database from the following link:

http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/old-releases/uniprot20_2016_02.tgz

uncompress it, and move all files (or symbolic links to them) into databases/uniprot20_2016_02

  3. if another version of UniProt20 is used, pass the '-d uniprot20_XXXX' option to ./Fast_TGT.sh

  4. note that the newer UniClust30 database can also be used, for example: http://wwwuser.gwdg.de/~compbiol/uniclust/2017_10/uniclust30_2017_10_hhsuite.tar.gz
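
The "symbolic link" option above can be sketched as a small shell helper. This is only an illustration, assuming the archive has already been unpacked somewhere else on disk; the function name link_db_files and the source path in the usage comment are hypothetical and not part of this package.

```shell
# Hypothetical helper (not shipped with this package): link every file of an
# already-unpacked database directory into a target directory.
link_db_files() {
    local src_dir="$1"     # where the archive was unpacked (assumed to exist)
    local dest_dir="$2"    # e.g. databases/uniprot20_2016_02
    local src_abs f

    mkdir -p "$dest_dir" || return 1
    # Resolve to an absolute path so the links stay valid from any directory.
    src_abs="$(cd "$src_dir" && pwd)" || return 1

    for f in "$src_abs"/*; do
        [ -e "$f" ] || continue            # skip the unexpanded glob of an empty dir
        ln -sf "$f" "$dest_dir/$(basename "$f")" || return 1
    done
}

# Usage (the source path is an assumption):
# link_db_files /data/uniprot20_2016_02 databases/uniprot20_2016_02
```

Linking instead of moving keeps one copy of the database on disk when several tools share it.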

======= Server:

  1. Please try our RaptorX Property server at: http://raptorx.uchicago.edu/StructurePropertyPred/predict/

  2. Users may also submit their jobs with the 'curl' command. For example,

curl --form jobname=test_job --form email=user@domain --form sequences=ENIEVHMLNKGAEGAMVFEPAYIKANPGDTVTFIPVDKGHNVESIKDMIPEGAEKFKSKINENYVLTVTQPGAYLVKCTPHYAMGMIALIAVGDSPANLDQIVSAKKPKIVQERLEKVIASAK http://raptorx.uchicago.edu/StructurePropertyPred/curl/

[note]: (i) the 'email' is NOT required. (ii) the 'job_url' and 'down_url' will be displayed after the 'curl' command is executed.

  3. I have also implemented a simplified command-line version of 'curl'. See https://github.com/realbigws/curl_command
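
When the sequence lives in a FASTA file, the 'curl' submission above needs the residues as one string. A minimal sketch, assuming the file holds a single record; the helper names fasta_to_sequence and submit_job are hypothetical, and only the form fields and URL shown above are used ('email' is left out because it is not required).

```shell
# Hypothetical helpers, not part of this package.

# Print the residues of a single-record FASTA file as one line:
# drop '>' header lines, then strip all whitespace and line breaks.
fasta_to_sequence() {
    grep -v '^>' "$1" | tr -d ' \t\r\n'
}

# Submit one FASTA file to the server endpoint quoted in this README.
submit_job() {
    local fasta="$1"
    local jobname="$2"
    local seq

    seq="$(fasta_to_sequence "$fasta")"
    if [ -z "$seq" ]; then
        echo "no sequence found in $fasta" >&2
        return 1
    fi
    curl --form "jobname=${jobname}" --form "sequences=${seq}" \
        http://raptorx.uchicago.edu/StructurePropertyPred/curl/
}

# Usage:
# submit_job example/T0530.fasta test_job
```

A file with several records would be concatenated into one sequence by this sketch, so split such files first.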

====== Usage:

./oneline_command.sh <input_fasta> <out_dir> [cpu_number] [PROF_or_NOT]


Here the first argument is the input sequence in FASTA format, the second is the output directory, the third is the number of CPUs, and the fourth selects whether to use the sequence profile (1) or not (0).


./PDBTM_Topology_Pred.sh -i example/1bhaA.tgt -l example/1bhaA.pdbtm

This module predicts transmembrane topology.

================ Running example:

  1. use sequence profile:

cp example/T0530.fasta .
cp example/T0530.tgt .
./oneline_command.sh T0530.fasta tmp 1 1

or,

./oneline_command.sh example/T0530.fasta tmp 1 1


  2. do not use sequence profile:

./oneline_command.sh example/T0530.fasta tmp 1 0

The output files can be found in the tmp/T0530/ folder.
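
To run the same command over several FASTA files, a loop along the following lines may help. It is a sketch only: the function run_batch and the RAPTORX_CMD variable are hypothetical, and it assumes that ./oneline_command.sh returns a non-zero exit status on failure, which is not confirmed here.

```shell
# Hypothetical wrapper, not part of this package.
# RAPTORX_CMD can be pointed at another script for a dry run.
RAPTORX_CMD="${RAPTORX_CMD:-./oneline_command.sh}"

# run_batch OUT_DIR CPU_NUMBER PROF_OR_NOT FASTA...
run_batch() {
    local out_dir="$1"
    local cpu="$2"
    local prof="$3"
    local fasta
    local failed=0
    shift 3

    for fasta in "$@"; do
        # Same argument order as the running examples above.
        if ! "$RAPTORX_CMD" "$fasta" "$out_dir" "$cpu" "$prof"; then
            echo "failed: $fasta" >&2
            failed=$((failed + 1))
        fi
    done
    # Succeed only if every input was processed without error.
    [ "$failed" -eq 0 ]
}

# Usage:
# run_batch tmp 1 1 example/T0530.fasta
```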

============= Output files:

  1. overall detailed results: SeqID.all

e.g., file 'T0530.all' in the tmp/T0530/ folder. This file contains all the detailed prediction results for Secondary Structure Elements (SS3 and SS8), Solvent Accessibility (ACC), and Order/Disorder prediction (DISO).


  2. detailed results in separate files: SeqID.ss3 SeqID.ss8 SeqID.acc SeqID.diso

These files contain the detailed prediction results in the form of probabilities.


  3. simple results in separate files: SeqID.ss3_simp SeqID.ss8_simp SeqID.acc_simp SeqID.diso_simp

These files contain the simple prediction results in one line.
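
After a run, the file names listed above can be checked with a short shell sketch. The helper names expected_outputs and missing_outputs are hypothetical, and the sketch assumes every listed file is written for each run and that SeqID matches the folder name, as in tmp/T0530/.

```shell
# Hypothetical helpers, not part of this package.

# Print the output file names listed in this README for one SeqID.
expected_outputs() {
    local id="$1"
    local ext
    for ext in all ss3 ss8 acc diso ss3_simp ss8_simp acc_simp diso_simp; do
        echo "${id}.${ext}"
    done
}

# Print every expected file that is absent or empty in DIR.
missing_outputs() {
    local dir="$1"
    local id="$2"
    local name
    expected_outputs "$id" | while read -r name; do
        [ -s "${dir}/${name}" ] || echo "$name"
    done
}

# Usage:
# missing_outputs tmp/T0530 T0530
```

An empty result means all nine files are present and non-empty.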