For exchanging material and documentation
ORION: fold recognition method based on profile-profile alignments
scoringProfiles: methods to score profile-profile alignments
HHsearch: remote homology detection using profiles (defined from hidden Markov models, not from PSSM!)
CLADE: remote homology detection using a multi-profile strategy
DOPE: "historical" statistical potential for evaluating the quality of 3D models (does not perform very well but is a good start)
Rosetta: a reference function for estimating the energy of a 3D protein conformation (be aware it's all-atom!)
SBROD: coarse-grained statistical potential to evaluate the quality of 3D models (can be applied to Calpha-only or backbone-only structures, performed well in CASP13, can be completely re-trained!)
ReviewSA: book chapter reviewing methods to predict secondary structure, solvent accessibility, torsion angles and contact maps from sequence information
CCMPRED: method to predict residue-residue contacts within a protein by extracting coevolution signals (co-occurring patterns of mutations across sequences)
DaReUS-Loop: web server for accurate modeling of loops in homology models
Params: directory containing values for the DOPE statistical potential.
Tools: directory containing a PDB parser (in Python) and a script to weight sequences based on their similarity (in Perl).
For each family:
FASTA file containing the master (reference) sequence,
MAP file containing a multiple sequence alignment with the master sequence and homologous (or related) sequences,
SCOP_ID file containing the SCOP identifier(s) of the family,
PDB file containing the 3D coordinates of the master sequence. Please note that the structure may contain "holes" (missing residues that either could not be resolved, or were modified/non-canonical).
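The "holes" mentioned above can be spotted by looking for jumps in the residue numbering of the Calpha atoms. The sketch below is only an illustration: find_gaps is a hypothetical helper (it is not the parser provided in Tools), it relies on the standard fixed-column PDB layout for ATOM records, and it assumes that residues are numbered consecutively, so a jump is a hint rather than a proof of missing residues. Modified residues stored as HETATM records are skipped and therefore show up as gaps too.

```python
def find_gaps(pdb_lines):
    """Return (chain, last_seen, next_seen) for each jump in Calpha numbering.

    Hypothetical helper: assumes consecutive residue numbering within a chain.
    """
    gaps = []
    last = {}  # chain ID -> last residue number seen on that chain
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # only look at the first model
        if not line.startswith("ATOM"):
            continue  # skips HETATM, so modified residues count as holes
        if line[12:16].strip() != "CA":
            continue  # atom name lives in columns 13-16
        chain = line[21]           # chain ID, column 22
        resseq = int(line[22:26])  # residue number, columns 23-26
        prev = last.get(chain)
        if prev is not None and resseq - prev > 1:
            gaps.append((chain, prev, resseq))
        last[chain] = resseq
    return gaps
```

Called on the lines of a family PDB file, the function returns one tuple per suspected hole, giving the chain and the residue numbers on either side of it.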
The query sequences are contained in the file queries398.multifasta. Each query sequence is named as follows:
gluts | Q09596.1 | NP_001254267.1 | 98.0%
gluts: name of the family you should find!
Q09596.1: UniProt code
NP_001254267.1: sequence identifier (you can ignore it)
98.0%: maximum percentage of identity between the query and all sequences from the family you should find! It gives an idea of the level of difficulty associated with the query, but in principle you do not need to use this value in your calculations
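The naming scheme above can be split into its four fields with a few lines of Python. This is a minimal sketch, not part of the provided Tools: QueryHeader, parse_query_header and read_queries are hypothetical names, and the code assumes that each header in queries398.multifasta is a standard FASTA ">" line whose fields are separated by "|" exactly as in the example.

```python
from typing import NamedTuple


class QueryHeader(NamedTuple):
    family: str          # name of the family to find
    uniprot: str         # UniProt code
    seq_id: str          # sequence identifier (can be ignored)
    max_identity: float  # max % identity to the family, as a number


def parse_query_header(header):
    """Split 'family | uniprot | seq_id | identity%' into a QueryHeader."""
    # A leading '>' is tolerated in case the raw FASTA header line is passed.
    fields = [f.strip() for f in header.strip().lstrip(">").split("|")]
    if len(fields) != 4:
        raise ValueError("expected 4 '|'-separated fields, got %d" % len(fields))
    family, uniprot, seq_id, identity = fields
    return QueryHeader(family, uniprot, seq_id, float(identity.rstrip("%")))


def read_queries(lines):
    """Yield (QueryHeader, sequence) pairs from the lines of a multi-FASTA file."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield parse_query_header(header), "".join(chunks)
            header, chunks = line, []
        elif line and header is not None:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield parse_query_header(header), "".join(chunks)
```

Passing an open file handle on queries398.multifasta to read_queries gives one pair per query. The family field is the expected answer, so it is best kept aside for evaluation and not used as input to the method.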