pierrepo / PBxplore

A suite of tools to explore protein structures with Protein Blocks :snake:
https://pbxplore.readthedocs.org/en/latest/
MIT License

Make PBcount more modular #49

Closed jbarnoud closed 9 years ago

jbarnoud commented 9 years ago

This pull request splits PBcount into functions and moves the generic ones to PBlib. It thereby improves the modularity of PBxplore and makes the calculation of an occurrence matrix available from the library. This improved modularity is a step toward the #25 proposal.

Note that this pull request expects pull request #48 to be merged first, even though there is no strict dependency. Indeed, #48 defines regression tests for PBcount, and these tests should pass with the changes introduced here.

Four new functions appear in PBlib:

In addition to these functions, the pull request introduces two new exceptions:

It is now possible to read a PDB file, assign the corresponding PB sequence, and calculate an occurrence matrix using the PBxplore Python library:

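import PDBlib as PDB
import PBlib as PB

pb_seq = []
# Read the structure file from the demo directory
pdb = PDB.PDB('demo1/2LFU.pdb')
for chain in pdb.get_chains():
    # Compute the backbone dihedral angles, then assign a PB to each residue
    dihedrals = chain.get_phi_psi_angles()
    pb_seq.append(PB.assign(dihedrals,
                            PB.REFERENCES))
# Count the occurrences of each PB at each position
pb_count = PB.count_matrix(pb_seq)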

In the example above, pb_seq is the list of the PB sequences, one for each model in 2LFU.pdb, and pb_count is the corresponding occurrence matrix. In the occurrence matrix, each row corresponds to a position, and each column corresponds to a PB (in alphabetical order).
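
To make the layout of the occurrence matrix concrete, here is a minimal pure-Python sketch of how such a matrix could be built from a list of PB sequences. It is not the PBlib implementation: the names count_matrix_sketch and PB_NAMES are hypothetical, the sequences are made-up toy data, and skipping characters that are not one of the 16 PBs (such as Z) is an assumption.

```python
# The 16 Protein Blocks, in alphabetical order (one column each).
PB_NAMES = "abcdefghijklmnop"


def count_matrix_sketch(pb_sequences):
    """Count how often each PB occurs at each position.

    Hypothetical stand-in for illustration; not the PBlib code.
    Returns a list of rows (one per position), each row holding
    16 counts (one per PB, in alphabetical order).
    """
    if not pb_sequences:
        raise ValueError("no PB sequence provided")
    length = len(pb_sequences[0])
    if any(len(seq) != length for seq in pb_sequences):
        raise ValueError("all PB sequences must have the same length")

    counts = [[0] * len(PB_NAMES) for _ in range(length)]
    for seq in pb_sequences:
        for position, block in enumerate(seq):
            column = PB_NAMES.find(block)
            # Assumption: characters that are not a PB (e.g. 'Z') are skipped.
            if column >= 0:
                counts[position][column] += 1
    return counts


# Toy data: three made-up PB sequences of equal length.
sequences = ["ZZdddfklZZ", "ZZdddfkmZZ", "ZZdehfklZZ"]
matrix = count_matrix_sketch(sequences)

# Row 2 is the third position; the column for 'd' counts the 'd' blocks there.
print(matrix[2][PB_NAMES.index("d")])
```

Dividing each count by the number of sequences would turn a row into per-position PB frequencies.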