Closed HubLot closed 8 years ago
Seems good to me @HubLot !
Quite a huge commit here. The details are in the commit message, but here are a few more explanations:
Now, all the visualization functions (neq, map and weblogo) have been moved to PBlib.py and take as input the output of count_matrix in PBlib.py, so a user can (as stated previously):
# An example
import PBlib as PB
import PDBlib as PDB  # assumed: the module that provides chains_from_files

chains = PDB.chains_from_files(["demo2_tmp/psi_md_traj_1.pdb"])
seqs = []
for comment, chain in chains:
    dihedrals = chain.get_phi_psi_angles()
    sequence = PB.assign(dihedrals)
    seqs.append(sequence)
pb_count = PB.count_matrix(seqs)
neq = PB.compute_neq(pb_count)
# Modification of the parameter's order
PB.plot_neq("psi_md_neq.png", neq)
# For sub plots
PB.plot_neq("psi_md_neq_1-10.png", neq, residue_min=1, residue_max=10)
I have also added two functions to the library:
compute_freq_matrix, which computes the frequency matrix needed for neq and map from an occurrence matrix. It is called from the visualization functions so that it stays transparent for the API user.
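To make the occurrence-to-frequency step concrete, here is a minimal sketch, not the code from this PR. The names freq_matrix_sketch and neq_sketch are hypothetical, the normalization by row total is an assumption, and the toy matrix uses 4 columns instead of the 16 PBs. Neq is taken as the exponential of the Shannon entropy of the frequencies at each position.

```python
import math


def freq_matrix_sketch(count_matrix):
    """Turn per-position occurrence counts into per-position frequencies.

    Hypothetical helper: each row is one position, each column one PB.
    """
    freq = []
    for row in count_matrix:
        total = sum(row)
        if total == 0:
            raise ValueError("a position has no observations")
        freq.append([count / total for count in row])
    return freq


def neq_sketch(freq_matrix):
    """Neq per position: exp of the Shannon entropy of the frequencies."""
    neq = []
    for row in freq_matrix:
        # Zero frequencies contribute nothing to the entropy.
        entropy = -sum(f * math.log(f) for f in row if f > 0)
        neq.append(math.exp(entropy))
    return neq


# Toy occurrence matrix: 3 positions, 4 columns, 4 sequences.
counts = [[4, 0, 0, 0], [2, 2, 0, 0], [1, 1, 1, 1]]
freqs = freq_matrix_sketch(counts)
neqs = neq_sketch(freqs)
```

A position where a single PB is always observed gives a Neq of 1, and a position where all columns are equally populated gives a Neq equal to the number of columns.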
The main issue I encountered is how to deal with the residue-min and residue-max options from the PBstat.py CLI, and how to correctly slice the values for the visualization functions, because from an API user's point of view PB.count_matrix() doesn't deal with residue indexes. The new function _slice_matrix(matrix, residue_min=1, residue_max=None) aims to resolve this problem by slicing a given matrix with the boundaries given as parameters. It checks that the boundaries are valid and returns the sliced matrix.
This function is only called by the visualization functions (write_neq, plot_neq, plot_map) because it's not worth handling sliced matrices (with correct offset and max boundaries) for the whole API.
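As an illustration of the slicing described above, here is a minimal sketch. It is not the _slice_matrix from this PR: the name slice_matrix_sketch is hypothetical, and treating the boundaries as 1-based and inclusive is an assumption suggested by the residue_min=1 default.

```python
def slice_matrix_sketch(matrix, residue_min=1, residue_max=None):
    """Return the rows for residues residue_min..residue_max.

    Assumption: boundaries are 1-based and inclusive, and each row of
    the matrix corresponds to one residue.
    """
    n_rows = len(matrix)
    if residue_max is None:
        # No upper boundary given: go up to the last residue.
        residue_max = n_rows
    if residue_min < 1 or residue_max < 1:
        raise ValueError("residue boundaries must be >= 1")
    if residue_min > residue_max:
        raise ValueError("residue_min must not exceed residue_max")
    if residue_max > n_rows:
        raise ValueError("residue_max is beyond the last residue")
    # Convert the 1-based inclusive range to a Python slice.
    return matrix[residue_min - 1:residue_max]


rows = [[1, 0], [0, 1], [2, 2], [3, 1]]
middle = slice_matrix_sketch(rows, residue_min=2, residue_max=3)
everything = slice_matrix_sketch(rows)
```

Keeping the validation in a single helper means the plotting functions can share it, while the rest of the API keeps working on full matrices.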
Now, PBstat.py is shorter and doesn't differ from the API.
However, the weblogo image is still created the old way (that's why there is a check_residue_range function inside the module). The reason is that I would like to use the weblogo API instead of the binary to generate images (see #78).
:+1:
At some point we should split PBlib into several files. I do not like requiring matplotlib just to do a PB assignment. This is out of the scope of this PR, though.
The last commit solves issue #78. Now, the weblogo images are generated through the weblogo API.
The call to the function generate_weblogo is no different from the API.
This function and read_occurence_matrix have been moved to PBlib.py.
With this last commit, the PR can now be reviewed and merged if it's ok.
I agree with @jbarnoud: now that almost all functions are in PBlib.py, we need to split it into several files as proposed in #25.
Thanks guys! Indeed PBlib is becoming bigger and bigger...
This pull request aims to modularize the code of PBstat.py and create an API (as proposed in #25) to use it as a library. This is a Work In Progress, so don't merge it until the work is done.
The goal is to create visualization functions (neq, map and weblogo) that take as input the output of count_matrix in PBlib.py so a user can:
What do you think?
So far, I have only split the module into functions to avoid floating code.
ping @jbarnoud