Closed HubLot closed 7 years ago
I will work on that and also include a test (which we also forget :smile: )
We cannot solve this with only residue_min
and residue_max
arguments in the plotting functions. If residue_min == 20
, we don't know if the index of first residue (in the count matrix) is 1 or 15 hence, it will affect the output images (either we squeeze 20 residues or 5 respectively).
After some thoughts, the easiest way I found to solve this is to add a optional variable called offset
(set to 1 by default) in each plotting functions (plot_map
, plot_neq
/write_neq
and generate_weblogo
). It will represent the index of the first residue in the count matrix. From it, we can deduce the range of residues the user wants (in conjunction with residue_min
/residue_max
) and create the plot images accordingly.
Hi,
We missed an important bug when writing your visualization library. As you know, sometimes the first residue of a PDB is not 1. We did take that in account in the
PBcount
andwrite_count_matrix()
by adding a number of the first residue as an optional parameter (1 by default). It works great and the count matrix can start with a different number (see this test file for example https://github.com/pierrepo/PBxplore/blob/master/pbxplore/tests/test_data/count_multi1_first20.PB.count).Unfortunately, we didn't take it in account properly for the visualizations tools (
PBstat
and the plot functions) :The issue lies in the
_slice_matrix()
function (inutils.py
) were we mixed the index of the first residue from the count matrix (inPBcount
) and the range of residue optionally provided inPbstat
:The numpy matrix (
mat
) doesn't know about the index of the first residue (and nor it should). It always start at 0. When the first residue (in the count matrix) has an index of 1, it works well. We create a range based on it and slice the matrix according toresidue-min
/residue-max
ofPBstat
(or the optional arguments in the plot functions). However, when it's different of 1 (say 252), we base our slicing on this index. If, for example the PDB is ten-residue long, we try to slice it like that :mat[252-1:262]
.It should fails when the user wants a range of residues (252-262) from
PBstat
on a ten-residue PDB indexed from 1 but not when the ten-residue PBD is indexed from 252.In the traceback I included, the count matrix starts at residue 20 (as stated in the output) and has 89 residues. We want plots for the whole thing. So, in
_slice_matrix()
, our range of residue indexes (residues_idx
) goes from 1 to 89 (which is good, we want the whole matrix) but the next test is failing becauseresidue_max
(i.e 128) is not in this range. It's here we mixed the 2 indexes