pierrepo / PBxplore

A suite of tools to explore protein structures with Protein Blocks :snake:
https://pbxplore.readthedocs.org/en/latest/
MIT License

PBstat with `--map` option crashes when the number of residues is < 20 #146

Closed HubLot closed 7 years ago

HubLot commented 7 years ago

With a PB count file containing fewer than 20 residues, map generation fails:

$ PBstat -f 19residus.count -o 19residus --map
Index of first residue is: 1
Traceback (most recent call last):
  File "pbxplore/bin/PBstat", line 9, in <module>
    load_entry_point('pbxplore', 'console_scripts', 'PBstat')()
  File "pbxplore/scripts/PBstat.py", line 173, in pbstat_cli
    pbx.analysis.plot_map(file_fig_name, count, residue_min, residue_max)
  File "pbxplore/analysis/visualization.py", line 168, in plot_map
    fig.text(0.01 + text_shift * 2, 0.5, "PBs", rotation=90, weight="bold",
UnboundLocalError: local variable 'text_shift' referenced before assignment

HubLot commented 7 years ago

It's due to this code in visualization.py:

    # print "beta-strand", "coil" and "alpha-helix" text
    # only if there is more than 20 residues
    if nb_residues >= 20:
        text_shift = 0.0
        if nb_residues >= 100:
            text_shift = scaling_factor / 500
        [...]

    fig.text(0.01 + text_shift * 2, 0.5, "PBs", rotation=90, weight="bold",
             size='larger', transform=ax.transAxes)

Why do we print the secondary structure names only when the number of residues is at least 20? I mean, the y-axis size (the list of PBs) is always the same.

One solution is to remove this threshold and always print the secondary structure names. Another is to initialize text_shift before the first if, so that it is always defined when fig.text() is called.
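
To make the second option concrete, here is a minimal sketch that isolates the text_shift logic from the plotting code. The function names `label_x_buggy` and `label_x_fixed` are hypothetical and not part of PBxplore, and the `scaling_factor` values are made up for illustration. Only the branching structure mirrors the snippet quoted above.

```python
def label_x_buggy(nb_residues, scaling_factor):
    """Mirror the quoted logic: text_shift is only bound inside the branch."""
    if nb_residues >= 20:
        text_shift = 0.0
        if nb_residues >= 100:
            text_shift = scaling_factor / 500
        # (secondary structure labels would be drawn here)
    # With fewer than 20 residues, text_shift was never assigned,
    # so this line raises UnboundLocalError.
    return 0.01 + text_shift * 2


def label_x_fixed(nb_residues, scaling_factor):
    """Same logic, but text_shift is initialized before the first if."""
    text_shift = 0.0
    if nb_residues >= 20:
        if nb_residues >= 100:
            text_shift = scaling_factor / 500
        # (secondary structure labels would still be drawn only here)
    return 0.01 + text_shift * 2


if __name__ == "__main__":
    try:
        label_x_buggy(19, 1.0)
    except UnboundLocalError as err:
        print("buggy version failed:", err)
    print("fixed version:", label_x_fixed(19, 1.0))
```

This keeps the 20-residue threshold for the secondary structure labels and only guarantees that the x position of the "PBs" label can always be computed.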

pierrepo commented 7 years ago

The width of the y-axis margin (where the secondary structure names are printed) is proportional to the overall length of the graph, which is itself proportional to the number of amino acids. If the number of amino acids is too low (typically under 20), the y-axis margin is too narrow and the secondary structure names should not be printed. I am implementing a new way to draw this plot.

HubLot commented 7 years ago

Okay, thanks for the explanation. I'll check the PR.