pierrepo / PBxplore

A suite of tools to explore protein structures with Protein Blocks :snake:
https://pbxplore.readthedocs.org/en/latest/
MIT License

Fix issue with weblogo and large counts #33

Closed jbarnoud closed 9 years ago

jbarnoud commented 9 years ago

If the first count on a line of PBcount output is greater than 9999, the count takes up 5 characters. As a result, the first count is written immediately after the residue number, with no separator, in the transfac file generated for weblogo. The weblogo transfac parser then reads the residue number and the first count as a single field, and the number of fields read is wrong.

This pull request adds a single space after the residue number when writing a transfac file. As a consequence, the residue number and the first count can no longer be read together as a single field, whatever the length of the first count. The pull request also moves the generation of transfac files from the main body of PBstat to a function in PBlib, and adds a test for this function in test_functions.
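To illustrate the problem and the fix, here is a minimal sketch. It does not reproduce the actual PBlib code: the function name `transfac_row`, the `pad` parameter, and the field widths (5-character zero-padded residue number, 5-character right-aligned counts) are assumptions made for the example.

```python
def transfac_row(residue, counts, pad=""):
    """Format one count row (hypothetical layout, not the PBlib function).

    The residue number is zero-padded to 5 characters, each count is
    right-aligned in 5 characters, and counts are joined by one space.
    `pad` is whatever is written between the residue number and the
    first count: nothing before the fix, one space after it.
    """
    fields = " ".join(f"{count:>5d}" for count in counts)
    return f"{residue:05d}{pad}{fields}"


# Small counts: the leading blanks of the first count act as a separator.
small = transfac_row(1, [12, 0, 7])

# First count above 9999: it fills its 5 characters and touches the
# residue number, so whitespace splitting sees one field too few.
large = transfac_row(1, [12345, 0, 7])

# With the extra space, the fields stay separate whatever the count size.
fixed = transfac_row(1, [12345, 0, 7], pad=" ")

for row in (small, large, fixed):
    print(repr(row), len(row.split()), "fields")
```

Splitting on whitespace, as a field-based parser would, gives 4 fields for the small counts, 3 for the large first count without the separator, and 4 again once the space is added.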

The change has been tested with the files provided by @sleonard0386 in issue #27 and with weblogo 3.4 (2014-06-02).

Because the fields are already space-separated in the output of PBcount, the other fields are not affected when their values are large.

This commit may fix issue #27, although the error message Sylvain reported seems to come from an earlier version of weblogo.