i2bc / SURFMAP

[Bug]: No PDF results #23

Closed jptamby closed 1 week ago

jptamby commented 2 weeks ago

Operating System

Unix (e.g., Ubuntu 20.04)

Version

2.2.0, 2.1.0, 2.0.0

Python Version (optional)

3.8.10

Python Virtual Environment

venv/virtualenv/other

Execution Environment

Local environment after installation of all external dependencies

Bug Description

I ran the command:

surfmap -pdb Q99252.pdb -tomap kyte_doolittle -verbose 0

In the result directory "output_SURFMAP_Q99252_kyte_doolittle", I only see the file surfmap.log and a sub-directory "shells" containing Q99252.face, Q99252.vert, Q99252.xyzr and Q99252.csv (the CSV file is empty). No PDF files are generated.

Steps to Reproduce

surfmap -pdb Q99252.pdb -tomap kyte_doolittle -verbose 2 --keep

Relevant Log Output

kyte_doolittle
2024-06-26 19:23:22 — INFO — surfmap.lib.core.surfmap_from_pdb ........................ SURFACE MAPPING OF THE KYTE_DOOLITTLE PROPERTY
2024-06-26 19:23:22 — INFO — surfmap.lib.core.surfmap_from_pdb ........................ Step 1: computing a shell around the protein surface
2024-06-26 19:23:22 — DEBUG — surfmap.tools.compute_shell.run ........................... Convert pdb to xyzr format (expected MSMS input)
2024-06-26 19:23:22 — DEBUG — surfmap.tools.compute_shell.run ........................... Running MSMS command: /home/IPS2/jtamby/usr/venv-surfmap/lib/python3.8/site-packages/surfmap/utils/MSMS/msms -if output_SURFMAP_Q99252_kyte_doolittle/shells/Q99252.xyzr -of output_SURFMAP_Q99252_kyte_doolittle/shells/Q99252
2024-06-26 19:23:22 — DEBUG — surfmap.tools.compute_shell.run ........................... Convert MSMS .vert file into CSV format

SURFMAP: Projection of protein surface features on 2D map
Copyright (C) 2021  H. Schweke
Version: 2.1.0

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
SURFMAP relies on the use of MSMS (version 2.6.1) - Copyright (c) M. F. Sanner, 1994.

As a corollary of MSMS's license terms and conditions, by using SURFMAP,
you agree to acknowledge the use of the MSMS software that results in any 
published work, including scientific papers, films and videotapes by 
citing the following reference:

    Sanner, M.F., Spehner, J.-C., and Olson, A.J. (1996) Reduced surface:
    an efficient way to compute molecular surfaces. Biopolymers, Vol. 38.,
    (3), 305-320.

MSMS is free for academic use. For commercial use please contact
M. F. Sanner at sanner@scripps.edu.
--------------------------------------------------------------------------------
If you use SURFMAP for your research, please cite the following
papers:
    - Hugo Schweke, Marie-Hélène Mucchielli, Nicolas Chevrollier,
      Simon Gosset, Anne Lopes (2021) SURFMAP: a software for mapping
      in two dimensions protein surface features. bioRxiv
      2021.10.15.464543; doi: https://doi.org/10.1101/2021.10.15.464543

    - Sanner, M.F., Spehner, J.-C., and Olson, A.J. (1996) Reduced 
      surface: an efficient way to compute molecular surfaces. 
      Biopolymers, Vol. 38., (3), 305-320.
--------------------------------------------------------------------------------

Traceback (most recent call last):
  File "/home/IPS2/jtamby/usr/venv-surfmap/bin/surfmap", line 8, in <module>
    sys.exit(main())
  File "/home/IPS2/jtamby/usr/venv-surfmap/lib/python3.8/site-packages/surfmap/bin/surfmap.py", line 63, in main
    surfmap_local(params=params)
  File "/home/IPS2/jtamby/usr/venv-surfmap/lib/python3.8/site-packages/surfmap/bin/surfmap.py", line 18, in surfmap_local
    surfmap_from_pdb(params=params)
  File "/home/IPS2/jtamby/usr/venv-surfmap/lib/python3.8/site-packages/surfmap/lib/core.py", line 167, in surfmap_from_pdb
    csv_coords, shell = run_compute_shell(pdb_filename=params.pdbarg, out_dir=outdir_shell, extra_radius=extra_radius)
  File "/home/IPS2/jtamby/usr/venv-surfmap/lib/python3.8/site-packages/surfmap/tools/compute_shell.py", line 117, in run
    vert2csv(vertfile=outfile_vert, outfile=outfile_csv, skiplines=list(range(3)))
  File "/home/IPS2/jtamby/usr/venv-surfmap/lib/python3.8/site-packages/surfmap/tools/compute_shell.py", line 72, in vert2csv
    for i, line in enumerate(_readfile):
  File "/usr/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcd in position 177: invalid continuation byte
file: /home/IPS2/jtamby/usr/test_surfmap/Sce_random_data/Q99252.pdb

Additional context (optional)

Q99252.pdb 742K

nchenche commented 2 weeks ago

Hello @jptamby,

Thanks for posting your issue here.

The output is not as expected because an error occurred during the computation. The issue may be related to your pdb file, which might contain unexpected characters or an unexpected encoding.

Could you please upload your pdb file here?
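
One way to test the "unexpected characters or encoding" hypothesis locally is to look for the first byte in a file that is not valid UTF-8. The sketch below is only a diagnostic aid and is not part of SURFMAP; the function name `find_invalid_utf8` is made up for illustration. It can be pointed at the input pdb file or at any intermediate file in the "shells" directory.

```python
from pathlib import Path


def find_invalid_utf8(path, context=20):
    """Return (offset, nearby_bytes) for the first invalid UTF-8 byte, or None.

    Hypothetical helper for diagnosis only; it is not a SURFMAP function.
    """
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the offending byte in the file.
        start = max(0, err.start - context)
        return err.start, data[start:err.end + context]
    return None


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, find_invalid_utf8(name))
```

If the pdb file comes back clean while a file produced during the run does not, the stray byte was introduced during the computation rather than by the input.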

jptamby commented 2 weeks ago

@nchenche Here is the pdb file (I added a .txt extension to be able to upload it): Q99252.pdb.txt. Thank you.

nchenche commented 2 weeks ago

ERRATUM

Dear @jptamby,

SURFMAP actually handles multi-chain pdb files. I am really sorry, I answered too quickly in my previous comment (now crossed out below).

I have just run SURFMAP with your pdb file without any problem: the computation completed successfully (see below). So right now I don't understand the error you are facing, and I cannot reproduce it.

If you have a docker engine installed on your machine, could you try the --docker option?

surfmap -pdb Q99252.pdb -tomap kyte_doolittle --docker

Although this does not solve the problem, it would allow you to run your calculations.

Sorry again,

Bests

Successful SURFMAP run

(surfmap_2.2.0) nchenche@nchenche-laptop:~/surfmap_tests/issue_23$ ll
total 752
drwxrwxr-x 2 nchenche nchenche   4096 juin  27 21:37 ./
drwxrwxr-x 9 nchenche nchenche   4096 juin  27 21:22 ../
-rw-rw-r-- 1 nchenche nchenche 758889 juin  27 21:23 Q99252.pdb
(surfmap_2.2.0) nchenche@nchenche-laptop:~/surfmap_tests/issue_23$ surfmap -pdb Q99252.pdb -tomap kyte_doolittle

SURFMAP: Projection of protein surface features on 2D map
Copyright (C) 2021  H. Schweke
Version: 2.2.0

...
--------------------------------------------------------------------------------

SURFACE MAPPING OF THE KYTE_DOOLITTLE PROPERTY
Step 1: computing a shell around the protein surface
Step 2: computing the property values and/or assign it to the shell particles
Step 3: computing the 2D sinusoidal projection coordinates of each shell particle
Step 4: dividing the 2D projection into 72x36 cells and smoothing the values
Step 5: computing the 2D map

(surfmap_2.2.0) nchenche@nchenche-laptop:~/surfmap_tests/issue_23$ ll
total 756
drwxrwxr-x 3 nchenche nchenche   4096 juin  27 21:38 ./
drwxrwxr-x 9 nchenche nchenche   4096 juin  27 21:22 ../
drwxrwxr-x 4 nchenche nchenche   4096 juin  27 21:38 output_SURFMAP_Q99252_kyte_doolittle/
-rw-rw-r-- 1 nchenche nchenche 758889 juin  27 21:23 Q99252.pdb
(surfmap_2.2.0) nchenche@nchenche-laptop:~/surfmap_tests/issue_23$ tree output_SURFMAP_Q99252_kyte_doolittle/
output_SURFMAP_Q99252_kyte_doolittle/
├── maps
│   └── Q99252_kyte_doolittle_map.pdf
├── parameters.log
├── smoothed_matrices
│   └── Q99252_kyte_doolittle_smoothed_matrix.txt
└── surfmap.log

Previous answer (incorrect)

Thank you @jptamby, this should help to resolve the issue.

Indeed your pdb file is a multi-chain pdb file (it contains 2 chains: A and B), and SURFMAP will only work with single chain pdb files.

You can use the extract_interface command included in SURFMAP. It finds the interface residues between a given chain (or set of chains) and all the other chains of the input PDB structure, and outputs a new PDB file of the given chain(s) in the format expected by the -tomap binding_sites option. More information here.

So, to extract chain A, use extract_interface -pdb Q99252.pdb -chains A.

This will generate Q99252_chain-A_bs.pdb, a pdb file of chain A of your original pdb file in a format compatible with SURFMAP. The bfactor column will contain discrete values: 1 for atoms involved in an interface with a chain other than chain A, 0 otherwise. If you are interested, you can map them with the SURFMAP option -tomap binding_sites; if not, be aware that those values have no influence on property computations such as kyte_doolittle.

(surfmap.2.2.0) nche@nche-Precision-7920-Tower:~/surfmap_test/issues/23$ ll
total 752
drwxrwxr-x 2 nche nche   4096 juin  27 11:40 ./
drwxrwxr-x 3 nche nche   4096 juin  27 11:02 ../
-rw-rw-r-- 1 nche nche 758889 juin  27 10:43 Q99252.pdb
(surfmap.2.2.0) nche@nche-Precision-7920-Tower:~/surfmap_test/issues/23$ extract_interface -pdb Q99252.pdb -chains A
Chains found in the PDB Q99252: A, B
Target chain(s): A
Interface residues will be searched between the following chains: A and B
(surfmap.2.2.0) nche@nche-Precision-7920-Tower:~/surfmap_test/issues/23$ ll
total 1088
drwxrwxr-x 2 nche nche   4096 juin  27 11:40 ./
drwxrwxr-x 3 nche nche   4096 juin  27 11:02 ../
-rw-rw-r-- 1 nche nche 339593 juin  27 11:40 Q99252_chain-A_bs.pdb
-rw-rw-r-- 1 nche nche    456 juin  27 11:40 Q99252_chain-A_interface.txt
-rw-rw-r-- 1 nche nche 758889 juin  27 10:43 Q99252.pdb
(surfmap.2.2.0) nche@nche-Precision-7920-Tower:~/surfmap_test/issues/23$ surfmap -pdb Q99252_chain-A_bs.pdb -tomap kyte_doolittle

Please let me know if this resolved your problem.

Bests

jptamby commented 1 week ago

Thank you @nchenche

Yes, SURFMAP does handle multi-chain pdb files. I eventually found that my problem comes from the length of the pdb file name. When I try to process a file named Q99252_P32891_unrelaxed_rank_001_alphafold2_multimer_v3_model_2_seed_000.pdb, SURFMAP does not produce the correct results (see the error in my previous message). When I rename the same file to Q99252_P32891_rank_001_model_2_seed_000.pdb, SURFMAP gives complete results, including the pdf file, with no error. From the tests I ran, a file name shorter than 50 characters appears to be fine. So my question is: does SURFMAP have a limit on the length of pdb file names?
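
Until the cause is understood, the renaming workaround described above can be scripted. The following is a minimal sketch, not a SURFMAP feature: the helper name `short_copy` and the 40-character cap are assumptions, with the cap chosen only to stay under the roughly 50 characters observed in these tests. It copies the pdb file under a truncated name and leaves the original untouched.

```python
import shutil
import tempfile
from pathlib import Path

# Assumption: 40 is an arbitrary cap below the ~50 characters reported to work.
MAX_STEM = 40


def short_copy(pdb_path, workdir=None, max_stem=MAX_STEM):
    """Return a path to the same structure under a short file name.

    Hypothetical helper; files whose name is already short are returned as-is.
    """
    src = Path(pdb_path)
    if len(src.stem) <= max_stem:
        return src
    if workdir is None:
        workdir = tempfile.mkdtemp(prefix="surfmap_")
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    # Keep the start of the name (the identifiers) and the original extension.
    dst = workdir / (src.stem[:max_stem] + src.suffix)
    shutil.copyfile(src, dst)
    return dst
```

The returned path can then be passed to `surfmap -pdb ... -tomap kyte_doolittle` as in the commands above. Note that truncation can map two different long names onto the same short one, so a batch run would need an extra uniqueness check.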

nchenche commented 1 week ago

Thank you @jptamby for your feedback.

To my knowledge, the file name length should not be a problem per se for the SURFMAP scripts, but apparently it might be for internal Python modules such as codecs.py. That is good to know; I'll run some tests on my side. Thank you again!
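
For context, the traceback in the original report ends in codecs.py while vert2csv is iterating over the .vert file written by MSMS, with the first three lines treated as a header (skiplines=list(range(3))). One reading, not confirmed in this thread, is that the undecodable byte sits in that MSMS output rather than in the pdb file. Under that assumption, a reader that replaces undecodable bytes instead of raising would get past the header. The sketch below is not the SURFMAP patch; `read_vert_coords` is a hypothetical name, and taking the first three columns as x, y, z is an assumption about the .vert layout.

```python
def read_vert_coords(vertfile, n_header=3):
    """Read x, y, z from a .vert-like file, tolerating stray non-UTF-8 bytes.

    Hypothetical sketch: assumes n_header header lines followed by
    whitespace-separated rows whose first three fields are coordinates.
    """
    coords = []
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
    with open(vertfile, encoding="utf-8", errors="replace") as handle:
        for i, line in enumerate(handle):
            if i < n_header:
                continue  # header lines are skipped, garbled or not
            fields = line.split()
            if len(fields) < 3:
                continue  # ignore blank or truncated rows
            coords.append(tuple(float(value) for value in fields[:3]))
    return coords
```

This only hides the symptom: if the header is corrupted, the rest of the file should still be checked before its coordinates are trusted.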

Bests

nchenche commented 1 week ago

I have created a new issue from this one (here) so that people can find it more easily.

I am going to close this issue and start patching the error.