LPDI-EPFL / masif

MaSIF: Molecular surface interaction fingerprints. Geometric deep learning to decipher patterns in molecular surfaces.
Apache License 2.0

Error while running on a predicted structure #10

Closed: skyungyong closed this issue 4 years ago

skyungyong commented 4 years ago

Hello,

I encountered an error while running data_prepare_one.sh on a predicted protein structure.

The script ran fine on a downloaded PDB file:

Singularity masif_latest:/global/scratch/software/MaSIF/masif/data/masif_site> ./data_prepare_one.sh --file data_preparation/00-raw_pdbs/4ZQK.pdb 4ZQK_A

Running masif site on data_preparation/00-raw_pdbs/4ZQK.pdb
cp: 'data_preparation/00-raw_pdbs/4ZQK.pdb' and 'data_preparation/00-raw_pdbs/4ZQK.pdb' are the same file
Empty
Removing degenerated triangles
Removing degenerated triangles
4ZQK_A
Reading data from input ply surface files.
Dijkstra took 3.65s
Only MDS time: 15.50s
Full loop time: 24.70s
MDS took 24.70s

It also ran fine on a predicted structure I downloaded online:

Singularity masif_latest:/global/scratch/software/MaSIF/masif/data/masif_site> ./data_prepare_one.sh --file data_preparation/00-raw_pdbs/AvrPita.pdb AvrPita_A

Running masif site on data_preparation/00-raw_pdbs/AvrPita.pdb
cp: 'data_preparation/00-raw_pdbs/AvrPita.pdb' and 'data_preparation/00-raw_pdbs/AvrPita.pdb' are the same file
Empty
Removing degenerated triangles
Removing degenerated triangles
AvrPita_A
Reading data from input ply surface files.
Dijkstra took 7.01s
Only MDS time: 29.31s
Full loop time: 46.89s
MDS took 46.89s

However, it failed on this structure, which I predicted on my local machine:

Singularity masif_latest:/global/scratch/software/MaSIF/masif/data/masif_site> ./data_prepare_one.sh --file data_preparation/00-raw_pdbs/MGG-01993-ITASSER.pdb MGG-01993-ITASSER_A

Running masif site on data_preparation/00-raw_pdbs/MGG-01993-ITASSER.pdb
cp: 'data_preparation/00-raw_pdbs/MGG-01993-ITASSER.pdb' and 'data_preparation/00-raw_pdbs/MGG-01993-ITASSER.pdb' are the same file
--Call--
/usr/local/lib/python3.6/subprocess.py(758)__del__()
    756             self.wait()
    757
--> 758     def __del__(self, _maxsize=sys.maxsize, _warn=warnings.warn):
    759         if not self._child_created:
    760             # We didn't get to successfully create a child process.

ipdb>

I am not sure what is causing this error. Thank you in advance!

> head AvrPita.pdb
ATOM 1 N MET A 1 50.404 53.465 89.261 1.00 13.70
ATOM 2 CA MET A 1 49.060 53.953 88.970 1.00 13.70
ATOM 3 HA MET A 1 48.349 53.550 89.692 1.00 13.70
ATOM 4 CB MET A 1 49.107 55.497 89.071 1.00 13.70
ATOM 5 HB1 MET A 1 49.608 55.899 88.190 1.00 13.70
ATOM 6 HB2 MET A 1 49.694 55.787 89.940 1.00 13.70
ATOM 7 CG MET A 1 47.733 56.158 89.212 1.00 13.70
ATOM 8 HG1 MET A 1 47.334 55.866 90.163 1.00 13.70
ATOM 9 HG2 MET A 1 47.047 55.782 88.460 1.00 13.70
ATOM 10 SD MET A 1 47.708 57.971 89.163 1.00 13.70

> head MGG-011730-ITASSER.pdb
ATOM 1 H LEU 1 -30.724 18.366 -0.112 1.00 4.24
ATOM 2 N LEU 1 -30.717 19.328 -0.332 1.00 4.24
ATOM 3 CA LEU 1 -31.127 20.240 0.732 1.00 4.24
ATOM 4 C LEU 1 -30.216 20.107 1.947 1.00 4.24
ATOM 5 O LEU 1 -29.360 19.226 1.982 1.00 4.24
ATOM 6 CB LEU 1 -32.579 19.966 1.133 1.00 4.24
ATOM 7 CG LEU 1 -33.577 20.301 0.018 1.00 4.24
ATOM 8 CD1 LEU 1 -34.987 19.880 0.429 1.00 4.24
ATOM 9 CD2 LEU 1 -33.575 21.804 -0.259 1.00 4.24
ATOM 10 N PRO 2 -30.291 20.947 3.077 1.00 2.47

> tail AvrPita.pdb
ATOM 3585 H CYS A 224 76.364 66.692 47.328 1.00 5.19
ATOM 3586 CA CYS A 224 74.809 66.366 45.907 1.00 5.19
ATOM 3587 HA CYS A 224 74.247 65.434 45.807 1.00 5.19
ATOM 3588 CB CYS A 224 73.982 67.329 46.770 1.00 5.19
ATOM 3589 HB1 CYS A 224 73.157 67.727 46.174 1.00 5.19
ATOM 3590 HB2 CYS A 224 74.606 68.173 47.067 1.00 5.19
ATOM 3591 SG CYS A 224 73.262 66.560 48.246 1.00 5.19
ATOM 3592 C CYS A 224 75.018 66.922 44.489 1.00 5.19
ATOM 3593 O CYS A 224 76.123 67.110 43.980 1.00 5.19
TER

> tail MGG-011730-ITASSER.pdb
ATOM 661 OG1 THR 82 2.127 -6.129 7.772 1.00 2.03
ATOM 662 CG2 THR 82 0.326 -5.954 6.201 1.00 2.03
ATOM 663 N PRO 83 0.144 -8.497 9.877 1.00 3.59
ATOM 664 CA PRO 83 0.505 -9.336 10.943 1.00 3.59
ATOM 665 C PRO 83 0.584 -10.638 10.319 1.00 3.59
ATOM 666 O PRO 83 0.347 -10.751 9.103 1.00 3.59
ATOM 667 CB PRO 83 -0.615 -9.288 11.984 1.00 3.59
ATOM 668 CG PRO 83 -1.888 -9.067 11.196 1.00 3.59
ATOM 669 CD PRO 83 -1.811 -9.989 9.990 1.00 3.59
TER

pablogainza commented 4 years ago

Hi!

I might need some more information. Could you send me the pdb file?

(Sent by email in reply to skyungyong's post of Tue, Jun 23, 2020, 11:41 PM.)


pablogainza commented 4 years ago

So the problem is that you have no chain name. Very easy fix:

Replace the blank at column 22 on each line of your PDB file with the letter 'A', for chain A. In vim, open the file and execute: :%s/^\(.\{21}\) /\1A/

Then just run it as: ./data_prepare_one.sh --file /MGG-011730-ITASSER.pdb 1MGG_A
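
For anyone who prefers not to edit the file by hand, the same fix can be sketched in Python. This is only an illustration and not part of MaSIF: the function names (missing_chain_count, set_chain_id, fix_file) are made up here, and it assumes the file follows the fixed-column PDB layout, where the chain identifier is column 22 of ATOM and HETATM records. Unlike the vim substitution, it leaves every other record type untouched.

```python
from pathlib import Path

# 0-based index of PDB column 22, which holds the chain identifier.
CHAIN_COL = 21
# Record types that carry a chain identifier in this sketch.
COORD_RECORDS = ("ATOM", "HETATM")


def missing_chain_count(lines):
    """Count ATOM/HETATM records whose chain identifier column is blank."""
    return sum(
        1
        for line in lines
        if line.startswith(COORD_RECORDS)
        and len(line) > CHAIN_COL
        and line[CHAIN_COL] == " "
    )


def set_chain_id(lines, chain="A"):
    """Return a copy of lines with blank chain identifiers set to chain.

    Only ATOM/HETATM records are changed; records that already have a
    chain identifier, and all other record types, are passed through.
    """
    if len(chain) != 1:
        raise ValueError("a PDB chain identifier is a single character")
    fixed = []
    for line in lines:
        if (
            line.startswith(COORD_RECORDS)
            and len(line) > CHAIN_COL
            and line[CHAIN_COL] == " "
        ):
            # Overwrite exactly one character so column alignment is kept.
            line = line[:CHAIN_COL] + chain + line[CHAIN_COL + 1:]
        fixed.append(line)
    return fixed


def fix_file(src, dst, chain="A"):
    """Read src, fill in blank chain identifiers, and write the result to dst."""
    lines = Path(src).read_text().splitlines(keepends=True)
    Path(dst).write_text("".join(set_chain_id(lines, chain)))
```

Calling fix_file("MGG-011730-ITASSER.pdb", "MGG-011730-ITASSER_A.pdb") (the output name is just an example) writes a copy with chain A filled in, and missing_chain_count can be run on the lines of a file beforehand to check whether it has this problem at all.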