Degiacomi-Lab / molearn

protein conformational spaces meet machine learning
https://degiacomi.org/software/molearn/
GNU General Public License v3.0

Could not recognize the alanine dipeptide molecule because of ALA and NME residues #1

Closed anandojha closed 2 years ago

anandojha commented 3 years ago

Here is the traceback:

Traceback (most recent call last):
  File "example_script.py", line 36, in <module>
    lf = Auto_potential(frame=dataset[0]*stdval, pdb_atom_names=atom_names, method = method, device=torch.device('cpu'))
  File "/home/aaojha/molearn/molearn/loss_functions.py", line 54, in __init__
    self._roll_init(frame, pdb_atom_names, NB=NB, fix_h=fix_h,alt_vdw=alt_vdw)
  File "/home/aaojha/molearn/molearn/loss_functions.py", line 78, in _roll_init
    q1q2, q1q2_14 )=get_convolutions(frame, pdb_atom_names, fix_slice_method=True, fix_h=fix_h,alt_vdw=alt_vdw)
  File "/home/aaojha/molearn/molearn/protein_handler.py", line 621, in get_convolutions
    atom_names = [[amber_atoms[res][atom],res] for atom, res in pdb_atom_names ]
  File "/home/aaojha/molearn/molearn/protein_handler.py", line 621, in <listcomp>
    atom_names = [[amber_atoms[res][atom],res] for atom, res in pdb_atom_names ]
KeyError: 'ACE'

This is the PDB:

ATOM 1 H1 ACE A 1 -9.671 17.403 2.765 1.00 0.00 H
ATOM 2 CH3 ACE A 1 -8.609 17.551 2.964 1.00 0.00 C
ATOM 3 H2 ACE A 1 -8.541 18.379 3.670 1.00 0.00 H
ATOM 4 H3 ACE A 1 -8.085 17.733 2.026 1.00 0.00 H
ATOM 5 C ACE A 1 -8.032 16.365 3.686 1.00 0.00 C
ATOM 6 O ACE A 1 -6.896 15.996 3.561 1.00 0.00 O
ATOM 7 N ALA A 2 -8.820 15.879 4.572 1.00 0.00 N
ATOM 8 H ALA A 2 -9.788 16.165 4.539 1.00 0.00 H
ATOM 9 CA ALA A 2 -8.352 14.859 5.597 1.00 0.00 C
ATOM 10 HA ALA A 2 -7.439 15.155 6.113 1.00 0.00 H
ATOM 11 CB ALA A 2 -9.422 14.633 6.715 1.00 0.00 C
ATOM 12 HB1 ALA A 2 -9.734 15.562 7.191 1.00 0.00 H
ATOM 13 HB2 ALA A 2 -10.313 14.399 6.133 1.00 0.00 H
ATOM 14 HB3 ALA A 2 -9.187 13.850 7.437 1.00 0.00 H
ATOM 15 C ALA A 2 -8.027 13.499 4.921 1.00 0.00 C
ATOM 16 O ALA A 2 -8.577 13.191 3.850 1.00 0.00 O
ATOM 17 N NME A 3 -7.119 12.707 5.486 1.00 0.00 N
ATOM 18 H NME A 3 -6.626 13.089 6.280 1.00 0.00 H
ATOM 19 C NME A 3 -6.638 11.489 4.890 1.00 0.00 C
ATOM 20 H1 NME A 3 -7.415 10.725 4.897 1.00 0.00 H
ATOM 21 H2 NME A 3 -6.367 11.721 3.860 1.00 0.00 H
ATOM 22 H3 NME A 3 -5.779 11.148 5.468 1.00 0.00 H
TER 23 NME A 3
END

degiacom commented 2 years ago

molearn was not explicitly designed to handle such short chains (the smallest system we have benchmarked so far is alpha-B crystallin). Because our aim was to handle proteins with hundreds of amino acids, we trained the neural network on a subset of atoms only (CA, C, CB, N, O). For this reason, parameters for terminal residues and capping groups are not imported, so ACE has no entry in the parameter table and the lookup fails with KeyError: 'ACE'.
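To illustrate the atom subset mentioned above, here is a minimal sketch (not molearn's own code) that reads ATOM records from a PDB file and keeps only CA, C, CB, N and O. The function name select_subset and the choice to drop the ACE and NME caps outright are assumptions made for illustration. The parser assumes standard fixed-width PDB columns, so it will not work on the whitespace-collapsed listing as displayed in this thread.

```python
# Atom names the maintainers report training on (taken from the comment above).
BACKBONE_SUBSET = {"CA", "C", "CB", "N", "O"}
# Capping groups present in the posted PDB; dropping them is an assumption.
CAP_RESIDUES = {"ACE", "NME"}


def select_subset(pdb_lines, keep_atoms=BACKBONE_SUBSET, skip_residues=CAP_RESIDUES):
    """Return (atom_name, residue_name) pairs for the atoms that are kept.

    Hypothetical helper. Reads fixed PDB columns: atom name in columns 13-16,
    residue name in columns 18-20 (1-based).
    """
    selected = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue  # skip TER, END, REMARK and other records
        atom = line[12:16].strip()
        res = line[17:20].strip()
        if res in skip_residues or atom not in keep_atoms:
            continue
        selected.append((atom, res))
    return selected
```

For alanine dipeptide this leaves only the five heavy atoms of the single ALA residue, which shows how far this system is from the large proteins the model was built for.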

We read amino acid parameters from the standard Amber files amino12.lib, parm10.dat and frcmod.ff14SB. Parameters for terminal residues are in aminoct12.lib (C-terminal) and aminont12.lib (N-terminal). The easiest way to bypass this problem is to add the required parameters to amino12.lib, parm10.dat or frcmod.ff14SB. Work to accommodate all-atom representations is ongoing.
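Before editing the Amber files, it may help to list exactly which residues and atoms lack parameters. The sketch below mirrors the amber_atoms[res][atom] lookup shown in the traceback, but reports every missing entry instead of stopping at the first KeyError. The function find_missing_parameters and the toy table are hypothetical; the real table is built by molearn from the Amber files, and its exact layout is assumed here to be a residue-to-atom mapping.

```python
def find_missing_parameters(pdb_atom_names, amber_atoms):
    """Report residues and atoms with no entry in the parameter table.

    pdb_atom_names: iterable of (atom_name, residue_name) pairs, in the
        order used by the lookup shown in the traceback.
    amber_atoms: mapping residue_name -> {atom_name: atom_type}
        (assumed layout, for illustration only).
    """
    missing_residues = set()
    missing_atoms = set()
    for atom, res in pdb_atom_names:
        if res not in amber_atoms:
            missing_residues.add(res)
        elif atom not in amber_atoms[res]:
            missing_atoms.add((res, atom))
    return sorted(missing_residues), sorted(missing_atoms)


# Toy parameter table with placeholder atom types (not read from amino12.lib).
toy_amber_atoms = {
    "ALA": {"N": "t_N", "CA": "t_CA", "C": "t_C", "O": "t_O", "CB": "t_CB"},
}
atom_names = [("CH3", "ACE"), ("N", "ALA"), ("CA", "ALA"), ("HA", "ALA"), ("N", "NME")]

missing_res, missing_at = find_missing_parameters(atom_names, toy_amber_atoms)
print(missing_res)  # ['ACE', 'NME']
print(missing_at)   # [('ALA', 'HA')]
```

Any residue reported here would need entries added to the parameter files before the loss function can be constructed.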