tiantz17 / PocketAnchor

Learning Structure-based Pocket Representations for Protein-Ligand Interaction Prediction
Apache License 2.0

What is atom_in_aa? #2

Closed: hi7049 closed this issue 1 year ago

hi7049 commented 1 year ago

Hi, congratulations on your great work.

In the PDBbase.py file I found this line: atom_in_aa = torch.zeros(num_atom, 93). What does atom_in_aa mean? Did you compute it with PyMOL? Would you be willing to share your PyMOL feature engineering code?

tiantz17 commented 1 year ago

Hi @hi7049, thanks for your interest.

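The code for data preprocessing, model training, and evaluation will be released in a few days.

The "atom_in_aa" denotes the one-hot encoding of each atom's type with respect to its amino acid type. The five backbone atoms are shared across residues, and every side-chain atom gets a residue-specific entry, which gives the 93 columns: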

AA_TO_ATOM = {'GLY': ['OXT', 'C', 'N', 'O', 'CA'],
              'LEU': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG', 'CD1', 'CD2'],
              'ALA': ['OXT', 'C', 'N', 'O', 'CA', 'CB'],
              'HIS': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG', 'CD2', 'CE1', 'ND1', 'NE2'],
              'PHE': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG', 'CD1', 'CD2', 'CE1', 'CE2', 'CZ'],
              'TRP': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG', 'CD1', 'CD2', 'CE2', 'CE3', 'CH2', 'CZ2', 'CZ3', 'NE1'],
              'TYR': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG', 'CD1', 'CD2', 'CE1', 'CE2', 'CZ', 'OH'],
              'ASN': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG', 'ND2', 'OD1'],
              'VAL': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG1', 'CG2'],
              'GLU': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG', 'CD', 'OE1', 'OE2'],
              'SER': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'OG'],
              'ASP': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG', 'OD1', 'OD2'],
              'PRO': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG', 'CD'],
              'MET': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG', 'CE', 'SD'],
              'ILE': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG1', 'CG2', 'CD1'],
              'THR': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG2', 'OG1'],
              'LYS': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG', 'CD', 'CE', 'NZ'],
              'ARG': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG', 'CD', 'CZ', 'NE', 'NH1', 'NH2'],
              'GLN': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'CG', 'CD', 'NE2', 'OE1'],
              'CYS': ['OXT', 'C', 'N', 'O', 'CA', 'CB', 'SG'],
              'HETATM': ['X']}
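# Backbone atoms (plus the terminal OXT) share one entry each across all residues.
SHARED_ATOMS = ['OXT', 'C', 'N', 'O', 'CA']
# Every other atom gets a residue-specific name such as 'LEU_CB'.
AA_ATOM_LIST = []
for aa, atom_list in AA_TO_ATOM.items():
    for atom in atom_list:
        if atom not in SHARED_ATOMS:
            AA_ATOM_LIST.append(aa + '_' + atom)
# 5 shared + 88 residue-specific entries = 93, the width of atom_in_aa.
AA_ATOM_LIST = SHARED_ATOMS + AA_ATOM_LIST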
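
For readers who want to see how a 93-entry vocabulary like this could be turned into the rows of atom_in_aa, here is a minimal sketch. It is not the released PocketAnchor preprocessing code. The names SIDE_CHAIN, atom_key, and encode_atoms are hypothetical, plain Python lists stand in for the torch tensor, and the fallback of unknown residues or atoms to the 'HETATM_X' entry is an assumption that the thread does not confirm.

```python
# Sketch only: rebuilds the 93-entry vocabulary and one-hot encodes atoms.
BACKBONE = ['OXT', 'C', 'N', 'O', 'CA']  # same as SHARED_ATOMS in the reply

# Side-chain atoms per residue, in the same order as AA_TO_ATOM in the reply.
SIDE_CHAIN = {
    'GLY': [],
    'LEU': ['CB', 'CG', 'CD1', 'CD2'],
    'ALA': ['CB'],
    'HIS': ['CB', 'CG', 'CD2', 'CE1', 'ND1', 'NE2'],
    'PHE': ['CB', 'CG', 'CD1', 'CD2', 'CE1', 'CE2', 'CZ'],
    'TRP': ['CB', 'CG', 'CD1', 'CD2', 'CE2', 'CE3', 'CH2', 'CZ2', 'CZ3', 'NE1'],
    'TYR': ['CB', 'CG', 'CD1', 'CD2', 'CE1', 'CE2', 'CZ', 'OH'],
    'ASN': ['CB', 'CG', 'ND2', 'OD1'],
    'VAL': ['CB', 'CG1', 'CG2'],
    'GLU': ['CB', 'CG', 'CD', 'OE1', 'OE2'],
    'SER': ['CB', 'OG'],
    'ASP': ['CB', 'CG', 'OD1', 'OD2'],
    'PRO': ['CB', 'CG', 'CD'],
    'MET': ['CB', 'CG', 'CE', 'SD'],
    'ILE': ['CB', 'CG1', 'CG2', 'CD1'],
    'THR': ['CB', 'CG2', 'OG1'],
    'LYS': ['CB', 'CG', 'CD', 'CE', 'NZ'],
    'ARG': ['CB', 'CG', 'CD', 'CZ', 'NE', 'NH1', 'NH2'],
    'GLN': ['CB', 'CG', 'CD', 'NE2', 'OE1'],
    'CYS': ['CB', 'SG'],
    'HETATM': ['X'],
}

AA_ATOM_LIST = BACKBONE + [
    f'{aa}_{atom}' for aa, atoms in SIDE_CHAIN.items() for atom in atoms
]
ATOM_INDEX = {name: i for i, name in enumerate(AA_ATOM_LIST)}


def atom_key(res, atom):
    """Map a (residue name, atom name) pair to its vocabulary entry."""
    if res in SIDE_CHAIN and res != 'HETATM':
        if atom in BACKBONE:
            return atom                # shared backbone entry
        if atom in SIDE_CHAIN[res]:
            return f'{res}_{atom}'     # residue-specific entry
    return 'HETATM_X'                  # assumed catch-all for everything else


def encode_atoms(atoms):
    """Return one 93-wide one-hot row per (residue name, atom name) pair."""
    rows = []
    for res, atom in atoms:
        row = [0.0] * len(AA_ATOM_LIST)
        row[ATOM_INDEX[atom_key(res, atom)]] = 1.0
        rows.append(row)
    return rows


atom_in_aa = encode_atoms([('GLY', 'CA'), ('LEU', 'CB'), ('HOH', 'O')])
```

With torch available, the same rows could be written into a tensor created by torch.zeros(num_atom, 93), as in the line quoted in the question.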

ljpadam commented 1 year ago

Thanks for your reply. Looking forward to the code release.