LehighInfolab / Infolab-Utils

GNU General Public License v3.0

Create a json file of the interfacial hydrogen bonds for all PPIs #1

Open G-Armstrong opened 1 year ago

G-Armstrong commented 1 year ago

The graph_data_set.ipynb notebook includes code that identifies the ontological classification of every hydrogen bond at the interface between two interacting protein halves in the SKEMPIv2 data set. The notebook has a function called pretty_print() that accepts a dictionary called hashmap as input. The keys in hashmap represent the acceptor and donor cones of protein half 1, while the values for each key are the acceptor and donor cones of protein half 2 that intersect and face that cone of half 1.

Hydrogen bonds can only form between acceptor and donor pairs that are oriented towards one another and fall within the 4.6 Å hydrogen bond cutoff (i.e. 2 × 2.3 Å = 4.6 Å). Luckily, the hashmap input already contains the acceptor/donor pairs that meet these criteria, but it also contains acceptor-acceptor and donor-donor entries and key-value pairs that could not possibly interact. Therefore, the pretty_print() function filters hashmap down to only the acceptor-donor and donor-acceptor pairs, prints them neatly to the console, and then appends them to a new dictionary called directed_graph.
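To make the filtering step concrete, here is a minimal sketch of the idea. It is not the notebook's pretty_print() code. It assumes, purely for illustration, that each cone is a (role, label) tuple whose role is "ACCEPTOR" or "DONOR"; the real hashmap layout may differ, and filter_bonding_pairs, toy_hashmap and the labels are hypothetical names.

```python
from typing import Dict, List, Tuple

# ASSUMPTION: a cone is a (role, label) tuple, role being "ACCEPTOR" or "DONOR".
# The notebook's actual cone representation may be different.
Cone = Tuple[str, str]

# A hydrogen bond needs one acceptor and one donor, so each role
# can only pair with its complement.
COMPLEMENT = {"ACCEPTOR": "DONOR", "DONOR": "ACCEPTOR"}


def filter_bonding_pairs(hashmap: Dict[Cone, List[Cone]]) -> Dict[Cone, List[Cone]]:
    """Keep only acceptor-donor and donor-acceptor pairs (hypothetical helper)."""
    directed_graph: Dict[Cone, List[Cone]] = {}
    for cone1, partners in hashmap.items():
        wanted_role = COMPLEMENT.get(cone1[0])  # None for an unrecognized role
        kept = [cone2 for cone2 in partners if cone2[0] == wanted_role]
        if kept:  # drop keys that have no complementary partner left
            directed_graph[cone1] = kept
    return directed_graph


# Toy input with made-up labels, not real SKEMPIv2 data.
toy_hashmap = {
    ("ACCEPTOR", "half1_a1"): [("DONOR", "half2_d1"), ("ACCEPTOR", "half2_a1")],
    ("DONOR", "half1_d1"): [("DONOR", "half2_d2")],
    ("DONOR", "half1_d2"): [("ACCEPTOR", "half2_a2")],
}
toy_graph = filter_bonding_pairs(toy_hashmap)
```

In this toy case the acceptor-acceptor partner is dropped from the first key, and the donor-donor key disappears entirely because nothing complementary remains.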

The task is to understand how this code works and to build a JSON file that can be queried for the network of hydrogen bonds present at the interface of any given wild type (WT) or mutant type (MT) PDB. The JSON file should take a nested dictionary format:

Image
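Since the intended layout is only visible in the image, the following is a rough sketch of one possible nested layout, not the required schema. It assumes a top-level key per PDB, a second level for "WT" or "MT", and cone-to-partners mappings underneath. The names cone_key, add_interface and "example_pdb", as well as the (role, label) cone tuples, are hypothetical.

```python
import json


def cone_key(cone):
    """Flatten a (role, label) cone tuple into a string, since JSON keys must be strings."""
    role, label = cone
    return f"{role}:{label}"


def add_interface(database, pdb_id, variant, directed_graph):
    """Store one interface graph under database[pdb_id][variant] (hypothetical layout)."""
    entry = {
        cone_key(cone1): [cone_key(cone2) for cone2 in partners]
        for cone1, partners in directed_graph.items()
    }
    database.setdefault(pdb_id, {})[variant] = entry


# Toy data with made-up labels; "example_pdb" is a placeholder, not a real PDB ID.
database = {}
add_interface(database, "example_pdb", "WT",
              {("DONOR", "half1_d2"): [("ACCEPTOR", "half2_a2")]})
add_interface(database, "example_pdb", "MT", {})

# Round-trip through JSON text, as would happen when writing and re-reading a file.
serialized = json.dumps(database, indent=2, sort_keys=True)
restored = json.loads(serialized)

# Query: the hydrogen bond network for the wild type interface of one PDB.
wt_bonds = restored["example_pdb"]["WT"]
```

The string conversion in cone_key is the main design point: json.dumps raises a TypeError on tuple keys, so any tuple-keyed dictionary has to be flattened before it is written out.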

G-Armstrong commented 1 year ago

Image

The image above shows what the data within directed_graph looks like for each workIndex. Notice the DONOR_HYDRO term in each donor entry. This term describes the hydrogen atom participating in the hydrogen bond. DONOR, on the other hand, is the electronegative antecedent atom to which the DONOR_HYDRO atom is covalently bound.