Dangwei-dw closed 1 year ago
PPI and GO
Hi, for the PPI and GO datasets, the data are represented by node names before the graph is constructed (they look like indices because they are integers, but they are node names). This is explained in the README: https://github.com/sezinata/MANE/tree/master/data/test_data/Alzheimer After the PPI and GO graphs are constructed, the node names have to be converted to indices.
From Node Classification (without attention), lines 248 and 249:
node2idx = {n: idx for (idx, n) in enumerate(common_nodes)}  # used for the initial mapping to indices
idx2node = {idx: n for (idx, n) in enumerate(common_nodes)}  # used after the model to convert back to node names
MANE works on the common nodes, which is why this conversion is needed.
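As a rough illustration of the name-to-index conversion described above, here is a minimal sketch. It is not the repository's preprocessing code: the toy edge lists, the helper names `nodes_of` and `to_index_edges`, and the assumption that "common nodes" means nodes present in both networks are all made up for the example. Only the `node2idx` and `idx2node` lines follow the snippet quoted from the script.

```python
# Hypothetical toy edge lists. The integers are node names, not indices.
ppi_edges = [(101, 205), (205, 310), (310, 999)]
go_edges = [(205, 101), (310, 101), (777, 205)]


def nodes_of(edges):
    """Collect every node name that appears in an edge list."""
    return {n for edge in edges for n in edge}


# Assumption: the common nodes are those present in both networks.
# Sorting makes the name-to-index assignment reproducible.
common_nodes = sorted(nodes_of(ppi_edges) & nodes_of(go_edges))

# Same mapping as in the quoted script lines.
node2idx = {n: idx for (idx, n) in enumerate(common_nodes)}  # names -> indices
idx2node = {idx: n for (idx, n) in enumerate(common_nodes)}  # indices -> names


def to_index_edges(edges, node2idx):
    """Keep only edges between common nodes and rewrite them with indices."""
    return [
        (node2idx[u], node2idx[v])
        for (u, v) in edges
        if u in node2idx and v in node2idx
    ]


ppi_idx_edges = to_index_edges(ppi_edges, node2idx)
go_idx_edges = to_index_edges(go_edges, node2idx)

print(common_nodes)
print(ppi_idx_edges)
print(go_idx_edges)
```

After training, `idx2node` can be used to translate embedding rows back to the original node names. The same pattern should work if the node names are strings rather than integers.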
I mean the ID or name of the node in the database, for example the protein IDs in IntAct, which use UniProt IDs such as 'uniprotkb:P49418' and 'uniprotkb:O43426'.
Here is a link with more information on the IntAct dataset: https://github.com/sezinata/SurveyDGP ("Recent advances in network-based methods for disease gene prediction")
Thanks a lot!
Hi, can you provide the code for data preprocessing? Specifically, I am asking about the ID or name of the nodes in the dataset.