liu-yushan / PoLo

Apache License 2.0

How to calculate the rule confidence of the Metapaths? #4

LuMflowers closed this issue 1 year ago

LuMflowers commented 2 years ago

Hi, could you share the code for calculating the rule confidence of the metapaths? I didn't get the right rule confidences following your description.

liu-yushan commented 2 years ago

Hi, do you mean the metapaths in Table 2? Since the confidence estimation is based on sampling, the confidences might vary slightly between runs. Some relations are symmetric, e.g., CrC means the same as _CrC, so one rule can have several formulations, e.g., ["CtD", "CrC", "CtD"] represents the same rule as ["CtD", "_CrC", "CtD"]. In such cases, we estimate the confidence for each formulation separately and take the average. All equivalent formulations are also included in datasets/Hetionet/rules.txt.
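To make the averaging over equivalent formulations concrete, here is a minimal sketch. It is not the script from the repository: the set `SYMMETRIC`, the helper names, and the `estimate` callable (standing in for whatever sampling-based estimator you use) are all assumptions made for illustration.

```python
from itertools import product
from statistics import mean

# Assumption: relations that mean the same in both directions.
# Illustrative only; check the dataset for the actual list.
SYMMETRIC = {"CrC", "DrD"}


def equivalent_formulations(rule):
    """List every spelling of `rule` obtained by writing each
    symmetric relation either plainly or with the "_" prefix."""
    options = []
    for rel in rule:
        base = rel[1:] if rel.startswith("_") else rel
        if base in SYMMETRIC:
            options.append([base, "_" + base])
        else:
            options.append([rel])
    return [list(combo) for combo in product(*options)]


def averaged_confidence(rule, estimate):
    """Average a confidence estimator over all equivalent spellings.

    `estimate` is a hypothetical callable mapping a rule (list of
    relation names) to a confidence value.
    """
    return mean(estimate(r) for r in equivalent_formulations(rule))


print(equivalent_formulations(["CtD", "_CrC", "CtD"]))
# [['CtD', 'CrC', 'CtD'], ['CtD', '_CrC', 'CtD']]
```

A rule without any such relation has a single formulation, so the average reduces to the plain estimate.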

liu-yushan commented 2 years ago

We also estimate the confidence of both the rule and its inverse, e.g., ["Compound", "CpD", "Disease"] and ["Disease", "_CpD", "Compound"], and take the average to get a better estimate.
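As a rough illustration of the rule/inverse averaging, the sketch below inverts a rule written in the alternating form [node type, relation, node type, ...] used in the example above. The function names and the `estimate` callable are hypothetical and not taken from the repository.

```python
from statistics import mean


def invert_relation(rel):
    """Toggle the "_" prefix that marks an inverse relation."""
    return rel[1:] if rel.startswith("_") else "_" + rel


def invert_rule(rule):
    """Reverse a rule of the form [type, relation, type, ...].

    Assumption: node types sit at even positions and relations at
    odd positions, as in ["Compound", "CpD", "Disease"].
    """
    reversed_rule = rule[::-1]
    return [
        invert_relation(tok) if i % 2 == 1 else tok
        for i, tok in enumerate(reversed_rule)
    ]


def rule_and_inverse_confidence(rule, estimate):
    """Average a hypothetical estimator over a rule and its inverse."""
    return mean([estimate(rule), estimate(invert_rule(rule))])


print(invert_rule(["Compound", "CpD", "Disease"]))
# ['Disease', '_CpD', 'Compound']
```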

liu-yushan commented 2 years ago

The script for calculating the confidences can be found here: https://github.com/liu-yushan/PoLo/tree/main/datasets/Hetionet/preprocessing.

LuMflowers commented 1 year ago

Thank you for the code. I still have a question about the files in ../datasets/Hetionet/preprocessing/: how can I obtain files such as node_edges.json and metapath_p3.json?

liu-yushan commented 1 year ago

node_edges.json is a dictionary that lists every node's neighbors, grouped by relation and node type. You can build it by going through all triples and adding the corresponding entries to the dictionary. metapath_p3.json lists all possible metapaths between compounds and diseases up to length 3. These are derived from the metagraph (see https://het.io/about/).
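For readers who want to regenerate both files, here is one possible sketch. The exact schema of the original JSON files is not shown in this thread, so everything below is an assumption: node ids of the form "Type::identifier", the nesting node -> relation -> neighbor type -> neighbors, the "_" prefix for inverse edges, and the function names.

```python
import json
from collections import defaultdict


def node_type(node):
    # Assumption: ids look like "Compound::DB00014".
    return node.split("::", 1)[0]


def build_node_edges(triples):
    """Map node -> relation -> neighbor type -> list of neighbors."""
    node_edges = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for head, rel, tail in triples:
        node_edges[head][rel][node_type(tail)].append(tail)
        # Assumption: inverse edges are stored with a "_" prefix.
        node_edges[tail]["_" + rel][node_type(head)].append(head)
    return node_edges


def enumerate_metapaths(metaedges, start, end, max_len):
    """List all metapaths from `start` to `end` type with at most
    `max_len` relations, following metaedges in both directions."""
    out = defaultdict(list)
    for src, rel, dst in metaedges:
        out[src].append((rel, dst))
        out[dst].append(("_" + rel, src))
    paths, frontier = [], [[start]]
    for _ in range(max_len):
        next_frontier = []
        for path in frontier:
            for rel, dst in out[path[-1]]:
                extended = path + [rel, dst]
                next_frontier.append(extended)
                if dst == end:
                    paths.append(extended)
        frontier = next_frontier
    return paths


# Toy data for illustration only, not the real Hetionet triples.
triples = [("Compound::DB1", "CtD", "Disease::D1")]
metaedges = [("Compound", "CtD", "Disease"), ("Compound", "CrC", "Compound")]

node_edges_json = json.dumps(build_node_edges(triples))
metapaths = enumerate_metapaths(metaedges, "Compound", "Disease", 3)
```

Because every metaedge is followed in both directions, a relation between two nodes of the same type (such as CrC) shows up both plainly and with the "_" prefix, which matches the equivalent formulations mentioned earlier in the thread. The real metagraph has many more metaedges than this toy list.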