How can I use the existing HinSAGELinkGenerator to predict links of different edge types between different node type pairs? In my example, I've got a knowledge graph consisting of drugs, diseases, proteins, functions and side effects.
To give a better idea of what I am trying to achieve, I would like to predict:
- which type of link exists between a drug node and a disease node (out of multiple edge types: increasing, decreasing)
- which type of link exists between a drug node and a protein node (out of multiple edge types: inhibiting, activating, etc.)
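To make these targets concrete, here is a minimal sketch in plain Python of how I picture the training data: one multi-class task per node type pair, with the edge type as the class label. The node IDs, the `EDGES` list and the `build_link_tasks` helper are all hypothetical and not part of StellarGraph.

```python
from collections import defaultdict

# Hypothetical labelled edges: (source, source_type, target, target_type, edge_type).
# All node IDs are made up for illustration.
EDGES = [
    ("d1", "drug", "dis1", "disease", "increasing"),
    ("d1", "drug", "dis2", "disease", "decreasing"),
    ("d2", "drug", "p1", "protein", "inhibiting"),
    ("d2", "drug", "p2", "protein", "activating"),
    ("d1", "drug", "p1", "protein", "inhibiting"),
]


def build_link_tasks(edges):
    """Group edges by (source_type, target_type) and encode edge types as class ids."""
    grouped = defaultdict(list)
    for src, src_type, dst, dst_type, edge_type in edges:
        grouped[(src_type, dst_type)].append((src, dst, edge_type))

    tasks = {}
    for pair, rows in grouped.items():
        # Sort the class names so the integer encoding is deterministic.
        classes = sorted({edge_type for _, _, edge_type in rows})
        class_id = {name: i for i, name in enumerate(classes)}
        tasks[pair] = {
            "classes": classes,
            "links": [(src, dst) for src, dst, _ in rows],
            "labels": [class_id[edge_type] for _, _, edge_type in rows],
        }
    return tasks


tasks = build_link_tasks(EDGES)
for pair, task in tasks.items():
    print(pair, task["classes"], task["labels"])
```

Under this reading, each node type pair would get its own set of (source, target) links and integer labels, which is the shape I would expect to feed into a link generator for that pair.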
The documentation states:
"one approach to obtain embeddings for all nodes in a heterogeneous graph would be to run this model separately for each node type"
Could you elaborate on this? Does it mean that I have to train five different models, one per node type (disease, drug, protein, side effect, function), each of which can predict multiple edge types? Or do I have to train one model per edge type?
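A third reading would be one model per pair of node types. Below is a minimal sketch, in plain Python, of what splitting the graph that way would look like. The node IDs, the `has_function` and `causes` edge types, and the `split_by_type_pair` helper are hypothetical and only serve as illustration; none of this is StellarGraph API.

```python
# Hypothetical typed nodes: node id -> node type.
NODE_TYPES = {
    "d1": "drug",
    "dis1": "disease",
    "p1": "protein",
    "f1": "function",
    "s1": "side effect",
}

# Hypothetical edges: (source, target, edge_type).
EDGES = [
    ("d1", "dis1", "decreasing"),
    ("d1", "p1", "inhibiting"),
    ("p1", "f1", "has_function"),  # made-up edge type
    ("d1", "s1", "causes"),        # made-up edge type
]


def split_by_type_pair(node_types, edges):
    """Return one subgraph (node set plus edge list) per unordered pair of node types."""
    subgraphs = {}
    for src, dst, edge_type in edges:
        key = frozenset((node_types[src], node_types[dst]))
        sub = subgraphs.setdefault(key, {"nodes": set(), "edges": []})
        sub["nodes"].update((src, dst))
        sub["edges"].append((src, dst, edge_type))
    return subgraphs


subgraphs = split_by_type_pair(NODE_TYPES, EDGES)
for key, sub in subgraphs.items():
    print(sorted(key), sorted(sub["nodes"]), sub["edges"])
```

One consequence visible in the sketch is that the drug/disease subgraph no longer contains `p1`, so a model trained on it could not use protein neighbours of a drug when sampling. That is part of why I am unsure whether splitting is the intended approach.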
Would I have to split the graph into subgraphs of two node types each? I ask because I created a StellarGraph object from a heterogeneous graph with 5 node types and then got the following error:
Description
Enable the HinSAGE model to do link prediction on a heterogeneous graph with more than 2 node types and multiple edge types.
User Story
I am trying to predict links in a custom heterogeneous graph and want to use HinSAGE, the GraphSAGE implementation for heterogeneous graphs, for this. Both the HinSAGE link prediction tutorial and the documentation for unsupervised node feature learning with HinSAGE state that only two head node types are accepted.
Would be great to get help on this!