graph4ai / graph4nlp

Graph4nlp is a library for the easy use of Graph Neural Networks for NLP. Visit our DLG4NLP website (https://dlg4nlp.github.io/index.html) for various learning resources!
Apache License 2.0
1.67k stars, 200 forks

Difference between heterogeneous and as_node #529

Closed. nashid closed this issue 2 years ago

nashid commented 2 years ago

❓ Questions and Help

I am implementing a heterogeneous graph.

I see the following section in the config file:

  graph_construction_private:
    edge_strategy: 'heterogeneous'
    merge_strategy: 'tailhead'
    sequential_link: true
    as_node: false

If I set as_node: true in the config file and then set the edge attribute with the following snippet, is that enough to implement the as_node graph?

ret_graph.edge_attributes[edge_idx]["token"] = "some-token"

AlanSwift commented 2 years ago

Note: heterogeneous GNNs are not supported. Currently the edge type is only stored in the graph data, and the edge information is dropped in the pipeline. as_node, however, is supported.

For example, take node A --edge type C--> node B (node A and node B are connected by an edge of type C). With the heterogeneous strategy, edge type C is stored on the edge. With as_node, edge C is converted into a new node whose value is C, which turns the graph into a Levi graph. Please refer to our paper, Graph Neural Networks for Natural Language Processing: A Survey, for more details.
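The conversion described above can be sketched in plain Python. This is only an illustration of the Levi-graph idea, not graph4nlp code: the function name `to_levi_graph` and the list/tuple layout of its arguments are assumptions made for this example.

```python
# Illustrative sketch only: plain Python, no graph4nlp calls.
# `to_levi_graph` and its argument layout are hypothetical.

def to_levi_graph(node_tokens, typed_edges):
    """Turn an edge-labelled graph into a Levi graph.

    node_tokens: list of node labels, indexed by node id.
    typed_edges: list of (src, edge_type, tgt) triples.
    Returns (levi_tokens, levi_edges), where every typed edge has
    become an extra node connected by two untyped edges.
    """
    levi_tokens = list(node_tokens)  # copy so the input is not mutated
    levi_edges = []
    for src, edge_type, tgt in typed_edges:
        edge_node = len(levi_tokens)      # id of the new "edge node"
        levi_tokens.append(edge_type)     # the edge type becomes the node value
        levi_edges.append((src, edge_node))
        levi_edges.append((edge_node, tgt))
    return levi_tokens, levi_edges


# node A --edge type C--> node B
tokens, edges = to_levi_graph(["A", "B"], [(0, "C", 1)])
print(tokens)  # ['A', 'B', 'C']
print(edges)   # [(0, 2), (2, 1)]
```

After the conversion no edge carries a type any more, which is why a pipeline that drops edge information can still see C: it is now an ordinary node.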

nashid commented 2 years ago

@AlanSwift thanks for the pointers. One thing is still not clear to me: how do I create the graph with as_node? Does the following code create an as_node graph?

Let's say nodeA is connected to nodeB by an edge whose token is has.

nodeA --(edge: has)--> nodeB

And I do:

ret_graph.edge_attributes[edge_idx]["token"] = "has"

And I set as_node: true in the config:

  graph_construction_private:
    edge_strategy: 'heterogeneous'
    merge_strategy: 'tailhead'
    sequential_link: true
    as_node: true

Does this create the as_node graph?

(Or do I have to implement it manually?)

AlanSwift commented 2 years ago

No. Please refer to the docs for a detailed introduction.

nashid commented 2 years ago

@AlanSwift thanks for the pointers. I see:

    elif edge_strategy == "as_node":
        # insert a node
        node_idx = ret_graph.get_node_num()
        ret_graph.add_nodes(1)
        ret_graph.node_attributes[node_idx]['type'] = 3  # 3 for edge node
        ret_graph.node_attributes[node_idx]['token'] = dep_info['edge_type']
        ret_graph.node_attributes[node_idx]['position_id'] = None
        ret_graph.node_attributes[node_idx]['head'] = False
        ret_graph.node_attributes[node_idx]['tail'] = False
        # add edge infos
        ret_graph.add_edge(dep_info['src'], node_idx)
        ret_graph.add_edge(node_idx, dep_info['tgt'])

This essentially creates a new node with type 3 (the edge-node type), sets its token to the edge_type, and then inserts this new node between src and tgt.

In my case, I am building a graph from scratch. So I presume the only important part is to create a new node for each of my edges and to set the edge_type as its token. Am I correct?

I don't really need to set type, position_id, head, or tail in node_attributes. As long as I set token, I should be fine.

Please correct me if I have misunderstood anything here.

Thanks for the clarification.

AlanSwift commented 2 years ago

Yes. The type attribute is not the actual edge type; it is only there to make the implementation easier. Heterogeneous graphs are not supported yet. (We will support R-GNN soon.)
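For a graph built from scratch, the approach confirmed above might look roughly like the following. `TinyGraph` is a hypothetical stand-in that exposes only the calls appearing in the quoted snippet (`get_node_num`, `add_nodes`, `add_edge`, `node_attributes`); it is not the graph4nlp class, and the helper `add_edge_as_node` is likewise invented for this sketch. Whether other parts of the library read attributes besides token has not been checked here.

```python
# Illustrative sketch only. TinyGraph is a hypothetical stand-in, not graph4nlp.

class TinyGraph:
    """Minimal graph exposing the calls used in the quoted snippet."""

    def __init__(self):
        self.node_attributes = []  # one attribute dict per node
        self.edges = []            # list of (src, tgt) pairs

    def get_node_num(self):
        return len(self.node_attributes)

    def add_nodes(self, n):
        self.node_attributes.extend({} for _ in range(n))

    def add_edge(self, src, tgt):
        self.edges.append((src, tgt))


def add_edge_as_node(graph, src, tgt, edge_token):
    """Insert a node carrying `edge_token` between src and tgt (hypothetical helper)."""
    node_idx = graph.get_node_num()
    graph.add_nodes(1)
    # Only the token is set, as discussed in the thread.
    graph.node_attributes[node_idx]["token"] = edge_token
    graph.add_edge(src, node_idx)
    graph.add_edge(node_idx, tgt)
    return node_idx


# nodeA --(edge: has)--> nodeB
g = TinyGraph()
g.add_nodes(2)
g.node_attributes[0]["token"] = "nodeA"
g.node_attributes[1]["token"] = "nodeB"
has_idx = add_edge_as_node(g, 0, 1, "has")
print(has_idx, g.edges)  # 2 [(0, 2), (2, 1)]
```

Note that nothing is written to edge_attributes in this sketch: the edge label lives entirely on the inserted node.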