acbull / pyHGT

Code for "Heterogeneous Graph Transformer" (WWW'20), which is based on pytorch_geometric
MIT License

HGT Implementation on a different heterogeneous graph #30

Open harshsarda opened 3 years ago

harshsarda commented 3 years ago

Hi. Thanks for sharing your code. I want to run the HGT model on my own heterogeneous graph, but my data has no temporal element. As far as I can tell, I would have to modify the function sample_subgraph, since it calls the feature_OAG extractor on line 176 of data.py:

feature, times, indxs, texts = feature_extractor(layer_data, graph)

Can you give me some idea of how to modify it, and also the "to_torch" function, so that I can use the HGT model for my use case or, more generally, for any heterogeneous graph?

acbull commented 3 years ago

Hi. (1) If you don't have temporal data, that's fine: simply set every timestamp to 0 and set use_RTE to False. (2) The feature_extractor is in utils.py. If you don't have complex features, you can use this one as a reference: https://github.com/acbull/pyHGT/blob/fa0ec296ca71a19f8c68277038371b4068a37667/ogbn-mag/pyHGT/utils.py#L93
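To illustrate point (1), here is a minimal sketch of a feature extractor for non-temporal data. It is not code from the repository: the function name zero_time_features, the node_feature argument, and the dictionary layouts are assumptions made for illustration. Only the four return values mirror the call quoted above. Plain Python lists stand in for the arrays a real pipeline would use.

```python
def zero_time_features(layer_data, node_feature):
    """Collect features per node type and give every sampled node time 0.

    Assumed (hypothetical) layouts:
      layer_data[node_type]   -> {node_id: anything}, the sampled nodes per type
      node_feature[node_type] -> list of feature vectors, indexed by node_id
    """
    feature, times, indxs = {}, {}, {}
    texts = []  # no text attributes in this sketch
    for node_type, sampled in layer_data.items():
        if not sampled:
            continue  # skip node types with no sampled nodes
        ids = list(sampled.keys())
        feature[node_type] = [node_feature[node_type][i] for i in ids]
        times[node_type] = [0] * len(ids)  # constant timestamp: no temporal signal
        indxs[node_type] = ids
    return feature, times, indxs, texts


# Toy graph with three node types; "tag" has no sampled nodes.
layer_data = {"user": {0: None, 2: None}, "item": {1: None}, "tag": {}}
node_feature = {
    "user": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    "item": [[1.0, 1.1], [2.0, 2.1]],
    "tag": [],
}
feature, times, indxs, texts = zero_time_features(layer_data, node_feature)
```

Because every timestamp is the same, the time input carries no information, which is consistent with the advice to turn off use_RTE.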

harshsarda commented 3 years ago

So, basically for my use case, I need to use feature_MAG instead of feature_OAG as the feature extractor because feature_MAG is a generalized function unlike feature_OAG, which is specific to the OAG data. Thank you so much for your prompt response @acbull

acbull commented 3 years ago

feature_MAG is the simpler one (it just assumes that all nodes have the same type of feature). If that is not the case for your data, you can refer to feature_OAG to see how to add padding.
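As a rough idea of what padding can look like, the sketch below right-pads every feature vector with zeros so that all node types share one width. This is one common approach, not necessarily what feature_OAG does; the function name pad_features and the node type names are hypothetical, and plain lists stand in for arrays.

```python
def pad_features(features_by_type):
    """Right-pad every feature vector with zeros to the widest dimension.

    features_by_type[node_type] -> list of feature vectors (assumed layout).
    """
    # Widest feature vector across all node types (0 if there are none).
    width = max(
        (len(vec) for vecs in features_by_type.values() for vec in vecs),
        default=0,
    )
    return {
        node_type: [list(vec) + [0.0] * (width - len(vec)) for vec in vecs]
        for node_type, vecs in features_by_type.items()
    }


# "paper" nodes have 3 features, "author" nodes have only 1.
raw = {"paper": [[0.5, 0.1, 0.9]], "author": [[1.0], [2.0]]}
padded = pad_features(raw)
```

After padding, every node type has vectors of the same length, so a single input dimension can be used for the model.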

To preprocess your heterogeneous graph data in the first place, you can refer to:

https://github.com/acbull/pyHGT/blob/fa0ec296ca71a19f8c68277038371b4068a37667/ogbn-mag/preprocess_ogbn_mag.py

and

https://github.com/acbull/pyHGT/blob/fa0ec296ca71a19f8c68277038371b4068a37667/OAG/preprocess_OAG.py

Also, we highly recommend using our pre-training framework to train HGT with a reconstruction loss, as in: https://github.com/acbull/GPT-GNN

Please let me know if you have any further questions.

harshsarda commented 3 years ago

Oh okay. I do have nodes with different types of features, so I will have to refer to feature_OAG to add padding to the attributes.

Thank you so much. I will try following these steps and will let you know in case of any questions.