pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

`VarMisuse` Dataset + Examples #3693

Open realCrush opened 2 years ago

realCrush commented 2 years ago

🚀 Feature

Motivation

The paper "Learning to Represent Programs with Graphs" encodes computer programs as graphs with rich semantic information. However, most existing implementations for its VarMisuse dataset are based on TensorFlow, such as tf-gnn-samples.

Pitch

Include the VarMisuse dataset in PyG as a class under torch_geometric.datasets.

Additional context

Raw dataset provided by the authors: VarMisuse

More papers/models on this dataset: "GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation" and "On the Bottleneck of Graph Neural Networks and its Practical Implications".

rusty1s commented 2 years ago

I think this is a great suggestion :) If you do not have any plans to look into this, I can try to add this dataset as well.

realCrush commented 2 years ago

> I think this is a great suggestion :) If you do not have any plans to look into this, I can try to add this dataset as well.

Sure, please give it a try 🥳

realCrush commented 2 years ago

> I think this is a great suggestion :) If you do not have any plans to look into this, I can try to add this dataset as well.

By the way, I would really appreciate it if you could provide some model training examples on VarMisuse with PyG models, such as GNN-FiLM! 😋

arv-77 commented 2 years ago

If no one is working on this, can I take it up?

realCrush commented 2 years ago

> If no one is working on this, can I take it up?

Sure! Just do it.

amitamb commented 2 years ago

@Cyberpuncrush I have implemented it, but a few things still need to be done:

https://github.com/pyg-team/pytorch_geometric/pull/3978

  1. I could not host all the files, as they are quite large.
  2. I am not sure whether I should use a hetero dataset; I implemented it much like the Entities dataset.
  3. I haven't gotten around to writing an example yet, so I haven't tested it.
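To make the choice in point 2 concrete, here is a minimal sketch of the two layouts being weighed. It is not taken from the PR. It assumes the raw graphs can be read as a mapping from edge-type name to a list of (source, target) node pairs. The helper names `to_typed_edges` and `to_hetero_edges`, the `"node"` type label, and the toy edges are hypothetical. The first helper produces a single edge list plus one integer type per edge, the layout the Entities dataset uses. The second keeps one edge list per type, as a hetero dataset would.

```python
def to_typed_edges(edges_by_type):
    """Flatten {type_name: [(src, dst), ...]} into one edge list plus type ids.

    Returns (edge_index, edge_type, type_names), where edge_index is
    [sources, targets] and edge_type[i] is the integer id of edge i.
    """
    type_names = sorted(edges_by_type)  # sorted for deterministic type ids
    sources, targets, edge_type = [], [], []
    for type_id, name in enumerate(type_names):
        for src, dst in edges_by_type[name]:
            sources.append(src)
            targets.append(dst)
            edge_type.append(type_id)
    return [sources, targets], edge_type, type_names


def to_hetero_edges(edges_by_type, node_type="node"):
    """Keep one [sources, targets] list per (node, relation, node) triple."""
    return {
        (node_type, name, node_type): [
            [src for src, _ in pairs],
            [dst for _, dst in pairs],
        ]
        for name, pairs in edges_by_type.items()
    }


# Toy graph (made up for illustration): four nodes, three edge types.
sample_edges = {
    "NextToken": [(0, 1), (1, 2)],
    "Child": [(3, 0), (3, 1), (3, 2)],
    "LastUse": [(2, 0)],
}

edge_index, edge_type, type_names = to_typed_edges(sample_edges)
hetero = to_hetero_edges(sample_edges)
```

The flattened form can be fed directly to relational layers that take an `edge_index` together with an `edge_type` vector, which may be the simpler fit when all nodes share one feature space. The per-type form only becomes necessary once node types carry different features.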

Let me know if I am on the right track.

@arvindmuralie77 You can improve on this implementation if you wish.

realCrush commented 1 year ago

Any update on this dataset? Long-range problems on graphs are a hot topic these days, so this real, large-scale dataset would be great for probing the performance of GNNs.

rusty1s commented 1 year ago

The PR slipped under my radar. Let me try to revisit it. Sorry for any delay.

tsadigovAgmail commented 7 months ago

Is it still in progress?