tensorflow / neural-structured-learning

Training neural models with structured signals.
https://www.tensorflow.org/neural_structured_learning
Apache License 2.0
980 stars · 189 forks

gnn model implementation #61

Closed joshchang1112 closed 3 years ago

joshchang1112 commented 4 years ago

Hi,

I’m interested in Neural Structured Learning (graph neural networks) and want to implement some basic GNN models, such as GCN and GAT, using TensorFlow 2.0. I have some ideas on how to implement them, but I am not sure whether the model architecture is clear. Here is the rough code structure of the GNN I have in mind:

For the structure above, I referred loosely to the Deep Graph Library (DGL) repository. As mentioned, I want to implement some GNN models but don’t know if my idea is appropriate. My initial thought is that we could create a folder called gnn-survey-paper (or somewhere else) to hold these GNN model implementations. This is the first time I’ve posted an issue in the hope of contributing to open-source code, so if anything above is unclear or your team has any recommendations, please feel free to let me know. Thanks :)
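For context, a minimal dense GCN layer of the kind discussed here might look as follows in TensorFlow 2.0. This is a sketch, not the code from the eventual PR; the class name and the `(features, adjacency)` input convention are illustrative assumptions:

```python
import tensorflow as tf

class GraphConvLayer(tf.keras.layers.Layer):
    """One GCN layer: H' = act(A_hat @ H @ W) (Kipf & Welling, 2017),
    where A_hat is the normalized adjacency matrix."""

    def __init__(self, units, activation=None):
        super().__init__()
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        feature_shape, _ = input_shape  # shapes of (features, adjacency)
        self.w = self.add_weight(
            name="kernel", shape=(int(feature_shape[-1]), self.units),
            initializer="glorot_uniform")

    def call(self, inputs):
        features, adj_norm = inputs               # [N, F], [N, N]
        support = tf.matmul(features, self.w)     # project: [N, units]
        output = tf.matmul(adj_norm, support)     # aggregate neighbors
        return self.activation(output)

# Toy usage: 3 nodes with 4 features each; identity adjacency for brevity.
adj = tf.eye(3)
x = tf.random.normal([3, 4])
out = GraphConvLayer(8, activation="relu")((x, adj))
print(out.shape)  # (3, 8)
```

Note that the input feature dimension is inferred in `build`, so no `feature_dim` argument is needed in the constructor.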

Best regards, Josh

ppham27 commented 4 years ago

Hi Josh,

Thanks for bringing this up. We've also been thinking a lot about GNNs and were thinking of building upon https://github.com/deepmind/graph_nets.

This gives you two options, I think.

  1. You're free to start a research/gnn-survey-paper directory that we're happy to review. We would be very curious about what your use cases are for GNNs. With this option, you obviously have much more freedom and can move much faster. The drawback is that while we can review your code, there's not much guarantee of any additional support on our end. If we do find your implementation fits with our long-term goals, that may change, though.

  2. I hope to push out an experimental prototype using Graph Nets sometime early next week. It would be great if you could try it out, and we could evolve the design based on your feedback. The main drawback of this option is that it will force you to move slower.

What do you think?

joshchang1112 commented 4 years ago

Hi,

Thanks for your reply :) I’m very excited to try your team’s prototype built on Graph Nets, and I’ll read up on the Graph Nets library and its GitHub repository over the next few days. However, I also wonder whether the two options could be carried out simultaneously. I could implement some simple GNN models, or just a few functions, in the next few days and submit a pull request for you to review; maybe it could be combined with your ideas later. If not, it will at least make me more familiar with the architecture of GNN models. Either way sounds good to me, but I’ll defer to whatever your team thinks is better.

By the way, I’ve been trying the common datasets (Cora, Citeseer), and I also noticed the new graph dataset called "wordnet" recently added to the TFDS library. It’s very cool and convenient to use without any preprocessing. I hope the Cora and Citeseer datasets will be added to TFDS soon :)

Best regards, Josh

ppham27 commented 4 years ago

Hi Josh,

I actually contributed the wordnet dataset. Glad you're finding it useful. Cora was actually next on our list to add, but I'm somewhat short on bandwidth right now.

Sure, doing both in parallel sounds okay to me. I just don't want you to put in work that may or may not be used in the future. But if you're fine with your work being considered experimental, feel free to send PRs.

joshchang1112 commented 4 years ago

Hi Phil,

Very glad to hear that you contributed the wordnet dataset, and excited to see Cora in TFDS soon!

By the way, I ran into some personal issues in the past few days, and I'll be on a four-day trip this week, so I may not send a PR for the first option we discussed last week. Instead, I'll wait for the Graph Nets prototype your team is designing, and we can discuss how I can provide feedback on the design next week. Thanks :)

Best regards, Josh

joshchang1112 commented 4 years ago

Hi Phil,

I recently implemented the GCN model on Cora using TensorFlow 2.0. I was a little worried the code would get too long, so I haven't handled some details yet, such as weight initialization, biases, and error handling. I'll add them once the model structure is settled. I'll send a PR shortly; thanks in advance for reviewing my code.

Best regards, Josh

ppham27 commented 4 years ago

Hi Josh,

Sorry it took so long to review your implementation. Some changes and renamings had to be made that are particular to Google. I also took the liberty of removing the feature_dim argument, since it can be inferred.

Thanks for your contribution.

joshchang1112 commented 4 years ago

Hi Phil,

Very glad I could contribute my GNN implementation to Google, and thanks for reviewing my code. Next, I'd like to implement the Graph Attention Network (GAT) and make the gnn-survey folder more complete and flexible. I expect to finish it next week and will send a PR when it's done. Thanks :)
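For reference, a single-head, dense-adjacency GAT layer along these lines could be sketched as below. This is illustrative only (not the PR's actual code); the class name and the split of the attention vector into source/target halves are assumptions:

```python
import tensorflow as tf

class GraphAttentionLayer(tf.keras.layers.Layer):
    """Single-head GAT layer (Velickovic et al., 2018), dense variant:
    e_ij = LeakyReLU(a^T [W h_i || W h_j]), masked by the adjacency."""

    def __init__(self, units):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        feature_shape, _ = input_shape
        self.w = self.add_weight(
            name="kernel", shape=(int(feature_shape[-1]), self.units),
            initializer="glorot_uniform")
        # Attention vector a, split into "source" and "target" halves so
        # e_ij decomposes as a_src^T h_i + a_dst^T h_j.
        self.a_src = self.add_weight(name="a_src", shape=(self.units, 1),
                                     initializer="glorot_uniform")
        self.a_dst = self.add_weight(name="a_dst", shape=(self.units, 1),
                                     initializer="glorot_uniform")

    def call(self, inputs):
        features, adj = inputs                  # [N, F], [N, N] 0/1 mask
        h = tf.matmul(features, self.w)         # [N, units]
        # Broadcast [N, 1] + [1, N] into the full [N, N] score matrix.
        e = tf.nn.leaky_relu(tf.matmul(h, self.a_src) +
                             tf.transpose(tf.matmul(h, self.a_dst)))
        # Mask non-edges with a large negative value before the softmax.
        e = tf.where(adj > 0, e, -1e9 * tf.ones_like(e))
        alpha = tf.nn.softmax(e, axis=-1)       # attention coefficients
        return tf.nn.elu(tf.matmul(alpha, h))

# Toy usage: a fully connected 3-node graph.
adj = tf.ones([3, 3])
x = tf.random.normal([3, 4])
out = GraphAttentionLayer(8)((x, adj))
print(out.shape)  # (3, 8)
```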

Best regards, Josh

joshchang1112 commented 4 years ago

Hi Phil,

Sorry for not getting back to you earlier. I have some questions about the sparse versions of the graph neural networks.

First, I found that if I use tf.sparse.SparseTensor to implement the sparse version of GCN, it raises the error below when loading the best model for testing:

AttributeError: 'SparseTensorSpec' object has no attribute 'name'

I checked the documentation for TensorSpec and SparseTensorSpec in TensorFlow 2.3 and found that SparseTensorSpec does not have a name attribute. Indeed, the code in keras/saving/saved_model/load.py always returns a tensor_spec.TensorSpec, so a SparseTensor can trigger this error:

    def common_spec(x, y):
      return tensor_spec.TensorSpec(defun.common_shape(x.shape, y.shape),
                                    x.dtype, x.name)

For now I have simply removed x.name from the common_spec function, and it runs successfully, but I'd like to know whether there is a typo or a structural mistake in my own code that triggers this error.

Second, I'm currently writing the sparse version of GAT, but I'm not sure how to do the matrix multiplication in the sparse case. I referenced this GitHub repo (https://github.com/Diego999/pyGAT/blob/master/layers.py) and found that they wrote a customized function for it. Would it be okay to follow the concept of their implementation, or are there better ways (or other repos) showing how to implement it? Thank you very much :)
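For the aggregation step specifically, TensorFlow's built-in `tf.sparse.sparse_dense_matmul` covers the sparse-adjacency-times-dense-features product without densifying the graph. A minimal sketch (the toy graph is illustrative):

```python
import tensorflow as tf

# Sparse adjacency for a 3-node path graph, in COO form.
adj_sp = tf.sparse.SparseTensor(
    indices=[[0, 1], [1, 0], [1, 2], [2, 1]],
    values=[1.0, 1.0, 1.0, 1.0],
    dense_shape=[3, 3])
adj_sp = tf.sparse.reorder(adj_sp)  # ensure canonical row-major ordering

h = tf.random.normal([3, 8])  # node features after the dense W projection

# Neighborhood aggregation A @ H, keeping A sparse throughout.
out = tf.sparse.sparse_dense_matmul(adj_sp, h)
print(out.shape)  # (3, 8)
```

The pyGAT-style custom function is mainly needed for the edge-wise attention softmax, which this built-in does not cover by itself.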

Best regards, Josh

joshchang1112 commented 3 years ago

Hi Phil,

I just finished the sparse version of the graph attention network, though I'm still a little unsure about my implementation. I'll send the PR first, and if the structure needs modifications, just let me know. Also, I'd like to know whether I could add a document, like a README, on how to execute my code correctly; this may help other people understand my architecture and parameter settings more clearly.

By the way, did you see the error I mentioned before? Feel free to check when you're available. Thanks :)

Best regards, Josh

ppham27 commented 3 years ago

Hi Josh, sorry for the delayed reply. I must have missed it while on vacation. Thanks for your work on this.

That looks like a Keras bug to me. I'll follow up with that team to see how we might fix it.

Yes, feel free to include a README in the gnn-survey directory.

ppham27 commented 3 years ago

> Hi Phil,
>
> Sorry for not getting back to you earlier, and I have some questions about sparse version of graph neural network.
>
> First, I found that if I use tf.sparse.SparseTensor to implement sparse version of gcn, it will output error below when loading the best model to test:
>
> AttributeError: 'SparseTensorSpec' object has no attribute 'name'

Hi @joshchang1112 ,

I was able to check in https://github.com/tensorflow/tensorflow/commit/42c6cc5780159dc3a536757cce4b182a6be35448, so that error should be fixed in tf-nightly tomorrow.

joshchang1112 commented 3 years ago

Hi Phil,

Very glad to hear that! Thank you very much:)

joshchang1112 commented 3 years ago

Hi Phil,

I recently wanted to implement GraphSAGE and the Graph Isomorphism Network (GIN) on the Cora dataset, but I'm not sure my structure and parameters are correct. The original papers for these two GNN models did not use Cora in their experiments, so I randomly selected some parameters and tuned them until the models reached accuracy similar to GCN. I'll send a PR with the Graph Isomorphism Network implementation first; if I have any typos or misunderstandings about the model structure, just feel free to let me know :) Thanks~
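For reference, the GIN update rule is compact enough to sketch in a few lines. This is illustrative only, not the PR's actual code; the class name, the two-layer MLP, and the learnable-epsilon choice are assumptions:

```python
import tensorflow as tf

class GINLayer(tf.keras.layers.Layer):
    """One GIN layer (Xu et al., 2019):
    h_i' = MLP((1 + eps) * h_i + sum_{j in N(i)} h_j)."""

    def __init__(self, units):
        super().__init__()
        # Learnable epsilon, initialized to zero as in the paper's GIN-eps.
        self.eps = self.add_weight(name="eps", shape=(),
                                   initializer="zeros")
        self.mlp = tf.keras.Sequential([
            tf.keras.layers.Dense(units, activation="relu"),
            tf.keras.layers.Dense(units)])

    def call(self, inputs):
        features, adj = inputs                   # [N, F], [N, N]
        neighbor_sum = tf.matmul(adj, features)  # sum aggregation
        return self.mlp((1.0 + self.eps) * features + neighbor_sum)

# Toy usage: 3 nodes, 4 features, identity adjacency for brevity.
adj = tf.eye(3)
x = tf.random.normal([3, 4])
out = GINLayer(8)((x, adj))
print(out.shape)  # (3, 8)
```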

Best Regards, Josh

joshchang1112 commented 3 years ago

Hi Phil,

I'd like to add other datasets (Citeseer, Pubmed) and make them available to the GNN models, and I want to know whether you have any other recommendations for this work.

Besides, my undergraduate research was in NLP, and I'm interested in whether GNN models can be applied to transformer-based models (e.g., BERT, RoBERTa). I believe it could significantly improve some NLP tasks! But I don't have much experience in this area, such as with GraphBERT or other relevant models. I'd like to ask whether you have any ideas or advice on applying GNNs to the most popular NLP models, or whether there are some works or datasets I could reference. Thank you very much :)

By the way, I found that the README in gnn-survey has some typos in the Code Usage and Reference sections. I fixed them in a past PR, but it seems the fix didn't go through because of some problems. Could you help me fix that, or tell me how I can modify it? Thanks~

Best Regards, Josh

ppham27 commented 3 years ago

Hi Josh,

I think the most natural way to combine transformers with GNNs would be to embed each example with BERT and then run a GNN on top of that. You run into the problem that you won't have enough memory to back-propagate if you embed the entire graph, though.

Pretrained models often work fine with small batch sizes, so you could look into sampling the graph with techniques like GraphSAGE.
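The sampling idea above can be sketched with a minimal GraphSAGE-style uniform neighbor sampler. This is a plain-Python illustration under assumed conventions (adjacency-list input, hypothetical function name), not code from any library:

```python
import random

def sample_neighbors(adj_list, nodes, fanout, seed=None):
    """Uniformly sample up to `fanout` neighbors per node, so a BERT
    encoder only has to embed the small sampled subgraph per batch."""
    rng = random.Random(seed)
    sampled = {}
    for node in nodes:
        neighbors = adj_list.get(node, [])
        if len(neighbors) <= fanout:
            sampled[node] = list(neighbors)   # keep all neighbors
        else:
            sampled[node] = rng.sample(neighbors, fanout)
    return sampled

# Toy citation graph as an adjacency list.
adj_list = {0: [1, 2, 3, 4], 1: [0], 2: [0, 3]}
batch = sample_neighbors(adj_list, nodes=[0, 2], fanout=2, seed=7)
print({n: len(v) for n, v in batch.items()})  # {0: 2, 2: 2}
```

Each sampled node (and its sampled neighbors) is then embedded and aggregated, capping memory per training step regardless of graph size.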

Another approach would be to only embed a small number of the nodes and use a lookup table for the rest. @chunta-lu is working on some infrastructure to do this, where a separate job would update the table asynchronously with batch inference.

In any case, I don't think it is easy; it is very much an open research problem!

joshchang1112 commented 3 years ago

Hi Phil,

I'm so glad to hear your advice, and I'll try the sampling techniques first to see whether they work for my task :) Although it is indeed a difficult problem, I'm excited to conduct research in this area! Thank you so much for the advice!

Best Regards, Josh