@monk1337
Hi, thanks for the question. And I am sorry for the delay. I was traveling in China recently.
The GCN used in this project is inherently transductive. All nodes are included in training and all weights have been learned after training.
You can see how to save and restore models at: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/4_Utils/save_restore_model.py
Basically you can focus on these lines:
saver = tf.train.Saver()

save_path = saver.save(sess, model_path)
print("Model saved in file: %s" % save_path)

saver.restore(sess, model_path)
print("Model restored from file: %s" % save_path)
You can put these lines at the proper positions in https://github.com/yao8839836/text_gcn/blob/master/train.py
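For example, here is a minimal sketch of how the saver could be wired into a TF1-style training script; the toy one-layer model, the checkpoint path, and the variable names are placeholders for illustration, not the actual objects in train.py:

import tensorflow as tf

# toy stand-in for the model built in train.py (placeholder, not the real GCN)
x = tf.placeholder(tf.float32, [None, 4])
w = tf.Variable(tf.zeros([4, 2]))
logits = tf.matmul(x, w)

saver = tf.train.Saver()
model_path = "/tmp/text_gcn_model.ckpt"  # hypothetical checkpoint path

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... training loop goes here ...
    save_path = saver.save(sess, model_path)
    print("Model saved in file: %s" % save_path)

# later, e.g. at inference time, restore the learned weights into a new session
with tf.Session() as sess:
    saver.restore(sess, model_path)
    print("Model restored from file: %s" % save_path)
    preds = sess.run(logits, feed_dict={x: [[0., 1., 2., 3.]]})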
Hi, thank you for your work, which I find highly interesting.
I have the following question. The input of the model, if I'm not mistaken, is the following:
number of docs + size of the vocabulary.
I'd like to know how to make a prediction on brand-new data: how do I represent the data to be predicted as an input feature X? Furthermore, do I need to recompute a new graph A which takes the new data into account? If so, I don't see how we can use this model on the new data without retraining it.
I hope you can enlighten me on this matter.
@Lescurel
Hi, thanks for your interest.
You are right: the input of the model is as you describe. The test nodes (brand-new data without labels) are included in training.
The current model cannot make predictions on brand-new data without retraining, because the GCN we use is transductive. We have pointed this out in the "Discussion" section of our paper. There are some inductive GCN variants which can make predictions on brand-new data without retraining:
[1] Hamilton, W.; Ying, R.; and Leskovec, J. 2017. Inductive representation learning on large graphs. In NIPS, 1024–1034.
[2] Chen, J.; Ma, T.; and Xiao, C. 2018. FastGCN: Fast learning with graph convolutional networks via importance sampling. In ICLR.
We tried their code, but it seems they don't work well with one-hot features. We are also trying to solve this problem in our own way. A possible simple solution is to build the graph without test docs (or even without any docs). When a new doc (say d_100) arrives, we look up the word embeddings (for words in d_100) learned by GCN and do some pooling (mean, average, LSTM) to generate a doc embedding for d_100; we can then select the dimension with the highest value in the doc embedding as the predicted label.
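As a rough sketch of that pooling idea (assuming the learned word embeddings have already been exported to a NumPy array with one row per vocabulary word and one dimension per class; the function and variable names here are illustrative, not part of the released code):

import numpy as np

def predict_new_doc(doc_tokens, word_embeddings, vocab):
    # mean-pool the GCN word embeddings of the in-vocabulary words,
    # then take the argmax dimension as the predicted label
    rows = [word_embeddings[vocab[w]] for w in doc_tokens if w in vocab]
    if not rows:
        return None  # no known words, cannot predict
    doc_embedding = np.mean(rows, axis=0)
    return int(np.argmax(doc_embedding))

# hypothetical usage:
# vocab = {"graph": 0, "convolution": 1, ...}           # word -> row index
# word_embeddings = np.load("word_embeddings.npy")      # shape: (vocab_size, num_classes)
# label = predict_new_doc("graph convolution for text".split(), word_embeddings, vocab)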
Thank you, I will look into it.
@Lescurel
Hi, I have found an inductive way to train Text GCN, which can make predictions on brand-new data without retraining. I used a two-layer approximation version of FastGCN [1]:
https://github.com/matenure/FastGCN/blob/master/pubmed_inductive_appr2layers.py
This inductive GCN version also supports mini-batch training. The test accuracy for 20NG is about 0.80 with rank0 = 100 and rank1 = 100, which is lower than the 0.8634 produced by our transductive Text GCN.
[1] Chen, J.; Ma, T.; and Xiao, C. 2018. FastGCN: Fast learning with graph convolutional networks via importance sampling. In ICLR.
Hi, I have trained the model, but I am not able to use the trained weights and model at prediction (inference) time.
Can you provide a simple example for that?
Thank you!