yao8839836 / text_gcn

Graph Convolutional Networks for Text Classification. AAAI 2019

How to use saved model and weights for prediction? #13

Closed: monk1337 closed this issue 4 years ago

monk1337 commented 5 years ago

Hi, I have trained the model, but I am not able to use the trained weights and model at prediction (inference) time.

Can you provide a simple example of how to do that?

Thank you!

yao8839836 commented 5 years ago

@monk1337

Hi, thanks for the question, and sorry for the delay; I was traveling in China recently.

The GCN used in this project is inherently transductive: all nodes are included in training, so all weights have already been learned once training finishes.

You can see how to save and restore models at: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/4_Utils/save_restore_model.py

Basically you can focus on these lines:


saver = tf.train.Saver()

save_path = saver.save(sess, model_path)
print("Model saved in file: %s" % save_path)

saver.restore(sess, model_path)
print("Model restored from file: %s" % save_path)


You can put these lines at the appropriate positions in https://github.com/yao8839836/text_gcn/blob/master/train.py
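For concreteness, here is a minimal sketch (not the repo's actual code) of where those calls could sit in train.py; the graph-building and training-loop pieces are stood in for by comments, and model_path is a hypothetical checkpoint location:

import tensorflow as tf

model_path = "checkpoints/text_gcn.ckpt"  # hypothetical checkpoint path

# ... build the model graph (placeholders, GCN layers, loss, optimizer) ...

saver = tf.train.Saver()  # create the saver after the graph is fully defined

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # ... the existing training loop of train.py runs here ...

    save_path = saver.save(sess, model_path)
    print("Model saved in file: %s" % save_path)

# Later, e.g. in a separate run: rebuild the same graph, then restore the
# learned weights instead of re-initializing them.
with tf.Session() as sess:
    saver.restore(sess, model_path)
    print("Model restored from file: %s" % save_path)
    # ... run the evaluation / prediction ops here ...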

Lescurel commented 5 years ago

Hi, thank you for your work, which I find highly interesting.

I have a follow-up question. If I'm not mistaken, the inputs of the model are the feature matrix X and the graph adjacency matrix A.

I'd like to know how to make a prediction on brand-new data: how do I represent the new data as an input feature matrix X? Furthermore, do I need to recompute a new graph A that takes the new data into account? If so, I don't see how we can use this model on new data without retraining it.

I hope you can enlighten me on this matter.

yao8839836 commented 5 years ago

@Lescurel

Hi, thanks for your interest.

You are right, the inputs of the model are as you describe. The test nodes (brand-new data without labels) are included in training.

The current model cannot make predictions on brand-new data without retraining, because the GCN we use is transductive. We point this out in the "Discussion" section of our paper. There are some inductive GCN variants that can make predictions on brand-new data without retraining:

[1] Hamilton, W.; Ying, R.; and Leskovec, J. 2017. Inductive representation learning on large graphs. In NIPS, 1024–1034.

[2] Chen, J.; Ma, T.; and Xiao, C. 2018. FastGCN: Fast learning with graph convolutional networks via importance sampling. In ICLR.

We tried their code, but it seems they do not work well with one-hot features. We are also trying to solve this problem in our own way. A possible simple solution is to build the graph without the test docs (or even without any docs). When a new doc (say d_100) comes, we look up the word embeddings (for the words in d_100) learned by the GCN and do some pooling (e.g., mean pooling or an LSTM) to generate a doc embedding for d_100; we can then select the dimension with the highest value in the doc embedding as the predicted label.
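If it helps, here is a minimal NumPy sketch of that pooling idea under stated assumptions: word_embeddings is a (vocab_size, num_classes) matrix of word-node outputs taken from the trained GCN's final layer, and word_to_id maps vocabulary words to row indices; both names are placeholders, not identifiers from this repo.

import numpy as np

def predict_new_doc(tokens, word_embeddings, word_to_id):
    """Mean-pool the learned embeddings of a new doc's in-vocabulary
    words, then take the argmax dimension as the predicted class."""
    ids = [word_to_id[w] for w in tokens if w in word_to_id]
    if not ids:
        return None  # no known words, nothing to pool
    doc_embedding = word_embeddings[ids].mean(axis=0)  # mean pooling
    return int(np.argmax(doc_embedding))               # highest dimension wins

# Example usage: predict_new_doc("the team won the game".split(), E, vocab_map)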

Lescurel commented 5 years ago

Thank you, I will look into it.

yao8839836 commented 5 years ago

@Lescurel

Hi, I have found an inductive way to train Text GCN that can make predictions on brand-new data without retraining: I used a two-layer approximation version of FastGCN [1]:

https://github.com/matenure/FastGCN/blob/master/pubmed_inductive_appr2layers.py

This inductive GCN version also supports mini-batch training. The test accuracy on 20NG is about 0.80 with rank0 = 100 and rank1 = 100, lower than the 0.8634 produced by our transductive Text GCN.
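In case it clarifies the approach, here is a rough NumPy sketch of the importance-sampling step behind FastGCN's layer-wise approximation [1]; it is not the linked script, and A_hat, H, W, and t are placeholder names for the normalized adjacency, the layer input, the layer weights, and the sample size (the rank0/rank1 values above play the role of t for each layer):

import numpy as np

def sampled_gcn_layer(A_hat, H, W, t, rng=np.random.default_rng(0)):
    # Sample t nodes with probability proportional to the squared
    # column norms of A_hat, FastGCN's importance distribution.
    col_norms = np.linalg.norm(A_hat, axis=0) ** 2
    q = col_norms / col_norms.sum()
    idx = rng.choice(A_hat.shape[1], size=t, p=q)
    # Unbiased Monte Carlo estimate of A_hat @ H @ W using only the
    # sampled rows of H, each rescaled by 1 / (t * q).
    AH = A_hat[:, idx] @ (H[idx] / (t * q[idx])[:, None])
    return np.maximum(AH @ W, 0.0)  # ReLU activation

Stacking two such layers gives a two-layer approximation in the spirit of the linked script.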

[1] Chen, J.; Ma, T.; and Xiao, C. 2018. FastGCN: Fast learning with graph convolutional networks via importance sampling. In ICLR.