yao8839836 / text_gcn

Graph Convolutional Networks for Text Classification. AAAI 2019

Word embedding problems #20

Open yeshenpy opened 5 years ago

yeshenpy commented 5 years ago

I carefully read the code and found that the word embeddings are only used to calculate the adjacency matrix. Every epoch computes all the results, and the input is always a diagonal identity matrix without any word embeddings; the loss and accuracy are obtained through the mask trick. In a traditional network, the word embeddings would always be part of the input, and during backpropagation they would be updated if we wanted to fine-tune them. But Text GCN only uses a diagonal identity matrix as input; word embeddings are only used in data preprocessing. If we wanted to update the word embeddings, we would need to take the output of the GCN as the new embeddings and re-run the data preprocessing. This part confuses me, and I hope to get your help!

yeshenpy commented 5 years ago

The line features = sp.identity(features.shape[0]) makes the input a diagonal identity matrix.
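
To make the consequence concrete, here is a minimal sketch (with illustrative names, not the repo's exact code) of why one-hot input features turn the first GCN layer's weight matrix into a table of freely learnable node embeddings:

```python
# Minimal sketch (illustrative, not the repo's exact code): with one-hot
# input features X = I, the first GCN layer computes A_hat @ X @ W = A_hat @ W,
# so each row of W acts as a freely learnable embedding for one node.
import numpy as np
import scipy.sparse as sp

n_nodes = 5        # hypothetical count of word + doc nodes
hidden_dim = 3     # hypothetical embedding size

X = sp.identity(n_nodes)                   # the identity features from train.py
W = np.random.randn(n_nodes, hidden_dim)   # first-layer weight matrix

assert np.allclose(X @ W, W)               # X @ W == W: W is the embedding table
```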

yao8839836 commented 5 years ago

@yeshenpy

Yes, your description is correct. I tried using pre-trained word embeddings and their average as word/doc node features, but this performs worse than using one-hot features with features = sp.identity(features.shape[0]). So I commented out the lines in build_graph.py that read word embeddings and initialize the word embedding dictionary.

In fact, pre-trained word embeddings are never used and Text GCN learns predictive word embeddings as outputs.

I am sorry for this confusion.
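
As a hedged illustration of what "learns predictive word embeddings as outputs" means: the hidden activations of the trained GCN, taken at the word-node rows, can be read off as word embeddings. The names A_hat, W1, and word_node_indices below are assumptions, not the repo's API.

```python
# Hedged sketch of reading learned word embeddings out of a 2-layer GCN.
# A_hat, W1, and word_node_indices are assumed inputs, not names from the repo.
import numpy as np

def hidden_embeddings(A_hat, W1):
    """Layer-1 output with identity input features: ReLU(A_hat @ I @ W1)."""
    return np.maximum(A_hat @ W1, 0.0)

# H1 = hidden_embeddings(A_hat, W1)
# word_embeddings = H1[word_node_indices]   # one row per word node
```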

yeshenpy commented 5 years ago

Thanks again for your reply, which is a great help. So if the feature matrix is the identity matrix, the relationships between documents and words are determined entirely by word frequencies and word-document frequency statistics? Is that correct?

yao8839836 commented 5 years ago

@yeshenpy Yes, you are right.
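
For readers following along, a minimal sketch of the two weighting schemes those statistics feed into, per the Text GCN paper: TF-IDF for doc-word edges and positive PMI for word-word edges. The function names are illustrative, not copied from build_graph.py.

```python
# Hedged sketch of the Text GCN edge weights: TF-IDF for doc-word edges,
# positive PMI for word-word edges. Names are illustrative.
import math

def tfidf_weight(tf, n_docs, df):
    """Doc-word edge: term frequency times inverse document frequency."""
    return tf * math.log(n_docs / df)

def pmi_weight(p_ij, p_i, p_j):
    """Word-word edge: pointwise mutual information, kept only if positive."""
    val = math.log(p_ij / (p_i * p_j))
    return val if val > 0 else 0.0
```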

tranmanhdat commented 4 years ago

> @yeshenpy
>
> Yes, your description is correct. I tried using pre-trained word embeddings and their average as word/doc node features, but this performs worse than using one-hot features with features = sp.identity(features.shape[0]). So I commented out the lines in build_graph.py that read word embeddings and initialize the word embedding dictionary.
>
> In fact, pre-trained word embeddings are never used and Text GCN learns predictive word embeddings as outputs.
>
> I am sorry for this confusion.

I read your code in build_graph.py and I see that the variable word_vector_map is an empty dictionary, so the initial embeddings for words and documents are all zero, or nearly zero (from -0.01 to 0.01). Is that right?

yao8839836 commented 4 years ago

@tranmanhdat Yes, you are right.
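
To round off the thread, a hedged sketch of the behavior confirmed here: with word_vector_map empty, every lookup misses, so each doc vector keeps its tiny random initialization. The init range and dimension mirror build_graph.py but are re-typed here as assumptions.

```python
# Hedged sketch: with word_vector_map empty, no word vector is ever added,
# so each doc vector keeps its tiny random initialization.
import numpy as np

word_vector_map = {}        # empty dictionary in the released code
word_embeddings_dim = 300   # assumed to match build_graph.py

doc_vec = np.random.uniform(-0.01, 0.01, word_embeddings_dim)
for word in ["some", "document", "tokens"]:
    if word in word_vector_map:                    # never true here
        doc_vec = doc_vec + np.array(word_vector_map[word])
# doc_vec stays within roughly (-0.01, 0.01): effectively near-zero features.
```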