yao8839836 / text_gcn

Graph Convolutional Networks for Text Classification. AAAI 2019

What does featureless mean? #115

Open shaoyangxu opened 3 years ago

shaoyangxu commented 3 years ago

Hello, I have read your code and roughly understand the idea: if both docs and words are one-hot vectors, then x (the feature matrix) is an identity matrix, so the first GCN layer is set to featureless=True, i.e. x does not need to participate in the computation (see the sketch after the questions below); the second layer's x is no longer an identity matrix but the activation (output) of the first layer, so it does participate. Is the reason you set it up this way that the initial x could actually be a dense matrix, e.g. with pretrained word vectors for the words and the docs having their own vector representations, such as the average of all the word vectors in the sentence, or something similar? My questions are:

  1. Is my understanding above correct?
  2. Have you tried pretrained word vectors? How did they perform?
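For reference, a minimal numpy sketch of the featureless=True shortcut described above: when the feature matrix x is the identity, A·x·W reduces to A·W, so the feature multiplication can be skipped. The dense matrices and shapes here are illustrative stand-ins for the repo's sparse TensorFlow implementation.

```python
import numpy as np

n_nodes, hidden = 5, 3
A = np.random.rand(n_nodes, n_nodes)   # stand-in for the normalized adjacency
W = np.random.rand(n_nodes, hidden)    # first-layer weight matrix
X = np.eye(n_nodes)                    # one-hot node features: an identity matrix

# Regular first layer: A @ X @ W.  Because X = I, the multiplication by X
# is a no-op, so featureless=True computes A @ W and skips it entirely.
out_regular = A @ X @ W
out_featureless = A @ W

assert np.allclose(out_regular, out_featureless)
```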
yao8839836 commented 3 years ago

@beiweixiaoxu

  1. Correct.
  2. I tried GloVe word vectors as the word-node features and the average of a document's word vectors as the doc-node features (a sketch of this setup follows below). The results were not very good.
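For illustration, a hedged sketch of the setup described in point 2, with toy data standing in for GloVe lookups (glove, vocab, and docs here are hypothetical placeholders, and the doc/word node ordering is simplified relative to the repo's build script):

```python
import numpy as np

dim = 50
vocab = ["graph", "text", "classification"]             # toy vocabulary
docs = [["graph", "text"], ["text", "classification"]]  # toy tokenized docs

# Hypothetical stand-in for a pretrained GloVe lookup: word -> vector.
glove = {w: np.random.rand(dim) for w in vocab}

# Word-node features: the pretrained vector of each word.
word_feats = np.stack([glove[w] for w in vocab])

# Doc-node features: the average of the vectors of the words in the document.
doc_feats = np.stack([np.mean([glove[w] for w in d], axis=0) for d in docs])

# This dense matrix would replace the identity features (and featureless=True).
X = np.vstack([doc_feats, word_feats])
```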
shaoyangxu commented 3 years ago

> @beiweixiaoxu
>
> 1. Correct.
> 2. I tried GloVe word vectors as the word-node features and the average of a document's word vectors as the doc-node features. The results were not very good.

Oh okay, thanks for the reply.

cocoiit commented 1 year ago

Hi, I am working on tweet classification related to healthcare. I tried BioBERT pre-trained embeddings for the vocab, with their average as the doc embeddings. Since most of the embedding values are negative (<= 0), activation functions like ReLU/LeakyReLU/ELU drive most of the activations to (near) zero. This makes the learned embedding vectors sparse and hence very hard to distinguish between classes later on. Is there any way I can utilize pre-trained embeddings of some type? Any suggestion would be appreciated!
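A toy reproduction of the effect described above, plus one possible workaround; the embedding statistics and the per-dimension standardization are illustrative assumptions, not something tested in this thread:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy embeddings whose entries are mostly negative, mimicking the issue.
emb = rng.normal(loc=-0.5, scale=0.3, size=(1000, 768))

relu = np.maximum(emb, 0.0)
print("fraction zeroed by ReLU:", np.mean(relu == 0.0))  # roughly 0.95 here

# A possible mitigation (an assumption, untested here): standardize each
# dimension before the first layer so roughly half the mass stays positive,
# or use an activation such as tanh that preserves negative information.
emb_std = (emb - emb.mean(axis=0)) / (emb.std(axis=0) + 1e-8)
print("after standardizing:", np.mean(np.maximum(emb_std, 0.0) == 0.0))  # ~0.5
```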