Closed: SOOJEONGKIMM closed this issue 2 years ago
Introduction: Transductive learning for the text classification task with a GNN approach. In the transductive setting, the test documents are already present (as unlabeled nodes) in the graph during training.
The BertGCN model successfully combines the strengths of large-scale pretraining and graph networks, and performs well on a wide range of text classification datasets.
Method
*TextGCN: the initial node feature matrix is X ∈ R^((n_doc + n_word) × d), where n_doc is the number of document nodes, n_word the number of word nodes, and d the embedding dimension.
The i-th GCN layer's output features are L^(i) = ρ(Ã L^(i−1) W_i), where ρ is the activation function, Ã the normalized adjacency matrix, W_i the layer's weight matrix, and L^(0) = X the model's input feature matrix.
The last layer's output is fed to a softmax for classification, and parameters are optimized with cross-entropy loss over the labeled document nodes.
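The layer update and classification head above can be sketched as follows (a minimal numpy sketch, not the paper's code; the toy sizes and the identity adjacency matrix are illustrative assumptions):

```python
import numpy as np

def gcn_layer(A_hat, L_prev, W):
    """One GCN layer: L^(i) = rho(A_hat @ L^(i-1) @ W_i), using ReLU as rho."""
    return np.maximum(A_hat @ L_prev @ W, 0.0)

def softmax(z):
    """Row-wise softmax over class logits."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
A_hat = np.eye(4)                       # stand-in for a normalized adjacency matrix A~
X = rng.standard_normal((4, 3))         # L^(0): input feature matrix, 4 nodes, d=3
W = rng.standard_normal((3, 2))         # layer weights, 2 output classes
probs = softmax(gcn_layer(A_hat, X, W))  # per-node class probabilities
```

In the real model, Ã is the symmetrically normalized adjacency of the heterogeneous document-word graph, and the cross-entropy loss is computed only on rows corresponding to labeled document nodes.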
Interpolating BERT and GCN predictions: the final prediction is Z = λ Z_GCN + (1 − λ) Z_BERT, where λ trades off the two objectives. The interpolation helps overcome GCN training drawbacks such as vanishing gradients and over-smoothing.
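The interpolation itself is a one-liner; here is a sketch assuming Z_GCN and Z_BERT are per-class probability (or logit) vectors for the same documents:

```python
import numpy as np

def interpolate(z_gcn, z_bert, lam=0.7):
    """Final prediction Z = lam * Z_GCN + (1 - lam) * Z_BERT.

    lam = 1 uses only the GCN prediction, lam = 0 only BERT's;
    the notes report the best result at lam = 0.7.
    """
    return lam * z_gcn + (1.0 - lam) * z_bert

z_gcn = np.array([0.9, 0.1])   # illustrative GCN class probabilities
z_bert = np.array([0.5, 0.5])  # illustrative BERT class probabilities
z = interpolate(z_gcn, z_bert, lam=0.7)  # -> [0.78, 0.22]
```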
Optimization using Memory Bank
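The memory bank lets the GCN propagate over the full graph while BERT only re-encodes the current mini-batch each step. A minimal sketch of the idea (class and method names are illustrative, not the paper's implementation):

```python
import numpy as np

class MemoryBank:
    """Caches BERT embeddings for every document node.

    Each training step, only the documents in the current batch are
    re-encoded by BERT and their cached rows overwritten; the GCN then
    reads the full (mostly stale) cache as its document node features.
    """

    def __init__(self, n_docs, dim):
        self.features = np.zeros((n_docs, dim))

    def update(self, doc_ids, batch_embeddings):
        # Overwrite only the rows for the documents in this batch.
        self.features[doc_ids] = batch_embeddings

bank = MemoryBank(n_docs=5, dim=4)
bank.update(np.array([1, 3]), np.ones((2, 4)))  # batch of docs 1 and 3
```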
Experiments: Experimental Setup
Main Results:
The Effect of lambda: lambda controls the trade-off between the GCN and BERT objectives; the model performs best at lambda = 0.7.
The Effect of Strategies in Joint Training: results on the 20NG dataset.
BertGCN strategies
Paper: https://aclanthology.org/2021.findings-acl.126.pdf (Findings of ACL-IJCNLP 2021)