Open · LangDaoAI opened this issue 3 years ago
Thanks for your interest in this work. There are two main reasons why we use bilstm+bert rather than only bert:
- We formulate both Aspect Term Extraction (ATE) and Target-oriented Opinion Words Extraction (TOWE) as sequence labeling problems. The bilstm+bert encoder has been widely used in neural sequence labeling models, so adopting bilstm+bert rather than only bert helps show that the good experimental results come from our contributions rather than from the encoder. To keep our method simple, we also use bilstm+bert for the Aspect-Opinion Pair Sentiment Classification (AOPSC) task. (A minimal encoder sketch is included at the end of this comment.)
- We think models using bilstm+bert and models using only bert achieve similar performance on the Aspect Sentiment Opinion Triplet Extraction (ASOTE) task. In more detail:
(1) In our initial experiments on the ATE task, the model using bilstm+bert obtains better performance than the model using only bert on all four datasets.
(2) On the Target-oriented Opinion Words Extraction (TOWE) task, the model using bilstm+bert obtains better performance on 15res and 16res, while the model using only bert obtains better performance on 14res and 14lap. (These ATE and TOWE experiments were performed on the datasets released by *Knowing What, How and Why: A Near Complete Solution for Aspect-Based Sentiment Analysis*, but we corrected the issue in those datasets where an opinion is paired with at most one aspect.)
(3) We did not run the bilstm+bert vs. only bert comparison on the Aspect-Opinion Pair Sentiment Classification (AOPSC) task. However, the AOPSC model with a BiLSTM unit size of 300 outperforms the AOPSC model with a BiLSTM unit size of 768, so the AOPSC model using only bert may surpass the one using bilstm+bert.
On the whole, models using bilstm+bert do not surpass models using only bert on all subtasks, and vice versa, so we simply select bilstm+bert, which is the more common choice.
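For concreteness, here is a minimal sketch of what a bilstm+bert token-labeling encoder can look like. This is not the code from this repository: it uses PyTorch and the HuggingFace `transformers` library, and the class name, the `use_bilstm` flag, and the 300-unit default are assumptions made purely for illustration.

```python
# Hedged sketch (not this repository's actual code): a BERT encoder with an
# optional BiLSTM on top, followed by a token-level classifier, as typically
# used for sequence labeling tasks such as ATE and TOWE.
import torch
import torch.nn as nn
from transformers import AutoModel

class BertBiLSTMTagger(nn.Module):
    def __init__(self, num_labels: int, use_bilstm: bool = True,
                 lstm_units: int = 300, bert_name: str = "bert-base-uncased"):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size  # 768 for bert-base
        self.use_bilstm = use_bilstm
        if use_bilstm:
            # BiLSTM stacked on top of BERT; output dimension is 2 * lstm_units.
            self.bilstm = nn.LSTM(hidden, lstm_units, batch_first=True,
                                  bidirectional=True)
            out_dim = 2 * lstm_units
        else:
            out_dim = hidden
        # Token-level classifier (e.g., BIO tags for aspect/opinion spans).
        self.classifier = nn.Linear(out_dim, num_labels)

    def forward(self, input_ids, attention_mask):
        # Contextual token representations from BERT.
        x = self.bert(input_ids=input_ids,
                      attention_mask=attention_mask).last_hidden_state
        if self.use_bilstm:
            x, _ = self.bilstm(x)
        return self.classifier(x)  # (batch, seq_len, num_labels)
```

Setting `use_bilstm=False` corresponds to the only-bert baseline, and changing `lstm_units` from 300 to 768 mirrors the unit-size comparison mentioned in point (3).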
Best Regards, Yuncong Li
Thanks for the detailed explanation, Li!
Hi,
Firstly, thanks for releasing the code! Secondly, I want to know why you use bilstm+bert rather than only bert?
Thanks!