RAISEDAL / RAISEReadingList

This repository contains a reading list of Software Engineering papers and articles!

Paper Review: Sentiment Analysis for Software Engineering: How Far Can Pre-trained Transformer Models Go? #40

Open wahid-shuvo opened 2 years ago

wahid-shuvo commented 2 years ago

Publisher

2020 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Link to The Paper

http://www.mysmu.edu/faculty/lxjiang/papers/icsme20SA4SE.pdf

Name of The Authors

Ting Zhang, Bowen Xu, Ferdian Thung, Stefanus Agus Haryono, David Lo, Lingxiao Jiang

Year of Publication

2020

Summary

In this paper, the authors conduct an extensive comparative study of existing sentiment analysis tools and pre-trained transformer models (PTMs) in the software engineering domain. Specifically, they compare the performance of Stanford CoreNLP, SentiStrength, SentiStrength-SE, SentiCR, and Senti4SD against BERT, RoBERTa, XLNet, and ALBERT on six popular software engineering datasets. They follow the same train-test split (70% for training, 30% for testing) as the work of Nicole Novielli and further train the pre-trained models to obtain fine-tuned models. Among the existing sentiment analysis tools, they only re-train SentiCR and assess its performance for the comparative analysis. They keep the default parameter settings for both the existing tools and the transformer-based models, and compare performance using macro-averaged and micro-averaged F1 scores.

Among the prior sentiment analysis tools, this work finds that SentiCR performs best on five of the six datasets (all except Stack Overflow), whereas Stanford CoreNLP performs worst. Among the PTMs, RoBERTa achieves the highest performance on four datasets, while ALBERT performs worst. Interestingly, they observe that the pre-trained transformer-based models outperform the existing sentiment analysis tools by 6.5% to 35.6% in terms of the selected evaluation metrics.

Furthermore, they assess the efficiency of the PTM-based sentiment analysis approaches. They find that training (fine-tuning) is more expensive than prediction: the time cost for fine-tuning the Transformer models ranges from 15 seconds to 10 minutes, depending on the dataset. In terms of prediction time, all approaches classify up to hundreds of text units (documents) within seconds. The Transformer models need less than 50% of the prediction time of Senti4SD and Stanford CoreNLP, but about twice the time required by SentiCR, SentiStrength, and SentiStrength-SE.
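
As a rough illustration of the evaluation protocol summarized above, the sketch below shows a 70/30 train-test split and the macro-averaged and micro-averaged F1 metrics computed with scikit-learn. The toy texts, labels, and the TF-IDF + logistic regression classifier are placeholders of my own; the paper instead fine-tunes the transformer models on the six SE datasets.

```python
# Minimal sketch (assumed, not the authors' code): a 70/30 split and
# macro-/micro-averaged F1 evaluation, using scikit-learn.
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Toy SE-style texts with positive (1), neutral (0), and negative (-1) labels.
texts = [
    "great fix, thanks a lot!", "works perfectly now", "love this API", "nice refactor",
    "this build is fine", "updated the docs", "moved the file to src",
    "this crashes constantly", "terrible error messages", "the patch broke everything",
]
labels = [1, 1, 1, 1, 0, 0, 0, -1, -1, -1]

# 70% train / 30% test, mirroring the split the authors reuse from Novielli's work.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.3, stratify=labels, random_state=42
)

# Stand-in classifier; the paper fine-tunes BERT, RoBERTa, XLNet, and ALBERT instead.
vectorizer = TfidfVectorizer()
clf = LogisticRegression(max_iter=1000)
clf.fit(vectorizer.fit_transform(X_train), y_train)
preds = clf.predict(vectorizer.transform(X_test))

# Macro-avg F1 weights every class equally; micro-avg F1 aggregates over all samples.
print("macro-F1:", f1_score(y_test, preds, average="macro"))
print("micro-F1:", f1_score(y_test, preds, average="micro"))
```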

Contributions of The Paper

Comments

No response