Open timm opened 5 years ago
(a) Labelling is expensive. As shown in this paper, it can take months to complete and cost hundreds of dollars. For this reason, researchers rarely relabel other data sets, which means that any errors in the prior labels cascade through the community.
(b) State of the art: for the most part, a 2002 keyword labelling method, which can perform badly on new data sets.
State-of-the-art labelling papers from the reviewers:
Wu, Rongxin, et al. "ReLink: recovering links between bugs and changes." Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering. ACM, 2011.
Tian, Yuan, Julia Lawall, and David Lo. "Identifying Linux bug fixing patches." Proceedings of the 34th International Conference on Software Engineering. IEEE Press, 2012.
(c) Using incremental active learning, we can do the labelling in a cost-effective manner and verify the labelling along the way. And, as we show, our labelling significantly improves predictive performance.
(d) RQs
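
To make points (b) and (c) concrete, here is a minimal sketch contrasting a keyword labeller with an incremental active-learning loop. Everything in it is an assumption made for illustration: the keyword list, the sample commit messages, the word-count scoring, and the names `keyword_label` and `active_label` are hypothetical, and none of it is the 2002 method or the method of the paper.

```python
import re

# Hypothetical keyword list (an assumption; the 2002 method's keywords are not given here).
BUG_KEYWORDS = {"bug", "fix", "error", "fail", "crash"}

def tokens(message):
    """Lowercase word tokens of a commit message."""
    return re.findall(r"[a-z]+", message.lower())

def keyword_label(message):
    """Baseline: call a commit bug-fixing if any keyword appears as a whole word."""
    return any(t in BUG_KEYWORDS for t in tokens(message))

def active_label(messages, oracle, budget):
    """Incremental labelling: repeatedly ask the oracle (a human) about the
    message the current model is least sure of, then update the model."""
    weights = {}   # word -> (times seen in bug fixes) - (times seen elsewhere)
    labelled = {}  # message -> label given by the oracle
    pool = list(messages)
    for _ in range(min(budget, len(pool))):
        # Uncertainty sampling: the smallest absolute score is the least certain.
        pick = min(pool, key=lambda m: abs(sum(weights.get(t, 0) for t in tokens(m))))
        pool.remove(pick)
        label = oracle(pick)
        labelled[pick] = label
        for t in set(tokens(pick)):
            weights[t] = weights.get(t, 0) + (1 if label else -1)
    return labelled, weights

# Made-up sample data; the second bug fix contains none of the keywords.
commits = [
    "fix crash when file is missing",
    "add prefix option to parser",
    "resolve null pointer in login",
    "update README",
]
true_fixes = {"fix crash when file is missing", "resolve null pointer in login"}

def human(message):
    """Stand-in for a human labeller."""
    return message in true_fixes

labelled, weights = active_label(commits, human, budget=3)
```

On this toy data the keyword labeller misses "resolve null pointer in login", the kind of miss that a new data set with different vocabulary can produce. The loop spends its labelling budget one query at a time, so the labels collected so far can be checked after every step.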
(a) State the problem to be solved.
(b) Discuss the state of the art (i.e., previous work) and explain why, despite/because of this literature, there remains: (i) confusion; (ii) misunderstanding; (iii) errors; or (iv) some unresolved problem. Alternatively, present an empirical puzzle that the existing literature fails to explain.
(c) State the essence of your contribution, that is, your solution to the problem or puzzle. Give the reader a sense of how you will solve the problem; provide some confidence that if she reads the rest of your paper, she has a chance of learning something.
(d) The last paragraph of your introduction should always be a "road map" paragraph; for example: "This paper proceeds as follows. In section 1 ..."