Reviewer 2 - Githubissues

WeiFoo commented 9 years ago

[x] Q2: details are missing. Was it the same data? How was it conducted?
[x] Q2: the numbers in fig 5 and 7 are contradicting...
- actually, this reviewer didn't read it carefully: the two figures are for different tuning goals, so the numbers should not be the same for the same data set.
[x] Experiments: A fair comparison would use training set and tuning set as the new training set for the non-tuning version.
- we have discussions in reliability and validity
[x] Using an incremental learning approach is fine. One concern is that the time between two versions is not mentioned. If the tuned version is very close in time to the test version, but there is a large gap between the training and the test version, the evaluation could be biased toward the tuned algorithms. Simply mentioning the time gap between the different versions would help alleviate this issue. PARDON??
[x] Tables are called Figures, which is a bit confusing.
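
A minimal sketch of how the two experimental points above (the fair-comparison request and the time-gap request) could be set up. The function names, toy rows, and release dates are all made up for illustration; none of this is taken from the paper's scripts or data sets.

```python
from datetime import date

def make_splits(versions):
    """versions: three data sets (lists of rows), oldest release first."""
    train, tune, test = versions
    return {
        # Tuned learner: fit on train, pick parameters on tune, score on test.
        "tuned": {"train": train, "tune": tune, "test": test},
        # Fair untuned baseline: it gets the same rows, merged into one training set.
        "untuned": {"train": train + tune, "tune": [], "test": test},
    }

def release_gaps(release_dates):
    """Days between consecutive releases, oldest first."""
    return [(later - earlier).days
            for earlier, later in zip(release_dates, release_dates[1:])]

# Placeholder rows and dates, invented for this example only.
v1, v2, v3 = [("a", 0), ("b", 1)], [("c", 0)], [("d", 1)]
splits = make_splits([v1, v2, v3])
gaps = release_gaps([date(2000, 1, 1), date(2000, 7, 1), date(2000, 8, 1)])
print(len(splits["untuned"]["train"]), gaps)
```

Reporting the gaps next to each train/tune/test triple would let a reader judge whether the tuning version sits much closer in time to the test version than the training version does.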

timm commented 9 years ago

most of the reviewer's stuff is stylistic. the real stumbling block was this:

[x] RQ2 is a great question in making it useful by pointing out problems in existing published studies. However, RQ2 is only conducted against one paper [27], and even this one study has data and presentation issues (see below). At best, we have evidence on one paper. So with so little evidence, is the evidence strong enough for publication?
a VERY highly cited paper, to say the least
representative of a broad class of papers... see the tracy hall paper

Still, do we need to refute a SECOND paper (e.g. the tracy hall paper http://dl.acm.org/citation.cfm?id=2420790)? From that paper I read the following:

"Overall our comparative analysis suggests that studies using Support Vector Machine (SVM) techniques perform less well. ... Models based on C4.5 seem to underperform if they use imbalanced data (e.g. Arisholm et al [[8]] and [[9]]), as the technique seems to be sensitive to this. Our comparative analysis also suggests that the models performing comparatively well are relatively simple techniques that are easy to use and well understood. Naïve Bayes and Logistic regression, in particular, seem to be the techniques used in models that are performing relatively well. "

[ ] can we tune SVM to out-perform others?
[ ] can we make learners other than C4.5 do well on imbalanced data?
[ ] can we make more complex learners outperform Naïve Bayes and Logistic regression?

[[8]] E. Arisholm, L. C. Briand, and M. Fuglerud, “Data mining techniques for building fault-proneness models in telecom java software,” in Software Reliability, 2007. ISSRE ’07. The 18th IEEE International Symposium on, Nov. 2007, pp. 215–224. (Paper=8, Status=P)

[[9]] E. Arisholm, L. C. Briand, and E. B. Johannessen, “A systematic and comprehensive investigation of methods to build and evaluate fault prediction models,” Journal of Systems and Software, vol. 83, no. 1, pp. 2–17, 2010. (Paper=9, Status=P)

WeiFoo commented 9 years ago

George (@bigfatnoob), did you do some tuning on SVM for effort estimation? Were there any improvements, or do you have any comments on it?

bigfatnoob commented 9 years ago

@WeiFoo Here is the link for my results on tuning SVM https://github.com/ai-se/x-effort/blob/master/Reports/05-07-15/Evals.md

The ones prefixed with "t_" are tuned.

Tuning helps improve SVM since changing a kernel drastically changes results.
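
A toy illustration of why the kernel is worth tuning. This is not the x-effort code and not an SVM: it swaps in a tiny kernel perceptron (standard library only) so the example stays self-contained, and the XOR-style data and `gamma=2.0` are made-up choices. The only thing that changes between the two runs is the kernel.

```python
import math

def linear(a, b):
    # Dot product plus 1, so the model also learns a bias term.
    return sum(p * q for p, q in zip(a, b)) + 1.0

def rbf(a, b, gamma=2.0):
    # Gaussian kernel; gamma is an arbitrary illustrative value.
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def fit(X, y, kernel, epochs=50):
    """Kernel perceptron: alpha[i] counts the mistakes made on row i."""
    alpha = [0] * len(X)

    def score(x):
        return sum(a * t * kernel(z, x) for a, t, z in zip(alpha, y, X))

    for _ in range(epochs):
        mistakes = 0
        for i, (x, t) in enumerate(zip(X, y)):
            if t * score(x) <= 0:  # wrong side (or on the boundary)
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training row is classified correctly
            break
    return lambda x: 1 if score(x) > 0 else -1

def accuracy(X, y, kernel):
    predict = fit(X, y, kernel)
    return sum(predict(x) == t for x, t in zip(X, y)) / len(X)

# XOR-style toy data: not separable by any straight line.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, 1, 1, -1]
results = {k.__name__: accuracy(X, y, k) for k in (linear, rbf)}
print(results)
```

On this data the linear kernel cannot classify all four points, while the RBF kernel can, so treating the kernel as one of the tuned parameters can change the outcome completely.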

WeiFoo commented 9 years ago

thanks !! @bigfatnoob

timm commented 9 years ago

@WeiFoo we can't use @bigfatnoob's results in this context since the hall paper is about defect prediction, not effort estimation

t

WeiFoo commented 9 years ago

Yes, I know. I want to get some sense from George's result. I will do it.

On Aug 4, 2015, at 14:22, Tim Menzies notifications@github.com wrote:

@WeiFoo we can't use @bigfatnoob's results in this context since the hall paper is about defect prediction, not effort estimation

t

WeiFoo commented 9 years ago

please see #15 about naive bayes and logistic regression

ai-se / tunelearners

Reviewer 2 #4