Grayscale version:
Not good.
Yeah, that is unreadable. Possible to use different line styles?
13 lines are hard to display in a single graph, not to mention we actually have 13 median lines + 13 IQR lines. Planning to shelve this problem until the first draft of this paper is finished.
Separate into three groups, A/B/C.
Show all the IQRs together on a separate graph.
Within each group, one graph.
Then how do I compare the performance of each group?
They will all have the same x-axis.
They will be shown in the paper one under the other, as sketched below.
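For reference, a minimal matplotlib sketch of that layout: one subplot per group, stacked and sharing an x-axis, with distinct line styles so the grayscale stays readable. The group assignments and curves here are placeholders, not the real results.

```python
import numpy as np
import matplotlib.pyplot as plt

groups = {
    "Group A": ["P_U_S_A", "P_U_S_N", "P_U_C_A"],  # placeholder membership
    "Group B": ["P_C_C_A", "H_U_S_A", "H_U_C_A"],  # placeholder membership
    "Group C": ["H_C_C_A"],                        # placeholder membership
}
styles = ["-", "--", "-.", ":"]  # distinct line styles keep grayscale readable

fig, axes = plt.subplots(len(groups), 1, sharex=True, figsize=(6, 8))
x = np.arange(0, 101)  # e.g. % of candidate studies reviewed
for ax, (name, treatments) in zip(axes, groups.items()):
    for i, t in enumerate(treatments):
        y = np.random.rand(len(x)).cumsum()  # placeholder median curve
        ax.plot(x, y, styles[i % len(styles)], color="black", label=t)
    ax.set_title(name)
    ax.legend(fontsize=8)
axes[-1].set_xlabel("studies reviewed (%)")
plt.tight_layout()
plt.show()
```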
Got it. The first draft of "How to read less" is ready on sharelatex. Will work on the figures soon.
Hall Result
Hall, Tracy, Sarah Beecham, David Bowes, David Gray, and Steve Counsell. "A systematic literature review on fault prediction performance in software engineering." IEEE Transactions on Software Engineering 38, no. 6 (2012): 1276-1304.
Wahono Result
Wahono, Romi Satria. "A systematic literature review of software defect prediction: research trends, datasets, methods and frameworks." Journal of Software Engineering 1, no. 1 (2015): 1-16.
Comparisons for each code:
Start with patient active learning (P_U_S_A). First compare the last code: P_U_S_A vs. P_U_S_N.
P_U_S_A wins.
For the last code, A is better than N (aggressive undersampling is useful; a sketch of what that means follows).
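A minimal sketch of aggressive undersampling, assuming the usual formulation (train an SVM, discard the "non-relevant" training examples closest to the decision plane until the classes balance, then retrain). The helper name and details are illustrative, not the exact implementation used here.

```python
import numpy as np
from sklearn.svm import LinearSVC

def aggressive_undersample(X, y):
    """y: 1 = "relevant", 0 = "non-relevant" (the majority class)."""
    model = LinearSVC().fit(X, y)
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    # keep only the non-relevant examples FARTHEST from the decision plane
    # (most negative margin), as many as there are relevant examples
    dist = model.decision_function(X[neg])
    keep = neg[np.argsort(dist)[:len(pos)]]
    idx = np.concatenate([pos, keep])
    return LinearSVC().fit(X[idx], y[idx])
```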
Compare the third code: P_U_S_A vs. P_U_C_A.
No clear winner. I would prefer C over S, since continuous learning can handle concept drift better.
But let's keep both.
Compare the second code: P_U_S_A vs. P_U_C_A vs. P_C_C_A.
No clear winner. I prefer C over U, since there is no need to worry about a stop rule for U (the margin threshold of the SVM); both strategies are sketched below.
But let's keep all three.
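A minimal sketch of the two query strategies being compared, assuming U = uncertainty sampling (query the unlabeled study closest to the SVM decision plane, which needs a stop rule such as the margin threshold mentioned above) and C = certainty sampling (query the study the model is most confident is "relevant"). Illustrative only.

```python
import numpy as np

def query(model, X_unlabeled, strategy="U"):
    """model: any fitted classifier exposing decision_function, e.g. a linear SVM."""
    margins = model.decision_function(X_unlabeled)  # signed distance to plane
    if strategy == "U":
        # uncertainty sampling: ask about the study closest to the plane;
        # needs a stop rule, e.g. stop once min |margin| exceeds a threshold
        return int(np.argmin(np.abs(margins)))
    # certainty sampling: ask about the study most confidently "relevant"
    return int(np.argmax(margins))
```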
Compare the first code: P_U_S_A vs. P_U_C_A vs. P_C_C_A vs. H_U_S_A vs. H_U_C_A vs. H_C_C_A.
H is better than P.
But H_C_C_A is a clear loser.
Starting aggressive undersampling with only one "relevant" example is a bad idea.
The final winners would be H_U_S_A and H_U_C_A.
I would prefer H_U_C_A, since continuous learning handles concept drift and the updating of SLRs better.
Email SLR authors
How much effort does primary study selection cost? (N reviewers, T time for each)
It would be better if details of each step could be provided:
Is there any effort (a hidden step) between applying the search string to the databases and collecting the initial candidate study list?
Why ask: I retrieve many more candidate studies with the same search string provided in the SLR paper.
If there is such a hidden step, what is it? How much does it cost? What is the reason behind it?
One reason, I guess, is to reduce the size of the initial candidate study list and thus the review cost of primary study selection. If this is true, learning-based primary study selection can remove this hidden step, since it can search a much larger candidate study list and still retrieve above 90% of the "relevant" studies with less effort. This may even improve overall completeness while saving the effort of this hidden step.