ai-se / BUBBLE_TSE

BUBBLE for TSE submission

Reviewer_1, Q5: The clarity and replicability of the paper are unclear #6

Open Suvodeep90 opened 4 years ago

Suvodeep90 commented 4 years ago

The clarity and replicability of the paper are unclear. Here are some examples:

1) "The logistic regression learner (since it is relatively fast)". What does "fast" mean? How was "fast" computed?

2) "The SMOTE class imbalance correction algorithm [49], which we run on the training data2". Why is SMOTE needed? Why is the problem unbalanced? Please, explain.

3) "and Hall’s CFS feature selector". Which features have been removed and why?

4) "As to CFS, we found that without it, our recalls were very low and we could not identify which metrics mattered the most". Where can the reader understand this statement? Where is the replication data? 

5) "extensive studies have found that CFS more useful than many other feature subset selection methods such as PCA or InfoGain or RELIEF". The paper cites just one paper, how should these studies be "extensive"?

6) "Maximize recall and precision and popt(20)": Why should these metrics be maximized? What is the practical value, e.g., is high recall always needed in practice, or should we prefer precision in the problems treated by the bellwether? What about popt(20)?

7) "While minimizing false alarms and ifa auc": Again, please explain the practical relevance of the performance metrics employed. 

8) More in general, how can one replicate the performed study? Is there a replication package? If so, where?
Suvodeep90 commented 4 years ago

"The logistic regression learner (since it is relatively fast)". What does "fast" mean? How was "fast" computed? 1) Here "fast" means that, on the same data, logistic regression takes less time to build a predictor than other ML models such as random forest or support vector machines. 2) Need to cite that LR is a learner of choice for many defect prediction and transfer learning approaches.
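
A minimal sketch of how "fast" could be quantified, assuming a scikit-learn workflow; the synthetic dataset and the choice of comparison learner are placeholders, not the paper's actual setup:

```python
# Sketch: one way to quantify "fast" -- wall-clock training time of
# logistic regression vs. a heavier learner on the same training data.
# Synthetic data stands in for the real defect datasets.
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

for name, learner in [("logistic regression", LogisticRegression(max_iter=1000)),
                      ("random forest", RandomForestClassifier(n_estimators=100))]:
    start = time.perf_counter()
    learner.fit(X, y)
    elapsed = time.perf_counter() - start
    print(f"{name}: trained in {elapsed:.3f} s")
```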

Suvodeep90 commented 4 years ago

"The SMOTE class imbalance correction algorithm [49], which we run on the training data2". Why is SMOTE needed? Why is the problem unbalanced? Please, explain. 1) Smote is needed as the datasets are imbalanced. SMOTE is a *** technique to handle data imbalance problems. 2) The dataset is imbalanced as the collected projects either had too little defect or too many defective modules. This can result in a high pf or low recall, but if we use SMOTE to balance the dataset, then there is a chance of better model being created. cite amrut's better data than bettwe data miner paper and other papers in SE/defect prediction to show SMOTE helps.

Suvodeep90 commented 4 years ago

"and Hall’s CFS feature selector". Which features have been removed and why? explain CFS and the features which have been removed are the ones which don't increase the correlation-based subset evaluation function used in CFS. This means the attributes which are removed are the ones with the most correlation with the class variable.

Suvodeep90 commented 4 years ago

"As to CFS, we found that without it, our recalls were very low and we could not identify which metrics mattered the most". Where can the reader understand this statement? Where is the replication data?

Not sure what is meant by "Where can the reader understand this statement?", but the replication package with the data has been included and is mentioned in the contribution section.

Suvodeep90 commented 4 years ago

"extensive studies have found that CFS more useful than many other feature subset selection methods such as PCA or InfoGain or RELIEF". The paper cites just one paper, how should these studies be "extensive"? Need to include more citations.

Suvodeep90 commented 4 years ago

"Maximize recall and precision and popt(20)": Why should these metrics be maximized? What is the practical value, e.g., is high recall always needed in practice, or should we prefer precision in the problems treated by the bellwether? What about popt(20)?

Need to include the definitions of recall, precision, and popt(20) and explain their practical value, i.e., why maximizing them leads to a better model in this setting.
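
A small sketch of recall, precision, and a simplified effort-based proxy for popt(20) (the fraction of defects found in the first 20% of the code when modules are ranked by predicted defect-proneness, not the fully normalized Popt curve used in the paper); all labels, scores, and LOC values below are toy numbers:

```python
# Sketch: the "maximize" metrics on toy data.
import numpy as np
from sklearn.metrics import precision_score, recall_score

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])                    # actual defect labels
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])                    # predicted labels
loc    = np.array([100, 400, 50, 300, 200, 80, 60, 150])       # module sizes (LOC)
score  = np.array([0.9, 0.2, 0.8, 0.4, 0.1, 0.7, 0.6, 0.3])    # predicted scores

print("recall:   ", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)

order = np.argsort(-score)                    # inspect most suspicious modules first
effort = np.cumsum(loc[order]) / loc.sum()    # cumulative fraction of LOC inspected
found = np.cumsum(y_true[order]) / y_true.sum()
print("defects found at 20% effort:", np.max(found[effort <= 0.20], initial=0.0))
```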

Suvodeep90 commented 4 years ago

"While minimizing false alarms and ifa auc": Again, please explain the practical relevance of the performance metrics employed.

Need to include the definitions of false alarms and ifa auc and explain their practical relevance, i.e., why minimizing them leads to a better model in this setting.
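
A small sketch of these metrics on the same kind of toy data: pf (false alarm rate) and IFA (number of false alarms encountered before the first real defect when modules are ranked by predicted score); ROC AUC is shown only in case "auc" here refers to the usual curve area:

```python
# Sketch: the "minimize" metrics on toy data.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true  = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred  = np.array([1, 0, 1, 0, 0, 1, 1, 0])
y_score = np.array([0.9, 0.2, 0.8, 0.4, 0.1, 0.7, 0.6, 0.3])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
pf = fp / (fp + tn)                        # clean modules wrongly flagged as buggy
print("pf:", pf)

order = np.argsort(-y_score)               # inspect most suspicious modules first
ifa = int(np.argmax(y_true[order] == 1))   # false alarms before the first real defect
print("IFA:", ifa)

print("AUC:", roc_auc_score(y_true, y_score))  # standard ROC curve area
```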

Suvodeep90 commented 4 years ago

More in general, how can one replicate the performed study? Is there a replication package? If so, where? More details about the method need to be added, and the replication package is pointed to in the contribution section.