We want to review the Wahono data set and retrieve 90% of its relevant studies. Both sets (Hall and Wahono) target defect prediction but address different research questions (RQs), so their review protocols differ.
Review protocol for Hall:
Review protocol for Wahono:
What?
REUSE: Import only the learned model from Hall and featurize on Wahono alone. Use the imported Hall model in place of random sampling for the initial selection, then start learning a new model on Wahono only.
UPDATE (Partial): Import only the labeled data from Hall, combine it with Wahono, and re-featurize. It is a partial UPDATE because importing only the labeled data saves memory without hurting performance, as discussed in #31.
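The difference between the two strategies can be sketched with a toy example. This is only an illustration under stated assumptions: the documents and labels are made up, the names `tokenize`, `train`, and `rank` are hypothetical, and the term-count model is a stand-in for whatever learner is actually used in the experiments.

```python
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train(texts, labels):
    """Toy model (assumption): weight of a term = number of relevant
    documents containing it minus number of irrelevant ones."""
    weights = Counter()
    for text, relevant in zip(texts, labels):
        for term in set(tokenize(text)):
            weights[term] += 1 if relevant else -1
    return weights

def rank(weights, texts):
    """Indices of texts, highest score first (ties keep original order)."""
    scores = [sum(weights[t] for t in tokenize(text)) for text in texts]
    return sorted(range(len(texts)), key=lambda i: -scores[i])

# Hypothetical stand-ins for the two data sets.
hall_texts = ["defect prediction using svm", "cloud storage pricing"]
hall_labels = [True, False]
wahono_texts = [
    "survey of gui design",
    "software defect prediction framework",
    "cloud cost survey",
]

# REUSE: the imported Hall model only replaces random sampling,
# i.e. it chooses which Wahono document to review first.
hall_model = train(hall_texts, hall_labels)
first_pick = rank(hall_model, wahono_texts)[0]

# Suppose the reviewer labels that document as relevant. From here on,
# REUSE learns from Wahono labels only; Hall terms do not carry over.
wahono_labeled = [wahono_texts[first_pick]]
wahono_labels = [True]
reuse_model = train(wahono_labeled, wahono_labels)

# UPDATE (partial): Hall's labeled data is combined with the Wahono
# labels and the model is rebuilt over the union of both vocabularies.
update_model = train(hall_texts + wahono_labeled, hall_labels + wahono_labels)
```

In this sketch the REUSE model carries no weight for terms seen only in Hall (such as "svm"), while the UPDATE model keeps them, which is the trade-off the "Why?" question below is about.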
Why?
The target is no longer the same because the review protocols differ. We may therefore want to build the model purely on the new data set, in which case applying UPDATE may be a bad idea.
Result
Conclusion
UPDATE actually performs better than REUSE.
Is this because of the data sets? Are there data sets where UPDATE performs badly while REUSE performs about the same?