ai-se / ML-assisted-SLR

Automated Systematic Literature Review

Result Summary #22

Closed azhe825 closed 7 years ago

azhe825 commented 7 years ago

Hall Result

Hall, Tracy, Sarah Beecham, David Bowes, David Gray, and Steve Counsell. "A Systematic Literature Review on Fault Prediction Performance in Software Engineering."

              Hall Paper   IEEE Xplore
Initial Size  2073         8912
Final Size    136          106

Wahono Result

Wahono, Romi Satria. "A systematic literature review of software defect prediction: research trends, datasets, methods and frameworks." Journal of Software Engineering 1, no. 1 (2015): 1-16.

              Wahono Paper   IEEE Xplore
Initial Size  2117           7002
Final Size    71             62
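
For reference, the share of relevant papers implied by the sizes above can be worked out directly. This is a minimal sketch, assuming "Initial Size" is the number of candidate papers and "Final Size" is the number of those on the final inclusion list. The dictionary keys and the `prevalence` helper are names made up for this illustration.

```python
# Sizes copied from the Hall and Wahono results above: (initial, final).
# Assumption: initial = candidate papers, final = papers on the inclusion list.
SIZES = {
    ("Hall", "paper"): (2073, 136),
    ("Hall", "IEEE Xplore"): (8912, 106),
    ("Wahono", "paper"): (2117, 71),
    ("Wahono", "IEEE Xplore"): (7002, 62),
}


def prevalence(initial, final):
    """Fraction of candidate papers that ended up on the inclusion list."""
    return final / initial


for (slr, source), (initial, final) in SIZES.items():
    print(f"{slr:7s} {source:12s} {prevalence(initial, final):.2%}")
```

Under these assumptions fewer than 7% of the candidates are relevant in every data set, which is why the data balancing step matters.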

Method Code

Stage 1: Random sampling

Stage 2: Build classifier

Stage 3: Prediction

Data Balancing

Baseline from Medicine #17

Baseline from Litigation #16

H_U_C_A (hasty, uncertainty sampling, continuous, aggressive undersampling): hasty and continuous are suggested by the litigation baseline; uncertainty sampling and aggressive undersampling are suggested by the medicine baseline.
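
The "U" and "A" parts can be illustrated with a small sketch. This is not the code behind the results above. It assumes a classifier already returns a signed distance from its decision boundary for each document, and it assumes aggressive undersampling means dropping the irrelevant training examples nearest that boundary until the classes are balanced. The function names and the `scores` input format are hypothetical.

```python
def uncertainty_sample(scores, batch_size=10):
    """Pick the unlabeled documents the classifier is least sure about.

    scores: dict mapping doc id -> signed distance from the decision
    boundary (positive = predicted relevant). Hypothetical input format.
    """
    ranked = sorted(scores, key=lambda doc: abs(scores[doc]))
    return ranked[:batch_size]


def aggressive_undersample(relevant, irrelevant, scores):
    """Balance the training set by dropping irrelevant examples.

    Assumption: the irrelevant examples closest to the decision boundary
    are dropped first, until both classes have the same size.
    """
    if len(irrelevant) <= len(relevant):
        return list(relevant), list(irrelevant)
    # Keep the irrelevant examples that lie furthest from the boundary.
    kept = sorted(irrelevant, key=lambda doc: abs(scores[doc]), reverse=True)
    return list(relevant), kept[:len(relevant)]


# Toy scores, invented for illustration only.
scores = {"a": 0.1, "b": -2.0, "c": 0.9, "d": -0.05, "e": -1.2}
next_to_review = uncertainty_sample(scores, batch_size=2)
train_pos, train_neg = aggressive_undersample(["c"], ["b", "d", "e"], scores)
```

In a continuous loop, the reviewer would label `next_to_review`, the classifier would be retrained on the balanced set, and the two steps would repeat.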

timm commented 7 years ago

please change x-axis to #documents

please copy this to nicholas kraft

azhe825 commented 7 years ago

How do I copy this to Dr. Kraft?

nkraft commented 7 years ago

I have joined the repo now!

timm commented 7 years ago

@nkraft : TL;DR

@azhe825 has:

then he took two large SE SLRs and asked "how many papers would i have to read to find the papers that those studies found 'relevant'".
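
That question can be written down as a small function. This is a minimal sketch, assuming the review order is a list of paper ids and the "relevant" set is the inclusion list of the original SLR. The function name and the `target_recall` parameter are hypothetical.

```python
def reads_to_reach_recall(review_order, relevant, target_recall=0.9):
    """Count the papers read, in the given order, until target_recall of
    the known relevant papers have been found.

    Returns None if the order never reaches the target.
    """
    relevant = set(relevant)
    if not relevant:
        raise ValueError("need at least one relevant paper")
    found = 0
    for n_read, paper in enumerate(review_order, start=1):
        if paper in relevant:
            found += 1
        if found / len(relevant) >= target_recall:
            return n_read
    return None


# Toy example with invented paper ids.
order = ["p1", "p2", "p3", "p4", "p5"]
print(reads_to_reach_recall(order, {"p2", "p4"}, target_recall=1.0))
```

Comparing this count for a classifier-driven order against a random order gives the saving in reading effort.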

so i think this is publishable as is but as to next steps....

or that's the idea anyway. will it work? well.......

nkraft commented 7 years ago

Very interesting.

First reaction to your next steps: I wonder about expertise vs. cost. MT vs. Ugrads (general population) vs. Ugrads (majors) vs. Ugrads (upper-level majors) vs. Grads vs. Professionals. Where is the sweet spot? And what about sustainability? I could never convince a grad student to help with an SLR a second time. Yet, Tore Dyba cranks them out like a factory line.

When is our meeting scheduled?

nkraft commented 7 years ago

Also, the typical SLR process uses a multi-stage filter: titles then abstracts then papers. Can we model the accuracy vs. cost of each transition? Or does that even matter? Just brainstorming cost model considerations.
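
One way to start on such a cost model is sketched below. Every number here is invented, and the `Stage` fields and the `evaluate_pipeline` helper are hypothetical names. The sketch also assumes each stage loses relevant papers independently of the others, so the per-stage recalls multiply.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    cost_per_item: float  # reviewer effort per item (hypothetical unit)
    pass_rate: float      # fraction of items passed on to the next stage
    recall: float         # fraction of truly relevant items that survive


def evaluate_pipeline(n_candidates, stages):
    """Return (total_cost, overall_recall) for a sequence of filter stages."""
    remaining = float(n_candidates)
    total_cost = 0.0
    overall_recall = 1.0
    for stage in stages:
        total_cost += remaining * stage.cost_per_item  # everyone left is read
        remaining *= stage.pass_rate                   # survivors move on
        overall_recall *= stage.recall                 # independence assumed
    return total_cost, overall_recall


# Invented numbers, for illustration only.
stages = [
    Stage("titles", cost_per_item=1.0, pass_rate=0.5, recall=0.9),
    Stage("abstracts", cost_per_item=10.0, pass_rate=0.5, recall=0.9),
]
cost, recall = evaluate_pipeline(1000, stages)
```

Swapping in measured pass rates and recalls for each transition would show whether the cheap early stages pay for the relevant papers they lose.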

azhe825 commented 7 years ago

The meeting is scheduled for tomorrow at 11am in 3231 EB2, NC State.

azhe825 commented 7 years ago

None of our experiments involved actual human reviewers. The "relevant" examples are taken from the final inclusion lists of existing SLR papers, which were screened by title and abstract and then by full text. Our algorithm, however, learns only from titles and abstracts and achieves the above performance without the full text.

Our suggested review process is this.