ai-se / ML-assisted-SLR

Automated Systematic Literature Review

Retrieval per Review #30

Open azhe825 opened 8 years ago

azhe825 commented 8 years ago

Retrieval per review = the derivative of the retrieval curve, i.e. the number of relevant studies found per additional study reviewed.

Test to see which retrieval rate X% (X = 80, 85, 90, 95, 99, ...) is the most cost-efficient point to stop reading.

An 80% or 85% retrieval rate seems most cost-efficient. I would choose 90%, since we want more completeness and the loss in efficiency is small. Any suggestions?
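To make the comparison concrete, here is a minimal sketch of how the review cost at each retrieval rate could be computed. It assumes the review order is available as a list of 0/1 labels (1 = relevant study); the function names `reviews_to_reach` and `cost_table` are hypothetical and not taken from the repository.

```python
import math

def reviews_to_reach(labels, target):
    """Number of studies reviewed when the retrieval rate first reaches `target` (0..1).

    `labels` is the review order: 1 for a relevant study, 0 otherwise.
    """
    total = sum(labels)
    # Small epsilon guards against float noise such as 0.9 * 10 -> 9.000000000000002.
    needed = math.ceil(target * total - 1e-9)
    found = 0
    for i, y in enumerate(labels, start=1):
        found += y
        if found >= needed:
            return i
    return len(labels)

def cost_table(labels, targets=(0.80, 0.85, 0.90, 0.95, 0.99)):
    """Map each target retrieval rate to (reviews needed, relevant found per review)."""
    table = {}
    for t in targets:
        n = reviews_to_reach(labels, t)
        table[t] = (n, sum(labels[:n]) / n if n else 0.0)
    return table
```

The second value in each pair is the average retrieval per review up to that stopping point, so a sharp drop between two targets marks where extra completeness starts to get expensive.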

Want a bar chart at 80%, 85%, 90%, 95%, 99% for the above?

timm commented 8 years ago

so retrieval rate is your proposed stopping rule?

azhe825 commented 8 years ago

In this paper, I would just assume every review stops at a 90% retrieval rate. It is not a practical stopping rule (since in reality we don't know the number of studies that need to be retrieved), but it gives us a way to compare different algorithms.

The bootstrap + A12 test in this paper is based on how many studies need to be reviewed in order to achieve a 90% retrieval rate.
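For reference, a minimal sketch of such a comparison, assuming each algorithm yields a list of "studies reviewed to reach 90%" over repeated runs. `a12` follows the standard Vargha-Delaney definition; `bootstrap_differs` is one simple pooled-resampling variant chosen for illustration and may differ from the exact procedure used in the paper.

```python
import random

def a12(xs, ys):
    """Vargha-Delaney A12: probability that a value drawn from xs exceeds one from ys."""
    more = sum(1 for x in xs for y in ys if x > y)
    same = sum(1 for x in xs for y in ys if x == y)
    return (more + 0.5 * same) / (len(xs) * len(ys))

def bootstrap_differs(xs, ys, rounds=1000, conf=0.95, seed=0):
    """Return True if the difference in means is unlikely under pooled resampling."""
    rng = random.Random(seed)
    mean = lambda v: sum(v) / len(v)
    observed = abs(mean(xs) - mean(ys))
    pooled = list(xs) + list(ys)
    at_least_as_large = 0
    for _ in range(rounds):
        # Under the null hypothesis both samples come from the same pool.
        bx = [rng.choice(pooled) for _ in xs]
        by = [rng.choice(pooled) for _ in ys]
        if abs(mean(bx) - mean(by)) >= observed:
            at_least_as_large += 1
    return at_least_as_large / rounds < 1 - conf
```

Since the measure is a cost, an A12 below 0.5 for `a12(costs_a, costs_b)` means algorithm A tends to need fewer reviews than B.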

In our next paper, when we discuss specific stopping rules, we currently have two options (and can compare them):

  1. fit a curve, predict the number of studies that need to be retrieved, then apply the 90% retrieval rate stopping rule.
  2. use other scores to decide whether to stop (prediction probability score from the SVM, derivative over the last N iterations...)

My question is: do we put one of the above graphs in our current paper to justify the 90% retrieval rate? Or a bar chart of review cost at 80%, 85%, 90%, 95%, 99%?
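As a rough illustration of option 2, the sketch below stops once the retrieval per review over the last N reviewed studies falls below a threshold. The names `should_stop` and `stop_point`, the `window` size, and the `min_rate` threshold are all hypothetical placeholders, not values we have tuned; option 1 (curve fitting) is not covered here.

```python
def should_stop(labels_so_far, window=50, min_rate=0.02):
    """Stop when the fraction of relevant studies in the last `window` reviews is below `min_rate`."""
    if len(labels_so_far) < window:
        return False  # not enough history to estimate the recent derivative
    recent = labels_so_far[-window:]
    return sum(recent) / window < min_rate

def stop_point(labels, window=50, min_rate=0.02):
    """Replay a review order (1 = relevant, 0 = not) and return where the rule would stop."""
    for i in range(1, len(labels) + 1):
        if should_stop(labels[:i], window, min_rate):
            return i
    return len(labels)
```

Replaying this on a labelled dataset gives the retrieval rate actually achieved at the stopping point, which can then be compared against the 90% target.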