ai-se / ML-assisted-SLR

Automated Systematic Literature Review

Variance #26

Closed azhe825 closed 8 years ago

azhe825 commented 8 years ago

Problem

From the following two figures, it looks like the Hall data set has much larger variance than Wahono.

Hall:

Wahono:

Reason

Hall had some bad luck in random sampling. The prevalence of "relevant" is larger than 0.01, so the probability of not getting a single "relevant" when reviewing the first 200 studies is roughly (1-0.01)^(200) = 0.99^(200) ≈ 13%. Over our 10 repeated experiments, we should therefore expect 1 to 2 out of 10 repeats to stay at 0 "relevant" at 200 studies reviewed, which would not cause a big IQR. However, in the experiments shown above we got 3, and the IQR at 200 is therefore extremely large. On the other hand, in Wahono, with a little bit of luck, we always got more than one "relevant" study retrieved at 200 reviewed, which leads to a low IQR.
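
As a rough check of the estimate above, here is a minimal Python sketch. It assumes a prevalence of exactly 0.01 (the text only says it is larger than that, so the real probabilities would be somewhat lower) and treats the 200 reviewed studies as independent draws. The function names are made up for illustration and are not from this repo.

```python
from math import comb


def p_no_relevant(prevalence: float, n_reviewed: int) -> float:
    """Chance that n_reviewed random draws contain no "relevant" study.

    Assumption: draws are independent (sampling with replacement), which is
    a close approximation when the pool is much larger than n_reviewed.
    """
    return (1.0 - prevalence) ** n_reviewed


def p_k_unlucky(k: int, repeats: int, p_zero: float) -> float:
    """Binomial probability that exactly k of the repeats find nothing."""
    return comb(repeats, k) * p_zero ** k * (1.0 - p_zero) ** (repeats - k)


# Assumed values: prevalence 0.01, 200 studies reviewed, 10 repeats.
p_zero = p_no_relevant(0.01, 200)
expected_unlucky = 10 * p_zero
p_three_or_more = 1.0 - sum(p_k_unlucky(k, 10, p_zero) for k in range(3))

print(f"P(no relevant in first 200)  = {p_zero:.3f}")
print(f"expected unlucky repeats /10 = {expected_unlucky:.2f}")
print(f"P(3 or more unlucky repeats) = {p_three_or_more:.3f}")
```

Under these assumptions a run with zero "relevant" at 200 has about a 13% chance, so about 1.3 of 10 repeats are expected to be stuck at 0. Seeing 3 or more has a chance of around 14%, so it is unlucky but not surprising.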

The Hall results should look more like this:

And Wahono should look more like this:

Conclusion

Probably 10 repeats is not enough. Should we increase it to 25?
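
One rough way to weigh this question is to ask how often a quarter or more of the repeats would be stuck at 0 "relevant" at 200 reviewed, which is used here as a stand-in for "the lower quartile is pinned at 0 and the IQR blows up". The sketch below is only an approximation under assumed values: a prevalence of exactly 0.01, independent draws, and the quarter-of-repeats threshold are all assumptions, and the helper names are hypothetical.

```python
from math import ceil, comb


def p_at_least(k_min: int, repeats: int, p_zero: float) -> float:
    """Binomial probability that at least k_min of the repeats find nothing."""
    below = sum(
        comb(repeats, k) * p_zero ** k * (1.0 - p_zero) ** (repeats - k)
        for k in range(k_min)
    )
    return 1.0 - below


# Assumed chance that one repeat has 0 "relevant" after 200 reviewed studies.
p_zero = (1.0 - 0.01) ** 200

results = {}
for repeats in (10, 25):
    # Assumed proxy: a quarter (rounded up) of the repeats stuck at 0.
    k_min = ceil(repeats / 4)
    results[repeats] = p_at_least(k_min, repeats, p_zero)
    print(f"{repeats} repeats: P(at least {k_min} stuck at 0) = {results[repeats]:.3f}")
```

With these assumptions, the chance of such an unlucky set drops from about 14% with 10 repeats to a few percent with 25, which is consistent with the suggestion to increase the number of repeats.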

timm commented 8 years ago

i think this is enough to explain the large bump. good text for the paper.

"Hall had some bad luck in random sampling. The prevalence of "relevant" is larger than 0.01, so the probability of not getting a single "relevant" when reviewing the first 200 studies is roughly (1-0.01)^(200) = 0.99^(200) ≈ 13%. Over our 10 repeated experiments, we should therefore expect 1 to 2 out of 10 repeats to stay at 0 "relevant" at 200 studies reviewed, which would not cause a big IQR. However, in the experiments shown above we got 3, and the IQR at 200 is therefore extremely large. On the other hand, in Wahono, with a little bit of luck, we always got more than one "relevant" study retrieved at 200 reviewed, which leads to a low IQR."

the experiment i want to see is the "update" experiment where we divide data into (say) pre 2007 and post 2007

azhe825 commented 8 years ago

I am coding the update experiment. Also, there is some related information in #27.