ai-se / ML-assisted-SLR

Automated Systematic Literature Review

how to start #74

Open azhe825 opened 6 years ago

azhe825 commented 6 years ago

Result

| Dataset | Treatment | X95 median | X95 IQR | WSS@95 median | WSS@95 IQR |
|---|---|---|---|---|---|
| Wahono (Full) | Auto-BM25 | 630 | 40 | 0.86 | 0.01 |
| | Auto-Syn | 710 | 50 | 0.85 | 0.01 |
| | RANDOM | 685 | 188 | 0.85 | 0.03 |
| | Cormack-BM25 | 810 | 78 | 0.83 | 0.01 |
| | Auto-Rand | 611 | 58 | 0.86 | 0.01 |
| Hall (Full) | Auto-BM25 | 325 | 40 | 0.91 | 0.00 |
| | Auto-Syn | 360 | 40 | 0.91 | 0.00 |
| | RANDOM | 380 | 140 | 0.91 | 0.02 |
| | Cormack-BM25 | 380 | 38 | 0.91 | 0.00 |
| | Auto-Rand | 331 | 20 | 0.91 | 0.00 |
| Radjenović (Full) | Auto-BM25 | 615 | 50 | 0.85 | 0.01 |
| | Auto-Syn | 500 | 65 | 0.87 | 0.01 |
| | RANDOM | 730 | 188 | 0.83 | 0.03 |
| | Cormack-BM25 | 780 | 68 | 0.82 | 0.01 |
| | Auto-Rand | 631 | 70 | 0.84 | 0.01 |
| Kitchenham (Full) | Auto-BM25 | 700 | 78 | 0.54 | 0.05 |
| | Auto-Syn | 740 | 28 | 0.52 | 0.02 |
| | RANDOM | 785 | 175 | 0.49 | 0.10 |
| | Cormack-BM25 | 720 | 58 | 0.53 | 0.03 |
| | Auto-Rand | 696 | 68 | 0.54 | 0.04 |
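For readers unfamiliar with the two metrics, here is a minimal sketch. It assumes X95 is the number of documents reviewed when 95% recall is reached, and uses the common definition WSS@95 = 0.95 - X95/N, where N is the total number of candidate documents. Neither the definition nor N is stated in this thread, so the function name and the numbers below are illustrative only.

```python
def wss_at_95(x95: int, n_docs: int) -> float:
    """Work saved over sampling at 95% recall (assumed definition).

    x95    : documents reviewed when 95% recall is reached
    n_docs : total number of candidate documents in the dataset
    """
    if n_docs <= 0:
        raise ValueError("n_docs must be positive")
    if not 0 <= x95 <= n_docs:
        raise ValueError("x95 must lie between 0 and n_docs")
    return 0.95 - x95 / n_docs


# Hypothetical corpus size, NOT taken from the thread.
example = wss_at_95(x95=500, n_docs=5000)
```

Under this definition a lower X95 on the same dataset always gives a higher WSS@95, which is why the two columns move together within each dataset.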

Delta

Cormack-BM25 (Cormack'15): find the top document returned by BM25 and label it as 'relevant', then start training.

Auto-BM25 (our method): run the BM25 ranking and review documents in BM25 order, 10 documents per round. As soon as at least 1 relevant document has been found, start training.
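The difference between the two seeding strategies can be sketched as follows. This is an illustration under assumptions, not the repository's implementation: `ranked` is taken to be a list of document ids already sorted by BM25 score, `is_relevant` stands in for the human reviewer, and both function names are hypothetical.

```python
from typing import Callable, List, Tuple


def cormack_seed(ranked: List[int]) -> List[int]:
    """Cormack-BM25 as described above: label the top BM25 document
    as relevant and use it as the seed for training."""
    if not ranked:
        raise ValueError("empty ranking")
    return [ranked[0]]


def auto_bm25_seed(
    ranked: List[int],
    is_relevant: Callable[[int], bool],
    batch_size: int = 10,
) -> Tuple[List[int], List[int]]:
    """Auto-BM25 as described above: review documents in BM25 order,
    batch_size per round, and stop after the first round in which at
    least one relevant document has been found.

    Returns (relevant_found, reviewed_so_far)."""
    relevant: List[int] = []
    reviewed: List[int] = []
    for start in range(0, len(ranked), batch_size):
        batch = ranked[start:start + batch_size]
        reviewed.extend(batch)
        relevant.extend(d for d in batch if is_relevant(d))
        if relevant:
            break  # enough to start training
    return relevant, reviewed


# Toy example with made-up data: ids 0..99 in BM25 order, two relevant docs.
truth = {23, 57}
seed, seen = auto_bm25_seed(list(range(100)), lambda d: d in truth)
```

In the toy example the first relevant document sits in the third round, so 30 documents are reviewed before training starts. The Cormack-style seed costs no review effort but may feed the learner a wrong label.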

Conclusion

We recommend Auto-BM25, since it is always of top rank and better than the other methods except Auto-Rand. The weakness of Auto-Rand (which assumes that one relevant paper is already labeled) is that it is not always possible to know a relevant paper beforehand.

azhe825 commented 6 years ago

@timm

timm commented 6 years ago

you know the answer to that. i.e. f

azhe825 commented 6 years ago

The results delta between Cormack-BM25 and Auto-BM25 is significant.

What I am worried about is whether the implementation delta is big enough, or whether it is just some small tweaks that would not be considered a contribution.

timm commented 6 years ago

now u say "significant". that is a bootstrap or t-test statement

did you mean "not a small effect"?

as to whether it is impressive, do the above as a histogram, no iqrs, in google sheets (font sizes= 15pt, width= narrow enough for one column)

what does that look like.
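The distinction above between "significant" (a statistical test) and "not a small effect" (an effect size) can be sketched like this. The sketch swaps in two standard techniques, a pooled bootstrap test on the difference in medians and Cliff's delta, which are not necessarily what the project's scripts use. The sample values are made up for illustration.

```python
import random
from statistics import median


def cliffs_delta(xs, ys):
    """Effect size in [-1, 1]: P(x > y) - P(x < y) over all pairs."""
    gt = sum(1 for x in xs for y in ys if x > y)
    lt = sum(1 for x in xs for y in ys if x < y)
    return (gt - lt) / (len(xs) * len(ys))


def bootstrap_p(xs, ys, n_boot=2000, seed=0):
    """Approximate p-value for the difference in medians.

    Resamples both groups from the pooled data (the null hypothesis
    that they share one distribution) and counts how often the
    resampled difference is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(median(xs) - median(ys))
    pooled = list(xs) + list(ys)
    hits = 0
    for _ in range(n_boot):
        a = rng.choices(pooled, k=len(xs))
        b = rng.choices(pooled, k=len(ys))
        if abs(median(a) - median(b)) >= observed:
            hits += 1
    return hits / n_boot


# Made-up X95 samples for two treatments, NOT results from this thread.
treatment_a = [610, 620, 630, 640, 650]
treatment_b = [770, 790, 810, 830, 850]
effect = cliffs_delta(treatment_a, treatment_b)
p_value = bootstrap_p(treatment_a, treatment_b)
```

A small p-value answers "is the difference distinguishable from noise?", while a large absolute delta answers "is the difference big?". With only a handful of runs per treatment, the two can disagree.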

azhe825 commented 6 years ago

Wahono

rank ,         name ,    med   ,  iqr
----------------------------------------------------
   1 ,    Auto_Rand ,    611.00  ,  60.00 (    --*        |              ), 581.00,  611.00,  641.00
   1 ,         BM25 ,    630.00  ,  40.00 (      *        |              ), 610.00,  630.00,  650.00
   2 ,       RANDOM ,    685.00  ,  190.00 (    -----*     |              ), 580.00,  690.00,  770.00
   2 ,     Auto_Syn ,    710.00  ,  50.00 (         -*    |              ), 680.00,  710.00,  730.00
   3 , Cormack_BM25 ,    810.00  ,  80.00 (             -*|              ), 770.00,  810.00,  850.00

Hall

rank ,         name ,    med   ,  iqr
----------------------------------------------------
   1 ,         BM25 ,    325.00  ,  40.00 ( -*            |              ), 310.00,  330.00,  350.00
   1 ,    Auto_Rand ,    331.00  ,  20.00 (  -*           |              ), 321.00,  331.00,  341.00
   1 ,     Auto_Syn ,    360.00  ,  40.00 (  --*          |              ), 330.00,  360.00,  370.00
   2 ,       RANDOM ,    380.00  ,  140.00 (  ---*         |              ), 320.00,  380.00,  460.00
   2 , Cormack_BM25 ,    380.00  ,  40.00 (    -*         |              ), 360.00,  380.00,  400.00

Radjenović

rank ,         name ,    med   ,  iqr
----------------------------------------------------
   1 ,     Auto_Syn ,    500.00  ,  70.00 (    -*         |              ), 490.00,  500.00,  560.00
   2 ,         BM25 ,    615.00  ,  50.00 (         -*    |              ), 590.00,  620.00,  640.00
   2 ,    Auto_Rand ,    631.00  ,  70.00 (         --*   |              ), 581.00,  631.00,  651.00
   3 ,       RANDOM ,    730.00  ,  190.00 (           ----*              ), 630.00,  730.00,  820.00
   3 , Cormack_BM25 ,    780.00  ,  70.00 (               |--*           ), 750.00,  780.00,  820.00

Kitchenham

rank ,         name ,    med   ,  iqr
----------------------------------------------------
   1 ,    Auto_Rand ,    696.00  ,  70.00 (       --*     |              ), 671.00,  701.00,  741.00
   1 ,         BM25 ,    700.00  ,  80.00 (      ---*     |              ), 660.00,  700.00,  740.00
   1 , Cormack_BM25 ,    720.00  ,  60.00 (       ----*   |              ), 670.00,  720.00,  730.00
   2 ,     Auto_Syn ,    740.00  ,  30.00 (           -*  |              ), 720.00,  740.00,  750.00
   3 ,       RANDOM ,    785.00  ,  180.00 (           ----|-*            ), 720.00,  810.00,  900.00
azhe825 commented 6 years ago

chart

timm commented 6 years ago

this is all enough to justify that your new auto-bm25 is better than cormack-bm25

as to auto-syn, we beat it in 2 and tie in 1 lose in 1
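The win/tie/loss count can be checked against the rank column of the four tables posted earlier in the thread. A minimal sketch, where lower rank is better and the function name `tally` is made up for illustration:

```python
# Ranks of BM25 (the Auto-BM25 rows) and Auto_Syn, copied from the
# per-dataset rank tables earlier in this thread.
ranks = {
    "Wahono":     {"BM25": 1, "Auto_Syn": 2},
    "Hall":       {"BM25": 1, "Auto_Syn": 1},
    "Radjenovic": {"BM25": 2, "Auto_Syn": 1},
    "Kitchenham": {"BM25": 1, "Auto_Syn": 2},
}


def tally(ranks, ours, theirs):
    """Count datasets where `ours` ranks better, equal, or worse."""
    wins = ties = losses = 0
    for by_treatment in ranks.values():
        a, b = by_treatment[ours], by_treatment[theirs]
        if a < b:
            wins += 1
        elif a == b:
            ties += 1
        else:
            losses += 1
    return wins, ties, losses


result = tally(ranks, "BM25", "Auto_Syn")  # (2, 1, 1)
```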

when can i see the next draft?

On Sun, Jan 7, 2018 at 4:40 AM, Zhe Yu notifications@github.com wrote:

[image: chart] https://user-images.githubusercontent.com/13929197/34648240-d9429cd8-f364-11e7-800c-f623b060e42d.png


timm commented 6 years ago

this is enough to justify your method is better than before

also, can you accept your outstanding trello invite?

azhe825 commented 6 years ago

Working on it. Will get it done before I get back.