Open azhe825 opened 6 years ago
@timm
you know the answer to that. i.e. f
The results delta between Cormack-BM25 and Auto-BM25 is significant.
What I am worried about is whether the implementation delta is big enough, or whether it is just a few small tweaks that would not count as a contribution.
now you say "significant". that is a bootstrap or t-test statement
did you mean "not a small effect"?
as to whether it is impressive, do the above as a histogram, no iqrs, in google sheets (font sizes= 15pt, width= narrow enough for one column)
what does that look like.
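To illustrate the distinction drawn above between "significant" (a bootstrap or t-test statement) and "not a small effect", here is a minimal sketch of one common non-parametric effect size, Cliff's delta. The thread does not say which effect-size measure is used in this project, so the choice of Cliff's delta, the function names, and the 0.147 cutoff (a commonly cited convention for "negligible") are all assumptions for illustration.

```python
from typing import Sequence


def cliffs_delta(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Cliff's delta: P(x > y) - P(x < y) over all pairs (x, y).

    Ranges from -1 (every x is below every y) to +1 (every x is above every y).
    """
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def is_negligible(delta: float, threshold: float = 0.147) -> bool:
    """True if the effect is negligible; the threshold is an assumed convention."""
    return abs(delta) < threshold


# Hypothetical review costs for two treatments (not taken from this thread).
a = [10, 12, 14, 16]
b = [11, 13, 15, 17]
d = cliffs_delta(a, b)
print(d, "negligible" if is_negligible(d) else "not a small effect")
```

A result can pass a significance test and still have a negligible effect size, which is why both checks are usually reported together.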
rank , name , med , iqr
----------------------------------------------------
1 , Auto_Rand , 611.00 , 60.00 ( --* | ), 581.00, 611.00, 641.00
1 , BM25 , 630.00 , 40.00 ( * | ), 610.00, 630.00, 650.00
2 , RANDOM , 685.00 , 190.00 ( -----* | ), 580.00, 690.00, 770.00
2 , Auto_Syn , 710.00 , 50.00 ( -* | ), 680.00, 710.00, 730.00
3 , Cormack_BM25 , 810.00 , 80.00 ( -*| ), 770.00, 810.00, 850.00
rank , name , med , iqr
----------------------------------------------------
1 , BM25 , 325.00 , 40.00 ( -* | ), 310.00, 330.00, 350.00
1 , Auto_Rand , 331.00 , 20.00 ( -* | ), 321.00, 331.00, 341.00
1 , Auto_Syn , 360.00 , 40.00 ( --* | ), 330.00, 360.00, 370.00
2 , RANDOM , 380.00 , 140.00 ( ---* | ), 320.00, 380.00, 460.00
2 , Cormack_BM25 , 380.00 , 40.00 ( -* | ), 360.00, 380.00, 400.00
rank , name , med , iqr
----------------------------------------------------
1 , Auto_Syn , 500.00 , 70.00 ( -* | ), 490.00, 500.00, 560.00
2 , BM25 , 615.00 , 50.00 ( -* | ), 590.00, 620.00, 640.00
2 , Auto_Rand , 631.00 , 70.00 ( --* | ), 581.00, 631.00, 651.00
3 , RANDOM , 730.00 , 190.00 ( ----* ), 630.00, 730.00, 820.00
3 , Cormack_BM25 , 780.00 , 70.00 ( |--* ), 750.00, 780.00, 820.00
rank , name , med , iqr
----------------------------------------------------
1 , Auto_Rand , 696.00 , 70.00 ( --* | ), 671.00, 701.00, 741.00
1 , BM25 , 700.00 , 80.00 ( ---* | ), 660.00, 700.00, 740.00
1 , Cormack_BM25 , 720.00 , 60.00 ( ----* | ), 670.00, 720.00, 730.00
2 , Auto_Syn , 740.00 , 30.00 ( -* | ), 720.00, 740.00, 750.00
3 , RANDOM , 785.00 , 180.00 ( ----|-* ), 720.00, 810.00, 900.00
this is all enough to justify that your new auto-bm25 is better than cormack-bm25
as to auto-syn, we beat it in 2, tie in 1, and lose in 1
when can i see the next draft?
On Sun, Jan 7, 2018 at 4:40 AM, Zhe Yu notifications@github.com wrote:
[image: chart] https://user-images.githubusercontent.com/13929197/34648240-d9429cd8-f364-11e7-800c-f623b060e42d.png
this is enough to justify that your method is better than before
also, can you accept your outstanding trello invite?
Working on it. Will get it done before I get back.
Result
Delta
Cormack-BM25 (Cormack'15): find the top doc returned by BM25, label it as 'relevant', then start training.
Auto-BM25 (our method): run BM25 ranking, review docs in BM25 rank order (10 docs each round), and start training once at least 1 relevant doc has been found.
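The delta between the two seeding strategies can be sketched as below. This is a minimal illustration, not the project's actual code: the function names, the `is_relevant` callback (standing in for the human reviewer), and the assumption that the BM25 ranking has already been computed are all hypothetical.

```python
from typing import Callable, Dict, Sequence


def cormack_bm25_seed(ranked_ids: Sequence[str]) -> Dict[str, bool]:
    """Cormack-BM25: assume the top BM25 doc is relevant, without reviewing it."""
    return {ranked_ids[0]: True}


def auto_bm25_seed(
    ranked_ids: Sequence[str],
    is_relevant: Callable[[str], bool],  # hypothetical stand-in for a human label
    batch_size: int = 10,
) -> Dict[str, bool]:
    """Auto-BM25: review docs in BM25 order, one batch per round,
    and stop once at least one relevant doc has been labeled."""
    labels: Dict[str, bool] = {}
    for start in range(0, len(ranked_ids), batch_size):
        for doc_id in ranked_ids[start:start + batch_size]:
            labels[doc_id] = is_relevant(doc_id)
        if any(labels.values()):
            break  # training can start from these labels
    return labels


# Hypothetical usage: 30 docs already sorted by BM25 score, one relevant doc.
ranked = [f"d{i}" for i in range(30)]
relevant = {"d12"}
seed = auto_bm25_seed(ranked, lambda doc_id: doc_id in relevant)
print(len(seed), sum(seed.values()))
```

The practical difference is that Auto-BM25 starts training from real reviewer labels (including the non-relevant ones seen along the way), while Cormack-BM25 starts from a single assumed label that may be wrong.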
Conclusion
We recommend Auto-BM25, since it is always in the top rank and better than the other methods except Auto-Rand. Weakness of Auto-Rand (which assumes that one relevant paper is already labeled): it is not always possible to know a relevant paper beforehand.