ai-se / ML-assisted-SLR

Automated Systematic Literature Review

Random Human Error #70

Open azhe825 opened 6 years ago

azhe825 commented 6 years ago

Error

Random error: for each labeling task, the human reviewer has a chance ER = 0.00 / 0.02 / 0.05 / 0.10 of labeling incorrectly.
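
A minimal sketch of this error model, assuming binary labels (1 = relevant, 0 = irrelevant) and that an error simply flips the true label. The name `noisy_label` and the seed are made up for illustration and are not taken from the repo.

```python
import random


def noisy_label(true_label, er, rng):
    """Return the true binary label, flipped with probability `er`."""
    return 1 - true_label if rng.random() < er else true_label


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed, only for repeatability
    n = 10000
    for er in (0.00, 0.02, 0.05, 0.10):
        # Label n truly relevant docs and count how many come back wrong.
        wrong = sum(noisy_label(1, er, rng) != 1 for _ in range(n))
        print(f"ER={er:.2f}: observed error rate {wrong / n:.3f}")
```

The observed rate should land close to ER for a large enough `n`.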

Error Correction:

none:

Run with FAST2 (BM25+SEMI)

three:

Each doc will be labeled at least 2 times, at most 3 times.
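
One way to read this rule, assuming (it is not stated above) that a doc gets a third label only when the first two disagree, and that the majority label wins. Under that assumption, and with independent errors at rate ER, the leftover error rate and the expected number of labels per doc can be worked out directly. The function names are hypothetical.

```python
def three_residual_error(er):
    """P(final label is wrong): both of the first two labels are wrong,
    or they disagree and the tie-breaking third label is wrong."""
    return er**2 + 2 * er * (1 - er) * er


def three_expected_labels(er):
    """Two labels always, plus a third when the first two disagree."""
    return 2 + 2 * er * (1 - er)


for er in (0.00, 0.02, 0.05, 0.10):
    print(er, round(three_residual_error(er), 4), round(three_expected_labels(er), 3))
```

At ER = 0 this gives exactly 2 labels per doc, which is in line with the second number for three being roughly double that of none in the ER = 0% results (e.g. Hall: 490 vs 1000).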

machine:

Run error check every CR=50 docs reviewed:

machine2:

Same as machine, but:

machine3:

Same as machine, but:

Results

ER = 10%

| Dataset | No Error | none | three | machine | machine2 | machine3 |
|---|---|---|---|---|---|---|
| Hall | 102 / 490 / 0 / 0 | 95 / 2930 / 8 / 276 | 100 / 1696 / 2 / 17 | 94 / 722 / 7 / 12 | 98 / 855 / 3 / 15 | 98 / 862 / 4 / 9 |
| Wahono | 59 / 1160 / 0 / 0 | 54 / 3315 / 6 / 320 | 58 / 3823 / 1 / 47 | 55 / 1721 / 4 / 19 | 56 / 2217 / 2 / 33 | 56 / 1919 / 3 / 28 |
| Danijel | 45 / 750 / 0 / 0 | 41 / 2535 / 5 / 248 | 45 / 2303 / 1 / 28 | 41 / 1033 / 3 / 10 | 42 / 1217 / 2 / 18 | 43 / 1209 / 2 / 16 |
| K_all3 | 37 / 500 / 0 / 0 | 30 / 470 / 3 / 46 | 34 / 964 / 1 / 10 | 30 / 570 / 4 / 7 | 30 / 590 / 3 / 10 | 32 / 614 / 3 / 8 |

| Dataset | No Error | none | three | machine | machine2 | machine3 |
|---|---|---|---|---|---|---|
| Hall | 102 / 490 / 0 / 0 | 93 / 3290 / 11 / 311 | 99 / 1608 / 3 / 18 | 96 / 645 / 5 / 11 | 98 / 823 / 3 / 16 | 95 / 776 / 6 / 6 |
| Wahono | 59 / 1160 / 0 / 0 | 54 / 3455 / 6 / 331 | 58 / 3744 / 1 / 42 | 56 / 1696 / 4 / 16 | 56 / 2161 / 3 / 32 | 55 / 1694 / 4 / 17 |
| Danijel | 45 / 750 / 0 / 0 | 41 / 2705 / 5 / 261 | 44 / 2183 / 1 / 28 | 41 / 1136 / 3 / 11 | 42 / 1248 / 2 / 17 | 41 / 1171 / 3 / 12 |
| K_all3 | 37 / 500 / 0 / 0 | 32 / 485 / 4 / 45 | 34 / 961 / 1 / 10 | 29 / 593 / 5 / 6 | 31 / 636 / 3 / 9 | 31 / 651 / 5 / 6 |

ER = 5%

| Dataset | No Error | none | three | machine | machine2 | machine3 |
|---|---|---|---|---|---|---|
| Hall | 102 / 490 / 0 / 0 | 98 / 1385 / 5 / 63 | 101 / 1128 / 1 / 4 | 99 / 635 / 2 / 3 | 100 / 725 / 1 / 3 | 99 / 679 / 2 / 1 |
| Wahono | 59 / 1160 / 0 / 0 | 57 / 1880 / 3 / 88 | 59 / 2913 / 0 / 9 | 58 / 1554 / 1 / 4 | 58 / 1651 / 1 / 6 | 58 / 1510 / 1 / 3 |
| Danijel | 45 / 750 / 0 / 0 | 43 / 1060 / 2 / 50 | 45 / 1755 / 0 / 5 | 43 / 983 / 2 / 2 | 44 / 1071 / 1 / 4 | 43 / 976 / 2 / 2 |
| K_all3 | 37 / 500 / 0 / 0 | 33 / 430 / 1 / 20 | 35 / 997 / 0 / 2 | 34 / 606 / 3 / 1 | 34 / 588 / 1 / 3 | 34 / 602 / 2 / 1 |

ER = 2%

| Dataset | No Error | none | three | machine | machine2 | machine3 |
|---|---|---|---|---|---|---|
| Hall | 102 / 490 / 0 / 0 | 100 / 635 / 2 / 10 | 102 / 982 / 0 / 0 | 100 / 632 / 1 / 1 | 101 / 639 / 1 / 1 | 102 / 687 / 0 / 0 |
| Wahono | 59 / 1160 / 0 / 0 | 58 / 1595 / 1 / 34 | 59 / 2407 / 0 / 1 | 59 / 1387 / 0 / 1 | 59 / 1445 / 0 / 1 | 59 / 1460 / 0 / 1 |
| Danijel | 45 / 750 / 0 / 0 | 44 / 900 / 1 / 15 | 45 / 1552 / 0 / 1 | 45 / 948 / 0 / 0 | 45 / 986 / 0 / 1 | 45 / 952 / 0 / 0 |
| K_all3 | 37 / 500 / 0 / 0 | 33 / 450 / 1 / 7 | 37 / 1009 / 0 / 0 | 35 / 573 / 1 / 0 | 35 / 594 / 0 / 0 | 37 / 585 / 0 / 0 |

ER = 0%

| Dataset | none | three | machine | machine2 | machine3 |
|---|---|---|---|---|---|
| Hall | 102 / 490 / 0 / 0 | 102 / 1000 / 0 / 0 | 102 / 672 / 0 / 0 | 102 / 682 / 0 / 0 | 102 / 683 / 0 / 0 |
| Wahono | 59 / 1150 / 0 / 0 | 59 / 2300 / 0 / 0 | 59 / 1409 / 0 / 0 | 59 / 1400 / 0 / 0 | 59 / 1409 / 0 / 0 |
| Danijel | 45 / 755 / 0 / 0 | 45 / 1500 / 0 / 0 | 45 / 915 / 0 / 0 | 45 / 925 / 0 / 0 | 45 / 911 / 0 / 0 |
| K_all3 | 37 / 500 / 0 / 0 | 37 / 980 / 0 / 0 | 37 / 560 / 0 / 0 | 37 / 566 / 0 / 0 | 36 / 554 / 0 / 0 |

Metrics

ER = 10%

| Dataset | none | three | machine | machine2 | machine3 |
|---|---|---|---|---|---|
| Hall | 0.26 / 0.9 / 2930 | 0.85 / 0.94 / 1696 | 0.89 / 0.89 / 722 | 0.87 / 0.92 / 855 | 0.92 / 0.92 / 862 |
| Wahono | 0.14 / 0.87 / 3315 | 0.55 / 0.94 / 3823 | 0.75 / 0.89 / 1721 | 0.64 / 0.91 / 2217 | 0.67 / 0.9 / 1919 |
| Danijel | 0.14 / 0.85 / 2535 | 0.61 / 0.94 / 2303 | 0.79 / 0.85 / 1033 | 0.7 / 0.88 / 1217 | 0.73 / 0.9 / 1209 |
| K_all3 | 0.41 / 0.69 / 470 | 0.76 / 0.77 / 964 | 0.81 / 0.68 / 570 | 0.76 / 0.68 / 590 | 0.8 / 0.73 / 614 |

| Dataset | none | three | machine | machine2 | machine3 |
|---|---|---|---|---|---|
| Hall | 0.23 / 0.88 / 3290 | 0.85 / 0.93 / 1608 | 0.89 / 0.91 / 645 | 0.86 / 0.92 / 823 | 0.94 / 0.9 / 776 |
| Wahono | 0.14 / 0.87 / 3455 | 0.57 / 0.94 / 3744 | 0.77 / 0.9 / 1696 | 0.63 / 0.9 / 2161 | 0.77 / 0.9 / 1694 |
| Danijel | 0.14 / 0.86 / 2705 | 0.61 / 0.92 / 2183 | 0.78 / 0.85 / 1136 | 0.71 / 0.88 / 1248 | 0.78 / 0.85 / 1171 |
| K_all3 | 0.41 / 0.73 / 485 | 0.77 / 0.77 / 961 | 0.84 / 0.66 / 593 | 0.76 / 0.7 / 636 | 0.84 / 0.7 / 651 |

ER = 5%

| Dataset | none | three | machine | machine2 | machine3 |
|---|---|---|---|---|---|
| Hall | 0.61 / 0.92 / 1385 | 0.96 / 0.95 / 1128 | 0.97 / 0.93 / 635 | 0.96 / 0.94 / 725 | 0.99 / 0.93 / 679 |
| Wahono | 0.39 / 0.92 / 1880 | 0.87 / 0.95 / 2913 | 0.94 / 0.94 / 1554 | 0.9 / 0.94 / 1651 | 0.95 / 0.94 / 1510 |
| Danijel | 0.46 / 0.9 / 1060 | 0.9 / 0.94 / 1755 | 0.96 / 0.9 / 983 | 0.92 / 0.92 / 1071 | 0.95 / 0.9 / 976 |
| K_all3 | 0.63 / 0.75 / 430 | 0.94 / 0.8 / 997 | 0.97 / 0.77 / 606 | 0.92 / 0.77 / 588 | 0.97 / 0.77 / 602 |

ER = 2%

| Dataset | none | three | machine | machine2 | machine3 |
|---|---|---|---|---|---|
| Hall | 0.91 / 0.94 / 635 | 1.0 / 0.96 / 982 | 0.99 / 0.94 / 632 | 0.99 / 0.95 / 639 | 1.0 / 0.96 / 687 |
| Wahono | 0.63 / 0.94 / 1595 | 0.98 / 0.95 / 2407 | 0.98 / 0.95 / 1387 | 0.98 / 0.95 / 1445 | 0.98 / 0.95 / 1460 |
| Danijel | 0.75 / 0.92 / 900 | 0.98 / 0.94 / 1552 | 1.0 / 0.94 / 948 | 0.98 / 0.94 / 986 | 1.0 / 0.94 / 952 |
| K_all3 | 0.82 / 0.75 / 450 | 1.0 / 0.84 / 1009 | 1.0 / 0.8 / 573 | 1.0 / 0.81 / 594 | 1.0 / 0.84 / 585 |

ER = 0%

| Dataset | none | three | machine | machine2 | machine3 |
|---|---|---|---|---|---|
| Hall | 1.0 / 0.96 / 490 | 1.0 / 0.96 / 1000 | 1.0 / 0.96 / 671 | 1.0 / 0.96 / 682 | 1.0 / 0.96 / 683 |
| Wahono | 1.0 / 0.95 / 1150 | 1.0 / 0.95 / 2300 | 1.0 / 0.95 / 1409 | 1.0 / 0.95 / 1390 | 1.0 / 0.95 / 1409 |
| Danijel | 1.0 / 0.94 / 755 | 1.0 / 0.94 / 1500 | 1.0 / 0.94 / 915 | 1.0 / 0.94 / 925 | 1.0 / 0.94 / 911 |
| K_all3 | 1.0 / 0.85 / 500 | 1.0 / 0.84 / 980 | 1.0 / 0.84 / 556 | 1.0 / 0.84 / 563 | 1.0 / 0.83 / 557 |
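
The metric cells look like precision / recall / cost. That reading is an assumption, not something stated in this issue: it comes from the summary comment and from the third number repeating the second number of the matching Results cell. Assuming further that the first and fourth numbers of a Results cell are relevant docs found and docs wrongly labeled relevant, the precision value can be roughly reproduced. Not every cell matches exactly (K_all3 under none at ER = 10% is one example), so treat this as a sanity check only. The helper name is hypothetical.

```python
def precision(found, false_pos):
    """Fraction of docs labeled relevant that are truly relevant.
    Returns 1.0 when nothing was labeled relevant (a convention chosen here)."""
    total = found + false_pos
    return found / total if total else 1.0


# Hall at ER = 10%, first Results table: "none" and "machine3" cells.
print(round(precision(95, 276), 2))  # 0.26, same as the metrics table
print(round(precision(98, 9), 2))    # 0.92, same as the metrics table
```
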
azhe825 commented 6 years ago

three: highest recall, good precision, highest cost.

machine3: highest precision, close to the highest recall, nearly the lowest cost.