Garrafao / durel_system_annotators

Run final integration test on previous data to validate final re-factoring #30

Closed Garrafao closed 7 months ago

Garrafao commented 7 months ago

Comparison scores from annotator runs before re-factoring:

[Screenshots: Capture_1, Capture_2]

shafqatvirk commented 7 months ago

Latest scores after re-factoring:

| Dataset | Accuracy | Correlation | p-value |
| --- | --- | --- | --- |
| DWUG_DE | 0.778 | 0.516 | 0.0 |
| DWUG_EN | 0.756 | 0.499 | 0.0 |
| DWUG_SV | 0.764 | 0.447 | 0.0 |
| TempoWic_Train | 0.513 | -0.125 | 2.1678358774887747e-06 |
| TempoWic_Trial | 0.5 | -0.272 | 0.2456957956063111 |
| TempoWic_Validation | 0.51 | -0.118 | 0.018454072074192394 |
| Wic_Dev | 0.887 | 0.795 | 4.430455446790286e-140 |
| Wic_Test | 0.662 | 0.376 | 4.029358404835062e-48 |
| Wic_Train | 0.889 | 0.798 | 0.0 |
| testgug_en | 0.898 | 0.794 | 4.253365504873996e-214 |
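The thread does not show how the accuracy and correlation columns are computed. As a rough illustration only, here is a minimal sketch that assumes accuracy is the share of exact label matches and that the correlation is Spearman's rank correlation; both are assumptions, and the function names are hypothetical rather than taken from the repository.

```python
from statistics import mean


def accuracy(gold, predicted):
    """Fraction of items whose predicted label equals the gold label."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5
```

The sketch leaves out the p-value. In practice it would usually come from a statistics library such as `scipy.stats.spearmanr`, which returns the coefficient together with a p-value. A printed p-value of 0.0 is most likely floating-point underflow of a very small value, not an exact zero.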

| Annotator | Data | Accuracy | Correlation | p-value |
| --- | --- | --- | --- | --- |
| XL-Lexeme-Multi-Threshold | dwug_de_median | NA | 0.601 | 0.0 |
| XL-Lexeme-Multi-Threshold | dwug_en_median | NA | 0.583 | 0.0 |
| XL-Lexeme-Multi-Threshold | dwug_sv_median | NA | 0.564 | 0.0 |
| XL-Lexeme-Multi-Threshold | tempowic_train | NA | -0.246 | 3.572459678427643e-21 |
| XL-Lexeme-Multi-Threshold | tempowic_trial | NA | -0.332 | 0.1521878698806116 |
| XL-Lexeme-Multi-Threshold | tempowic_validation | NA | -0.188 | 0.00017235823158717802 |
| XL-Lexeme-Multi-Threshold | testwug_en_transformed_median | NA | 0.825 | 6.220291975603303e-245 |
| XL-Lexeme-Multi-Threshold | wic_dev | NA | 0.909 | 3.082802754404488e-243 |
| XL-Lexeme-Multi-Threshold | wic_test | NA | 0.414 | 5.630338320254875e-59 |
| XL-Lexeme-Multi-Threshold | wic_train | NA | 0.915 | 0.0 |
| XL-Lexeme-Cosine | dwug_de | NA | 0.61 | 0.0 |
| XL-Lexeme-Cosine | dwug_de_median | NA | 0.61 | 0.0 |
| XL-Lexeme-Cosine | dwug_en | NA | 0.598 | 0.0 |
| XL-Lexeme-Cosine | dwug_en_median | NA | 0.598 | 0.0 |
| XL-Lexeme-Cosine | dwug_sv | NA | 0.573 | 0.0 |
| XL-Lexeme-Cosine | dwug_sv_median | NA | 0.573 | 0.0 |
| XL-Lexeme-Cosine | tempowic_train | NA | -0.31 | 4.32970849159845e-33 |
| XL-Lexeme-Cosine | tempowic_trial | NA | -0.389 | 0.08968881889529948 |
| XL-Lexeme-Cosine | tempowic_validation | NA | -0.205 | 4.092545817880166e-05 |
| XL-Lexeme-Cosine | testwug_en_transformed_median | NA | 0.814 | 3.323558053916412e-233 |
| XL-Lexeme-Cosine | testwug_en_transformed_binarize-median | NA | 0.802 | 1.1355171676745905e-221 |
| XL-Lexeme-Cosine | wic_dev | NA | 0.847 | 6.3474350881542155e-177 |
| XL-Lexeme-Cosine | wic_test | NA | 0.493 | 1.6089625897990044e-86 |
| XL-Lexeme-Cosine | wic_train | NA | 0.857 | 0.0 |
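Since the point of this issue is to check that the re-factoring did not change the scores, the before/after comparison could be automated instead of read off screenshots. The following is a minimal sketch under stated assumptions: `compare_runs`, the dictionary layout, and the default tolerance are hypothetical and not part of the repository. The two sample rows are copied from the table above, with `None` standing in for NA. The baseline values would still have to be transcribed from the earlier run.

```python
import math


def compare_runs(baseline, refactored, tolerance=0.005):
    """List every (key, metric, before, after) that differs by more than tolerance.

    Both arguments map an (annotator, dataset) key to a dict of metric values;
    a metric reported as NA is stored as None.
    """
    mismatches = []
    for key in sorted(set(baseline) | set(refactored)):
        before = baseline.get(key)
        after = refactored.get(key)
        if before is None or after is None:
            # The row exists in only one of the two runs.
            mismatches.append((key, "missing", before, after))
            continue
        for metric in sorted(set(before) | set(after)):
            b, a = before.get(metric), after.get(metric)
            if b is None and a is None:
                continue  # NA in both runs counts as unchanged
            if b is None or a is None or not math.isclose(
                a, b, rel_tol=0.0, abs_tol=tolerance
            ):
                mismatches.append((key, metric, b, a))
    return mismatches


# Two rows taken from the post-re-factoring table in this thread.
refactored = {
    ("XL-Lexeme-Cosine", "wic_dev"): {"accuracy": None, "correlation": 0.847},
    ("XL-Lexeme-Cosine", "wic_test"): {"accuracy": None, "correlation": 0.493},
}

# Sanity check: a run compared with itself reports no differences.
assert compare_runs(refactored, refactored) == []
```

An absolute tolerance is used here because the scores are rounded to three decimals; a suitable threshold depends on how much run-to-run variation the annotators are expected to show.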