PolinaZulik opened 2 years ago
Code additions needed:

`model = 'xlm-roberta-base'`

Metaphoric class:

| Train Corpus | precision | recall | f1-score |
|---|---|---|---|
| wiktionary | 0.14 | 0.44 | 0.21 |
| yulia | 0.17 | 0.56 | 0.26 |
| lcc | 0.35 | 0.25 | 0.30 |
Very poor performance; I'll try combining the train/dev datasets.
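For reference, a minimal sketch of this setup (the actual training code is linked in #9 below), assuming each corpus is a CSV with hypothetical `text` and `label` columns (1 = metaphoric) and treating each context as a binary sequence-classification example:

```python
import pandas as pd
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = 'xlm-roberta-base'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name,
                                                           num_labels=2)

# Combine the train corpora into a single training set (file names are
# hypothetical placeholders for the actual corpus files).
train_df = pd.concat([pd.read_csv('lcc.csv'), pd.read_csv('yulia.csv')],
                     ignore_index=True)  # columns: text, label (1 = metaphoric)
train_ds = Dataset.from_pandas(train_df)

def tokenize(batch):
    return tokenizer(batch['text'], truncation=True, max_length=128)

train_ds = train_ds.map(tokenize, batched=True)

args = TrainingArguments(output_dir='metaphor-model',
                         num_train_epochs=3,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args, train_dataset=train_ds,
        tokenizer=tokenizer).train()
```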
Results for training on LCC are here.
Metaphoric class:

| Model | Train Corpus | precision | recall | f1-score |
|---|---|---|---|---|
| xlm-roberta-base | lcc+yulia | 0.34 | 0.34 | 0.34 |
| xlm-roberta-base | lcc+wiktionary | 0.35 | 0.26 | 0.30 |
| xlm-roberta-base | yulia+wiktionary | 0.17 | 0.48 | 0.25 |
| xlm-roberta-base | lcc+yulia+wiktionary | 0.34 | 0.28 | 0.31 |
| DeepPavlov/rubert-base-cased-conversational | lcc+yulia | 0.33 | 0.40 | 0.36 |
| DeepPavlov/distilrubert-base-cased-conversational | lcc+yulia | 0.32 | 0.37 | 0.34 |
The lcc+yulia-trained results for xlm-roberta-base are here.
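The metaphoric-class scores in these tables can be computed as in this minimal sketch, assuming binary gold/predicted labels with 1 = metaphoric (the label arrays here are hypothetical):

```python
from sklearn.metrics import precision_recall_fscore_support

y_true = [0, 1, 0, 1, 1, 0]  # hypothetical gold labels, 1 = metaphoric
y_pred = [0, 1, 1, 0, 1, 0]  # hypothetical model predictions

# Score only the positive (metaphoric) class, as in the tables above.
p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred,
                                              average='binary', pos_label=1)
print(f'precision={p:.2f} recall={r:.2f} f1-score={f1:.2f}')
```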
I think I'll annotate the corpus myself, and probably only take the contexts agreed on by all 3 annotators.
I annotated the 101 files myself. Results on the 2,094 wordforms agreed on by all 3 annotators:
Metaphoric class:

| Model | Train Corpus | precision | recall | f1-score |
|---|---|---|---|---|
| xlm-roberta-base | lcc+yulia | 0.44 | 0.37 | 0.40 |
| DeepPavlov/rubert-base-cased-conversational | lcc+yulia | 0.38 | 0.45 | 0.41 |
I'll try the Vika+Polina agreed annotation only, because their agreement is much higher.
Vika+Polina agreement: 2,342 contexts, 466 metaphoric (19.90%), 1,876 literal (80.10%).
Metaphoric class:

| Model | Train Corpus | precision | recall | f1-score |
|---|---|---|---|---|
| xlm-roberta-base | lcc+yulia | 0.52 | 0.31 | 0.39 |
| DeepPavlov/rubert-base-cased-conversational | lcc+yulia | 0.51 | 0.37 | 0.43 |
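A sketch of how the Vika+Polina agreed subset above and an agreement score (Cohen's kappa, as one possible chance-corrected measure) could be computed; the file and column names are hypothetical:

```python
import pandas as pd
from sklearn.metrics import cohen_kappa_score

# Hypothetical file with one 0/1 label per context from each annotator.
ann = pd.read_csv('annotations.csv')  # columns: vika, polina

# Keep only the contexts both annotators labelled identically.
agreed = ann[ann['vika'] == ann['polina']]
n_met = int(agreed['vika'].sum())
print(f'{len(agreed)} agreed contexts, {n_met} metaphoric '
      f'({100 * n_met / len(agreed):.2f}%)')

# Chance-corrected agreement as a sanity check.
print('kappa:', cohen_kappa_score(ann['vika'], ann['polina']))
```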
Majority: 2,643 contexts, 507 metaphoric (19.18%), 2,136 literal (80.82%).
Metaphoric class:

| Model | Train Corpus | precision | recall | f1-score |
|---|---|---|---|---|
| xlm-roberta-base | lcc+yulia | 0.45 | 0.30 | 0.36 |
| DeepPavlov/rubert-base-cased-conversational | lcc+yulia | 0.51 | 0.37 | 0.43 |
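The majority labels above could be derived as in this sketch, again with hypothetical 0/1 columns per annotator:

```python
import pandas as pd

ann = pd.read_csv('annotations.csv')  # hypothetical 0/1 columns per annotator

# A context counts as metaphoric if at least 2 of the 3 annotators marked it.
votes = ann[['elena', 'vika', 'polina']].sum(axis=1)
ann['majority'] = (votes >= 2).astype(int)
print(ann['majority'].value_counts())
```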
ToDo:
Metaphoric class:

| Model | Annotator | Metaphor N, % | Train Corpus | precision | recall | f1-score |
|---|---|---|---|---|---|---|
| DeepPavlov/rubert-base-cased-conversational | Elena | 317, 11.99 | lcc+yulia | 0.29 | 0.41 | 0.34 |
| DeepPavlov/rubert-base-cased-conversational | Victoria | 611, 23.12 | lcc+yulia | 0.43 | 0.32 | 0.37 |
| DeepPavlov/rubert-base-cased-conversational | Polina | 622, 23.53 | lcc+yulia | 0.56 | 0.42 | 0.48 |
| DeepPavlov/rubert-base-cased-conversational (properly cased) | Polina | 622, 23.53 | lcc+yulia | 0.55 | 0.40 | 0.46 |
The model identifies 459 metaphors (17.37% of the data). The last line ('properly cased') is used for correlation analysis.
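The correlation analysis itself isn't shown here; one plausible variant is the phi coefficient (Matthews correlation) between binary model predictions and gold labels, sketched with hypothetical arrays:

```python
from sklearn.metrics import matthews_corrcoef

y_true = [0, 1, 0, 1, 1, 0]  # hypothetical gold labels
y_pred = [0, 1, 1, 1, 0, 0]  # hypothetical model predictions

# Phi coefficient between two binary variables (== Pearson r on 0/1 data).
print('phi:', matthews_corrcoef(y_true, y_pred))
```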
Train data:
Test data: only those rows where Elena and Viktoria agree.
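A sketch of that test-set filter, assuming hypothetical `elena` and `viktoria` 0/1 columns:

```python
import pandas as pd

test = pd.read_csv('test.csv')  # hypothetical columns: text, elena, viktoria

# Keep only rows where the two annotators assigned the same label.
agreed = test[test['elena'] == test['viktoria']].copy()
agreed['label'] = agreed['elena']
agreed.to_csv('test_agreed.csv', index=False)
```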
Code: #9.