PolinaZulik opened 2 years ago
Code additions needed:

`model = 'xlm-roberta-base'`

Metaphoric class:

| Train Corpus | precision | recall | f1-score |
|---|---|---|---|
| wiktionary | 0.14 | 0.44 | 0.21 |
| yulia | 0.17 | 0.56 | 0.26 |
| lcc | 0.35 | 0.25 | 0.30 |
Very poor performance; I'll try combining the train/dev datasets.
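For reference, a minimal sketch of this setup (the actual training code is linked in #9 below), assuming each corpus is a CSV with hypothetical `text` and `label` columns (1 = metaphoric) and treating each context as a binary sequence-classification example:

```python
import pandas as pd
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = 'xlm-roberta-base'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name,
                                                           num_labels=2)

# Combine the train corpora into a single training set (file names are
# hypothetical placeholders for the actual corpus files).
train_df = pd.concat([pd.read_csv('lcc.csv'), pd.read_csv('yulia.csv')],
                     ignore_index=True)  # columns: text, label (1 = metaphoric)
train_ds = Dataset.from_pandas(train_df)

def tokenize(batch):
    return tokenizer(batch['text'], truncation=True, max_length=128)

train_ds = train_ds.map(tokenize, batched=True)

args = TrainingArguments(output_dir='metaphor-model',
                         num_train_epochs=3,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args, train_dataset=train_ds,
        tokenizer=tokenizer).train()
```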
Results for training on LCC are here.
Metaphoric class:

| Model | Train Corpus | precision | recall | f1-score |
|---|---|---|---|---|
| xlm-roberta-base | lcc+yulia | 0.34 | 0.34 | 0.34 |
| xlm-roberta-base | lcc+wiktionary | 0.35 | 0.26 | 0.30 |
| xlm-roberta-base | yulia+wiktionary | 0.17 | 0.48 | 0.25 |
| xlm-roberta-base | lcc+yulia+wiktionary | 0.34 | 0.28 | 0.31 |
| DeepPavlov/rubert-base-cased-conversational | lcc+yulia | 0.33 | 0.40 | 0.36 |
| DeepPavlov/distilrubert-base-cased-conversational | lcc+yulia | 0.32 | 0.37 | 0.34 |
The lcc+yulia-trained results for xlm-roberta-base are here.
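The metaphoric-class scores in these tables can be computed as in this minimal sketch, assuming binary gold/predicted labels with 1 = metaphoric (the label arrays here are hypothetical):

```python
from sklearn.metrics import precision_recall_fscore_support

y_true = [0, 1, 0, 1, 1, 0]  # hypothetical gold labels, 1 = metaphoric
y_pred = [0, 1, 1, 0, 1, 0]  # hypothetical model predictions

# Score only the positive (metaphoric) class, as in the tables above.
p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred,
                                              average='binary', pos_label=1)
print(f'precision={p:.2f} recall={r:.2f} f1-score={f1:.2f}')
```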
I think I'll annotate the corpus myself, and probably only take the contexts agreed on by all 3 annotators.
I annotated the 101 files myself. Results on the 2,094 wordforms agreed on by all 3 annotators:
Metaphoric class:

| Model | Train Corpus | precision | recall | f1-score |
|---|---|---|---|---|
| xlm-roberta-base | lcc+yulia | 0.44 | 0.37 | 0.40 |
| DeepPavlov/rubert-base-cased-conversational | lcc+yulia | 0.38 | 0.45 | 0.41 |
I'll try the Vika+Polina agreed annotation only, because their agreement is much higher.
Vika+Polina agreement: 2,342 contexts, 466 metaphoric (19.90%), 1,876 literal (80.10%).
Metaphoric class:

| Model | Train Corpus | precision | recall | f1-score |
|---|---|---|---|---|
| xlm-roberta-base | lcc+yulia | 0.52 | 0.31 | 0.39 |
| DeepPavlov/rubert-base-cased-conversational | lcc+yulia | 0.51 | 0.37 | 0.43 |
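A sketch of how the Vika+Polina agreed subset above and an agreement score (Cohen's kappa, as one possible chance-corrected measure) could be computed; the file and column names are hypothetical:

```python
import pandas as pd
from sklearn.metrics import cohen_kappa_score

# Hypothetical file with one 0/1 label per context from each annotator.
ann = pd.read_csv('annotations.csv')  # columns: vika, polina

# Keep only the contexts both annotators labelled identically.
agreed = ann[ann['vika'] == ann['polina']]
n_met = int(agreed['vika'].sum())
print(f'{len(agreed)} agreed contexts, {n_met} metaphoric '
      f'({100 * n_met / len(agreed):.2f}%)')

# Chance-corrected agreement as a sanity check.
print('kappa:', cohen_kappa_score(ann['vika'], ann['polina']))
```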
Majority: 2,643 contexts, 507 metaphoric (19.18%), 2,136 literal (80.82%).
Metaphoric class:

| Model | Train Corpus | precision | recall | f1-score |
|---|---|---|---|---|
| xlm-roberta-base | lcc+yulia | 0.45 | 0.30 | 0.36 |
| DeepPavlov/rubert-base-cased-conversational | lcc+yulia | 0.51 | 0.37 | 0.43 |
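The majority labels above could be derived as in this sketch, again with hypothetical 0/1 columns per annotator:

```python
import pandas as pd

ann = pd.read_csv('annotations.csv')  # hypothetical 0/1 columns per annotator

# A context counts as metaphoric if at least 2 of the 3 annotators marked it.
votes = ann[['elena', 'vika', 'polina']].sum(axis=1)
ann['majority'] = (votes >= 2).astype(int)
print(ann['majority'].value_counts())
```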
ToDo:
Metaphoric class:

| Model | Annotator | Metaphor N, % | Train Corpus | precision | recall | f1-score |
|---|---|---|---|---|---|---|
| DeepPavlov/rubert-base-cased-conversational | Elena | 317, 11.99 | lcc+yulia | 0.29 | 0.41 | 0.34 |
| DeepPavlov/rubert-base-cased-conversational | Victoria | 611, 23.12 | lcc+yulia | 0.43 | 0.32 | 0.37 |
| DeepPavlov/rubert-base-cased-conversational | Polina | 622, 23.53 | lcc+yulia | 0.56 | 0.42 | 0.48 |
| DeepPavlov/rubert-base-cased-conversational (properly cased) | Polina | 622, 23.53 | lcc+yulia | 0.55 | 0.40 | 0.46 |
The model identifies 459 metaphors (17.37% of the data). The last line ('properly cased') is used for correlation analysis.
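The correlation analysis itself isn't shown here; one plausible variant is the phi coefficient (Matthews correlation) between binary model predictions and gold labels, sketched with hypothetical arrays:

```python
from sklearn.metrics import matthews_corrcoef

y_true = [0, 1, 0, 1, 1, 0]  # hypothetical gold labels
y_pred = [0, 1, 1, 1, 0, 0]  # hypothetical model predictions

# Phi coefficient between two binary variables (== Pearson r on 0/1 data).
print('phi:', matthews_corrcoef(y_true, y_pred))
```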
Train data:
Test data: only those rows where Elena and Viktoria agree.
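A sketch of that test-set filter, assuming hypothetical `elena` and `viktoria` 0/1 columns:

```python
import pandas as pd

test = pd.read_csv('test.csv')  # hypothetical columns: text, elena, viktoria

# Keep only rows where the two annotators assigned the same label.
agreed = test[test['elena'] == test['viktoria']].copy()
agreed['label'] = agreed['elena']
agreed.to_csv('test_agreed.csv', index=False)
```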
Code: #9.