janetzki / GUIDE

Create semantic domain dictionaries for low-resource languages
MIT License

Try out AWESoME/eflomal word aligner #14

Closed janetzki closed 1 year ago

janetzki commented 1 year ago


Goal

As a software developer, I want to try replacing fast_align with the AWESoME word aligner to see whether this improves the dictionary creator's F1 score. Motivation: a better alignment should reduce false positives (FPs), which in turn should increase the dictionary creator's (DC's) precision.
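The motivation chain (fewer FPs, higher precision, higher F1) can be illustrated with a minimal sketch. It uses the standard definitions of precision, recall, and F1; the counts are made up for illustration and are not results from this project.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical counts: the second call has the same true positives and
# false negatives, but fewer false positives (as a better aligner might yield).
before = precision_recall_f1(tp=50, fp=150, fn=100)
after = precision_recall_f1(tp=50, fp=100, fn=100)

print(f"before: P={before[0]:.2f} R={before[1]:.2f} F1={before[2]:.2f}")
print(f"after:  P={after[0]:.2f} R={after[1]:.2f} F1={after[2]:.2f}")
```

With recall unchanged, removing false positives raises precision and therefore F1, which is the effect a better aligner is hoped to have.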

Tasks

F1* / MRR

| Language pair | fast_align (baseline) | mBERT | fine-tuned by AWESoME | Eflomal |
|---|---|---|---|---|
| eng-eng | 0.25, 0.40 | 0.26, 0.38 | 0.27, 0.38 | 0.23, 0.39 |
| eng-fra | 0.24, 0.37 | 0.27, 0.38 | 0.27, 0.36 | 0.26, 0.41 |
| eng-tpi | n/a, 0.28 | n/a, 0.19 | n/a, 0.20 | n/a, 0.31 |
| eng-meu | n/a, 0.23 | n/a, 0.11 | n/a, 0.11 | n/a, 0.21 |
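
For readers unfamiliar with the second number in each cell, here is a minimal sketch of mean reciprocal rank (MRR) in its standard form. The issue does not say how the project computes MRR or what the asterisk in F1* denotes, so the function name, the data layout, and the example words below are assumptions for illustration only.

```python
def mean_reciprocal_rank(ranked_predictions: dict, gold: dict) -> float:
    """Standard MRR.

    ranked_predictions maps each query to a list of candidates, best first.
    gold maps each query to the set of correct answers.
    A query with no correct candidate contributes 0.
    """
    if not ranked_predictions:
        return 0.0
    total = 0.0
    for query, candidates in ranked_predictions.items():
        correct = gold.get(query, set())
        for rank, candidate in enumerate(candidates, start=1):
            if candidate in correct:
                total += 1.0 / rank  # only the first correct hit counts
                break
    return total / len(ranked_predictions)


# Hypothetical toy data, not taken from the project's evaluation set.
predictions = {
    "water": ["wara", "ren"],   # correct at rank 1 -> 1.0
    "house": ["rum", "haus"],   # correct at rank 2 -> 0.5
    "tree": ["ston"],           # no correct candidate -> 0.0
}
gold = {"water": {"wara"}, "house": {"haus"}, "tree": {"diwai"}}

mrr = mean_reciprocal_rank(predictions, gold)
print(f"MRR = {mrr:.2f}")
```

Because MRR only needs a ranked candidate list per query, it can be reported without a fixed decision threshold, which is a property of the metric itself rather than a statement about how this project evaluates.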