Helsinki-NLP / OpusFilter

OpusFilter - Parallel corpus processing toolkit
MIT License
Better word alignment filter #52

Open svirpioj opened 2 years ago

svirpioj commented 2 years ago

SimAlign has been reported to sometimes outperform eflomal, and the eflomal-based WordAlignFilter has some caveats. SimAlign has a clean-looking Python implementation at https://github.com/cisnlp/simalign, so integrating it should not be too difficult.
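
A rough sketch of what an alignment-based filter could look like with a pluggable aligner. This is not the existing WordAlignFilter logic: the coverage score, the `alignment_coverage` and `filter_pairs` names, the whitespace tokenization, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# A word alignment as a list of (source index, target index) links.
Alignment = List[Tuple[int, int]]
# Hypothetical aligner interface: takes two token lists, returns the links.
Aligner = Callable[[List[str], List[str]], Alignment]


def alignment_coverage(src_len: int, tgt_len: int, links: Alignment) -> float:
    """Fraction of tokens (both sides pooled) that take part in at least one link."""
    if src_len == 0 or tgt_len == 0:
        return 0.0
    src_covered = {i for i, _ in links}
    tgt_covered = {j for _, j in links}
    return (len(src_covered) + len(tgt_covered)) / (src_len + tgt_len)


def filter_pairs(
    pairs: Iterable[Tuple[str, str]],
    align: Aligner,
    threshold: float = 0.5,  # assumed cutoff, would need tuning on real data
) -> Iterator[Tuple[str, str]]:
    """Yield only the sentence pairs whose alignment coverage reaches the threshold."""
    for src, tgt in pairs:
        # Whitespace tokenization is a simplification for this sketch.
        src_tokens, tgt_tokens = src.split(), tgt.split()
        links = align(src_tokens, tgt_tokens)
        if alignment_coverage(len(src_tokens), len(tgt_tokens), links) >= threshold:
            yield src, tgt
```

If I read the SimAlign README correctly, the aligner could be plugged in roughly as `aligner = SentenceAligner(model="bert", token_type="bpe", matching_methods="mai")` followed by `align = lambda s, t: aligner.get_word_aligns(s, t)["itermax"]`, but that should be checked against the current SimAlign version.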

anmoisio commented 1 year ago

Another option is awesome-align, which has been reported to outperform both SimAlign and eflomal and also offers methods for fine-tuning the aligner.