Unbabel / OpenKiwi

Open-Source Machine Translation Quality Estimation in PyTorch
https://unbabel.github.io/OpenKiwi/
GNU Affero General Public License v3.0


Quality estimation (QE) is one of the missing pieces of machine translation: its goal is to evaluate a translation system’s quality without access to reference translations. We present OpenKiwi, a PyTorch-based open-source framework that implements the best QE systems from the WMT 2015-18 shared tasks, making it easy to experiment with these models under the same framework. Using OpenKiwi and a stacked combination of these models, we have achieved state-of-the-art results on word-level QE on the WMT 2018 English-German dataset.

News

Features

Quick Installation

To install OpenKiwi as a package, simply run

pip install openkiwi

You can now

import kiwi

inside your project or run in the command line

kiwi

Optionally, if you'd like to take advantage of our MLflow integration, simply install it in the same virtualenv as OpenKiwi:

pip install openkiwi[mlflow]
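
To confirm that the installation worked, a quick check along the following lines may help. This is only a minimal sketch using the Python standard library: it assumes the distribution is named `openkiwi` and the importable module is `kiwi`, as in the commands above, and that the MLflow integration relies on an importable `mlflow` module. The helper names (`installed_version`, `module_available`, `describe_environment`) are hypothetical and not part of OpenKiwi.

```python
import importlib.metadata
import importlib.util
from typing import Dict, Optional, Union


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return importlib.metadata.version(distribution)
    except importlib.metadata.PackageNotFoundError:
        return None


def module_available(module: str) -> bool:
    """Return True if a top-level module can be located (without importing it)."""
    return importlib.util.find_spec(module) is not None


def describe_environment() -> Dict[str, Union[str, bool, None]]:
    """Collect a small report about OpenKiwi and the optional MLflow dependency."""
    return {
        # Assumption: the PyPI distribution name is "openkiwi".
        "openkiwi_version": installed_version("openkiwi"),
        # Assumption: the package is imported as "kiwi".
        "kiwi_importable": module_available("kiwi"),
        # Assumption: the MLflow extra makes "mlflow" importable.
        "mlflow_importable": module_available("mlflow"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `kiwi_importable` is reported as `False`, the package was most likely installed into a different virtualenv than the one running the script.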

Getting Started

Detailed usage examples and instructions can be found in the Full Documentation.

Contributing

We welcome contributions to improve OpenKiwi. Please refer to CONTRIBUTING.md for quick instructions, or to the contributing instructions for more detail on how to set up your development environment.

License

OpenKiwi is Affero GPL licensed. You can see the details of this license in LICENSE.

Citation

If you use OpenKiwi, please cite the following paper: OpenKiwi: An Open Source Framework for Quality Estimation.

@inproceedings{openkiwi,
    author = {Fábio Kepler and
              Jonay Trénous and
              Marcos Treviso and
              Miguel Vera and
              André F. T. Martins},
    title  = {Open{K}iwi: An Open Source Framework for Quality Estimation},
    year   = {2019},
    booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics--System Demonstrations},
    pages  = {117--122},
    month  = {July},
    address = {Florence, Italy},
    url    = {https://www.aclweb.org/anthology/P19-3020},
    organization = {Association for Computational Linguistics},
}

References

[1] Kreutzer et al. (2015): QUality Estimation from ScraTCH (QUETCH): Deep Learning for Word-level Translation Quality Estimation
[2] Martins et al. (2016): Unbabel's Participation in the WMT16 Word-Level Translation Quality Estimation Shared Task
[3] Martins et al. (2017): Pushing the Limits of Translation Quality Estimation
[4] Kim et al. (2017): Predictor-Estimator using Multilevel Task Learning with Stack Propagation for Neural Quality Estimation
[5] Wang et al. (2018): Alibaba Submission for WMT18 Quality Estimation Task
[6] Kepler et al. (2019): Unbabel’s Participation in the WMT19 Translation Quality Estimation Shared Task