Lingtrain Aligner is a tool for extracting parallel corpora from texts in different languages.
The automated alignment process relies on sentence embedding models. Embeddings are multidimensional vectors that are used to calculate the distance between sentences. You can also plug in your own model using the interface described in the models directory. The list of supported languages depends on the selected backend model.
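To illustrate how a distance between sentence embeddings can drive alignment, here is a minimal sketch in plain Python. It is not the aligner's actual implementation: the function names `cosine_similarity` and `best_matches` are hypothetical, and the three-dimensional vectors are made up, whereas a real model produces vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def best_matches(source_vecs, target_vecs):
    """For each source vector, return the index of the most similar target vector."""
    return [
        max(range(len(target_vecs)),
            key=lambda j: cosine_similarity(src, target_vecs[j]))
        for src in source_vecs
    ]


# Toy "embeddings" (invented for illustration, not produced by any model).
source = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
target = [[0.0, 0.9, 1.1], [0.9, 0.1, 0.0]]
matches = best_matches(source, target)
print(matches)
```

Here each source sentence is paired with the target sentence whose vector points in the most similar direction. A greedy match like this ignores sentence order, so treat it only as a picture of the distance calculation.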
The project was supported by the Center for Academic Development of Students within the framework of the Competition of initiative collective research projects of students of the National Research University "Higher School of Economics".
For a quick overview of the alignment process and the main functionality, you can watch the demo that was given at the AINL Conference.
The alignment process is pretty straightforward. Once you have the app up and running, follow the instructions to start the process. To start the app locally, see the Running from Docker Hub section.
You can run the application on your computer using docker.
Make sure that docker is installed by typing the docker version command in your console.
Images configured to run locally are available on Docker Hub.
Run the following commands in your console:
docker pull lingtrain/aligner:st
docker run -p 80:80 lingtrain/aligner:st
The app will be available in your browser at the localhost address.
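If you want to confirm from a script that the container is serving requests, a small check like the one below may help. It is only a sketch: the helper name `app_is_up` is hypothetical, and the default URL assumes the `-p 80:80` port mapping from the command above.

```python
import urllib.error
import urllib.request


def app_is_up(url="http://localhost:80", timeout=3.0):
    """Return True if `url` answers with a success status, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("App is up" if app_is_up() else "App is not reachable yet")
```

The container may need some time after `docker run` before it starts answering, so a `False` result right after startup does not necessarily mean that something is broken.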
You can deploy and run the app on your server using docker.
On your local machine.
export const API_URL = "http://[IP_ADRESS]:[PORT]";
For example: export const API_URL = "http://89.23.34.12:5000";
docker build . -t aligner:v1
docker login
docker tag aligner:v1 my_docker_hub_account/aligner:v1
docker push my_docker_hub_account/aligner:v1
On your server.
Make sure that docker is installed by typing the docker version command in your console.
mkdir /opt/data /opt/img
docker pull my_docker_hub_account/aligner:v1
docker run -v /opt/data:/app/data -v /opt/img:/app/static/img -p [PORT]:80 my_docker_hub_account/aligner:v1
Flask/uwsgi backend REST API service. It's pretty simple and contains all the alignment logic.
python main.py
Single-page application (SPA) built with Vue, Vuex, and Vuetify. It provides a UI for managing the alignment process through the backend and a tool for translators to edit the documents being processed.
npm install
npm run serve
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.