Automated aligned translation candidates

sillsdev / silnlp

A set of pipelines for performing experiments on various NLP tasks with a focus on resource-poor/minority languages.


You should check out the silnlp.alignment.visualize_similarity script. It computes alignment scores for every pair of projects in a country or language family, then generates either a hierarchical tree (dendrogram) or a network graph from those scores. It can also combine the scores by language, so that you can visualize the relationships between languages rather than between individual projects. It is intended to work on the biblical-humanities-corpus, a private repo that contains thousands of Bible translations. We could certainly extend it to support other clustering algorithms. Here is an example of the output: india-language-tree
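To make the idea concrete, here is a minimal sketch of the two steps described above: combining project-pair scores by language, then clustering the languages hierarchically. This is not the code in silnlp.alignment.visualize_similarity. The function names, the language and project labels, and every score value are made up for illustration, and average linkage is just one plausible choice of clustering rule.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical alignment scores in [0, 1]; higher means more similar.
# Each project is a (language, project_id) pair. All values are invented.
project_scores = {
    (("hin", "P1"), ("urd", "P2")): 0.80,
    (("hin", "P1"), ("urd", "P3")): 0.70,
    (("hin", "P1"), ("tam", "P4")): 0.20,
    (("urd", "P2"), ("tam", "P4")): 0.30,
    (("urd", "P3"), ("tam", "P4")): 0.10,
    (("urd", "P2"), ("urd", "P3")): 0.90,  # same language, ignored below
}


def combine_by_language(scores):
    """Average project-pair scores into one score per unordered language pair."""
    buckets = defaultdict(list)
    for ((lang_a, _), (lang_b, _)), score in scores.items():
        if lang_a != lang_b:
            buckets[frozenset((lang_a, lang_b))].append(score)
    return {pair: sum(vals) / len(vals) for pair, vals in buckets.items()}


def average_linkage(lang_scores):
    """Repeatedly merge the two most similar clusters (average linkage).

    Returns a list of (cluster_a, cluster_b, similarity) merge steps,
    which is the information a dendrogram is drawn from.
    """
    languages = sorted({lang for pair in lang_scores for lang in pair})
    clusters = [(lang,) for lang in languages]
    merges = []

    def similarity(a, b):
        # Mean score over all cross-cluster language pairs that have a score.
        vals = [
            lang_scores[frozenset((x, y))]
            for x in a
            for y in b
            if frozenset((x, y)) in lang_scores
        ]
        return sum(vals) / len(vals) if vals else 0.0

    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2), key=lambda p: similarity(*p))
        merges.append((a, b, similarity(a, b)))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return merges


lang_scores = combine_by_language(project_scores)
merges = average_linkage(lang_scores)
for a, b, sim in merges:
    print(f"merge {a} + {b} at similarity {sim:.2f}")
```

With this toy data the two most similar languages are merged first and the remaining one joins last. Swapping in a different clustering algorithm would mostly mean replacing the similarity function or the merge loop.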

Automated aligned translation candidates #402