problem with Georgian data in OpenSubtitles

https://opus.nlpl.eu/OpenSubtitles/en&ka/v2018/OpenSubtitles

Almost every data point is damaged: the Georgian side is nonsense. When I searched for these lines on the OpenSubtitles site, I found that they are just Russian characters mapped onto the Georgian alphabet. Many multilingual models today are poisoned by this data. It would be great to investigate this further.
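As a possible starting point for that investigation, here is a minimal Python sketch. It assumes someone first works out a candidate Georgian-to-Cyrillic character table; the three pairs in `CANDIDATE_TABLE` are placeholders for illustration only, not the actual mapping found in the corpus. The helper names `script_share` and `remap` are made up for this example. The idea is to remap a suspicious "Georgian" line and check by eye, or with a language identifier, whether the result reads as Russian.

```python
# Unicode block ranges (inclusive).
GEORGIAN = (0x10D0, 0x10FF)  # Mkhedruli letters
CYRILLIC = (0x0400, 0x04FF)

# Placeholder pairs for illustration only. This is NOT the real mapping
# used in the corpus; it has to be worked out from actual examples.
CANDIDATE_TABLE = {"ა": "а", "ბ": "б", "გ": "г"}


def script_share(text, block):
    """Return the fraction of alphabetic characters that fall inside `block`."""
    lo, hi = block
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(lo <= ord(c) <= hi for c in letters) / len(letters)


def remap(text, table):
    """Replace each character found in `table`; leave all others unchanged."""
    return text.translate(str.maketrans(table))


if __name__ == "__main__":
    line = "აბგ"  # toy input, not a line from the corpus
    candidate = remap(line, CANDIDATE_TABLE)
    print(candidate, script_share(candidate, CYRILLIC))
```

A line whose Cyrillic share is high after remapping and which reads as plausible Russian would support the mapping hypothesis. Genuine Georgian lines should turn into gibberish under the same table.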

Helsinki-NLP / OPUS

problem with Georgian data in OpenSubtitles #16