It's not very clear from the code, but the Uwazi context and the OCR context use different language code standards:
Uwazi = ISO639_3
OCR = ISO639_1
We were using getLanguage, which I believe was meant to convert Uwazi language codes to Elasticsearch language codes. Because Ukrainian is not supported in Elasticsearch, it was returning null and the OCR service was not being executed.
I've created LanguageCodeMapper, which can convert between language codes.
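To illustrate the idea, here is a minimal sketch of what such a mapper could look like. It is not the actual LanguageCodeMapper from this PR: the function name `convertLanguageCode`, the `CodeStandard` type, and the short language table are assumptions made for the example.

```typescript
// Hypothetical sketch: names and table contents are illustrative assumptions,
// not the implementation shipped in this PR.
type CodeStandard = 'ISO639_1' | 'ISO639_3';

interface LanguageCodes {
  ISO639_1: string;
  ISO639_3: string;
}

// Small excerpt of ISO 639 pairs; a real table would cover every supported language.
const LANGUAGES: LanguageCodes[] = [
  { ISO639_1: 'en', ISO639_3: 'eng' },
  { ISO639_1: 'es', ISO639_3: 'spa' },
  { ISO639_1: 'fr', ISO639_3: 'fra' },
  { ISO639_1: 'uk', ISO639_3: 'ukr' },
];

// Converts a code from one standard to the other.
// Returns null only when the code is unknown in the source standard.
function convertLanguageCode(
  code: string,
  from: CodeStandard,
  to: CodeStandard
): string | null {
  const match = LANGUAGES.find(language => language[from] === code.toLowerCase());
  return match ? match[to] : null;
}

// Usage: Uwazi (ISO639_3) to OCR (ISO639_1)
console.log(convertLanguageCode('ukr', 'ISO639_3', 'ISO639_1')); // 'uk'
```

The point of the sketch is that the conversion no longer depends on which languages Elasticsearch supports, so Ukrainian resolves to a valid OCR code.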
A better solution would be to make the OCR models and properties more explicit in the code, so that there is a well-defined boundary that tells what belongs to Uwazi and what belongs to OCR.
With those well-defined models we can create an Anti-Corruption Layer, and whenever we need to use some feature from OCR we go through this Anti-Corruption Layer to get the OCR models.
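As a rough sketch of that proposal (not something implemented in this PR), the boundary could look like the following. All names here (`UwaziFile`, `OcrRequest`, `OcrAntiCorruptionLayer`, `toOcrRequest`) are hypothetical, and the language table is a small excerpt used only for illustration.

```typescript
// Hypothetical sketch of the proposed boundary; none of these names exist in Uwazi.

// Uwazi-side model: language is an ISO 639-3 code.
interface UwaziFile {
  filename: string;
  language: string;
}

// OCR-side model: language is an ISO 639-1 code.
interface OcrRequest {
  filename: string;
  language: string;
}

// Excerpt of ISO 639-3 to ISO 639-1 pairs, for illustration only.
const UWAZI_TO_OCR_LANGUAGE = new Map<string, string>([
  ['eng', 'en'],
  ['spa', 'es'],
  ['ukr', 'uk'],
]);

// The only place that knows how Uwazi models translate into OCR models.
class OcrAntiCorruptionLayer {
  toOcrRequest(file: UwaziFile): OcrRequest {
    const language = UWAZI_TO_OCR_LANGUAGE.get(file.language);
    if (language === undefined) {
      // Fail loudly instead of returning null and silently skipping OCR.
      throw new Error(`Language not supported by OCR: ${file.language}`);
    }
    return { filename: file.filename, language };
  }
}

// Usage
const layer = new OcrAntiCorruptionLayer();
const request = layer.toOcrRequest({ filename: 'report.pdf', language: 'ukr' });
console.log(request.language); // 'uk'
```

With a layer like this, the rest of Uwazi never handles OCR language codes directly, and an unsupported language surfaces as an explicit error rather than a skipped OCR run.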
fixes https://github.com/huridocs/uwazi/issues/7439