OpenPecha / Toolkit

🛠 Tools to create, edit and export texts and annotations
https://toolkit.openpecha.org
Apache License 2.0

Fix hocr parser #190

Closed: ta4tsering closed this 1 year ago

ta4tsering commented 1 year ago
eroux commented 1 year ago

@10zinten @kaldan007 the pull request is ready. Can you review it? This is a major code rewrite for the OCR import.

kaldan007 commented 1 year ago

@eroux I ran the tests and they're working fine. I will go through the details tomorrow and update my OCR pipeline.

eroux commented 1 year ago

Thanks! I've changed the tests to make them pass, so it's best to check them thoroughly.

kaldan007 commented 1 year ago

Sure, will do that tomorrow.

kaldan007 commented 1 year ago

@eroux https://github.com/OpenPecha/Toolkit/blob/fix-hocr-parser/openpecha/formatters/ocr/ocr.py#L638 the variable buda_data is not defined anywhere. Are you referring to self.data_provider.buda_data['source_metadata']['languages']?
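
To illustrate the kind of problem described here, a minimal sketch follows. It is not the Toolkit's actual code: the class names `DataProvider` and `Formatter`, the two method names, and the sample value `["bo"]` are hypothetical. Only `buda_data`, `data_provider`, and the `['source_metadata']['languages']` keys are taken from this comment.

```python
class DataProvider:
    """Hypothetical stand-in for the object that holds buda_data."""

    def __init__(self, buda_data):
        self.buda_data = buda_data


class Formatter:
    """Hypothetical stand-in for the OCR formatter."""

    def __init__(self, data_provider):
        self.data_provider = data_provider

    def languages_broken(self):
        # Bare name: `buda_data` is not defined in this scope,
        # so calling this method raises NameError.
        return buda_data["source_metadata"]["languages"]

    def languages_fixed(self):
        # Reach the metadata through the data provider instead.
        return self.data_provider.buda_data["source_metadata"]["languages"]


# Sample metadata is made up for the example.
formatter = Formatter(DataProvider({"source_metadata": {"languages": ["bo"]}}))

try:
    formatter.languages_broken()
    broken_raised = False
except NameError:
    broken_raised = True

languages = formatter.languages_fixed()
```

Because Python resolves names only when the line executes, a bare `buda_data` like this would not fail at import time and could go unnoticed until that code path runs.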

eroux commented 1 year ago

Thanks for spotting that, @kaldan007. I've addressed it.