In addition to #162 and #164, there are two points in https://github.com/OpenPecha/OCR-Helper-Scripts/issues/6 that are probably best tracked in their own issue. They consist of adding some information about the OCR to the opf, so that we know which version of the OCR of a set of scans was used to create an opf (I don't think we currently can).
When the OCR is imported from, say, s3://ocr.bdrc.io/Works/83/W2PD17457/google_books/batch_2022/, there's an info.json that contains at least
{
"timestamp": "1977-04-22T06:00:00Z"
}
but possibly other things, so let's say it has other properties like:
Then we should add the following to the meta.yml (I've put the parser there, I think it fits well)
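
The exact fields to write are not reproduced here, so what follows is only a rough sketch of the idea: read the info.json of the imported batch and build a record that could be stored in the meta.yml. The function name `build_ocr_import_info` and the keys `source`, `batch_timestamp` and `extra` are hypothetical, chosen for illustration, and are not the actual OpenPecha layout.

```python
import json
from datetime import datetime, timezone


def build_ocr_import_info(info_json_text, source_path):
    """Build a record describing an imported OCR batch.

    Hypothetical layout: the key names below are assumptions,
    not the real meta.yml schema.
    """
    info = json.loads(info_json_text)

    # "timestamp" is the only property the issue says is always present.
    raw_timestamp = info["timestamp"]

    # Parse the timestamp so a malformed value fails early, at import time.
    parsed = datetime.strptime(raw_timestamp, "%Y-%m-%dT%H:%M:%SZ")
    parsed = parsed.replace(tzinfo=timezone.utc)

    record = {
        "source": source_path,                 # where the OCR was imported from
        "batch_timestamp": parsed.isoformat(),  # identifies the OCR batch version
    }

    # Carry over any other properties of info.json unchanged.
    extra = {key: value for key, value in info.items() if key != "timestamp"}
    if extra:
        record["extra"] = extra
    return record


if __name__ == "__main__":
    sample = '{"timestamp": "1977-04-22T06:00:00Z"}'
    path = "s3://ocr.bdrc.io/Works/83/W2PD17457/google_books/batch_2022/"
    print(build_ocr_import_info(sample, path))
```

Keeping the import path next to the batch timestamp is one way to tell later which OCR run an opf was created from; how the record is serialized into the meta.yml is left open here.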