Closed markusweigelt closed 2 years ago
In the end, we provide a good set of building blocks here so you don't have to deal with ocrd function calls. Is that what we want?
but I think we should keep the names independent of the applications. I think it is better to split the post-processing and name each function after what it does, e.g. post_processing_validate .... This set of utility functions gives us more flexibility across applications (Presentation, Production, ...) and institutions (OCR SLUB, ...).
Splitting up utility functions further – sure.
The problem, though, is that the concrete scenarios each need to be identifiable by name, and the clearest way to do that is to name them after their respective applications, hence pre_process_production2, post_process_production2 or post_process_dfgmets etc. We could also go a little more neutral by stating exactly what the assumptions on input/output are, e.g. pre_process_from_imgdir, post_process_to_ocrdir, pre_process_from_dfgmets, post_process_to_dfgmets etc.
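To make the two positions concrete, here is a minimal sketch of how they could coexist: small utility steps stay application-neutral, and the scenario entry points are named by their input/output assumptions. Only the four scenario names come from this thread; the step functions, `compose` and `SCENARIOS` are hypothetical, and the steps merely record their own name instead of calling ocrd.

```python
from typing import Callable, Dict, List

Log = List[str]
Step = Callable[[Log], None]

# Hypothetical utility steps. Each one only records that it ran, so the
# sketch is runnable without OCR-D; real steps would do the actual work.
def import_imgdir(log: Log) -> None:
    log.append("import_imgdir")

def import_dfgmets(log: Log) -> None:
    log.append("import_dfgmets")

def validate(log: Log) -> None:
    log.append("validate")

def export_ocrdir(log: Log) -> None:
    log.append("export_ocrdir")

def export_dfgmets(log: Log) -> None:
    log.append("export_dfgmets")

def compose(*steps: Step) -> Callable[[], Log]:
    """Build a scenario that runs the given steps in order."""
    def run() -> Log:
        log: Log = []
        for step in steps:
            step(log)
        return log
    return run

# Scenario names describe input/output assumptions, not applications.
SCENARIOS: Dict[str, Callable[[], Log]] = {
    "pre_process_from_imgdir": compose(import_imgdir),
    "pre_process_from_dfgmets": compose(import_dfgmets),
    "post_process_to_ocrdir": compose(validate, export_ocrdir),
    "post_process_to_dfgmets": compose(validate, export_dfgmets),
}

if __name__ == "__main__":
    print(SCENARIOS["post_process_to_dfgmets"]())
```

With this layout, an application such as Production would only pick scenario names, while institutions that need something different could compose their own sequence from the same steps.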
Does it fit a little better now?
Atm the naming conventions are
pre_process... in my opinion does not exist atm
@bertsky
Should I test, or have you tested already?
I ran a test on the state before I changed the code in response to the review comments. I will test these changes, and if it runs I will merge the PR.
OK, the first and second points sound reasonable to me.
Agreed on generalizing post_process, but I think we should keep the names independent of the applications. I think it is better to split the post-processing and name each function after what it does, e.g. post_processing_validate .... This set of utility functions gives us more flexibility across applications (Presentation, Production, ...) and institutions (OCR SLUB, ...).