jarmoza closed this 12 months ago
A sample of what a config YAML file will contain:
BOOKS_DIRECTORY: "/ocean/projects/hum160002p/shared/books/test_autocrop/"
OUTPUT_DIRECTORY: "/ocean/projects/hum160002p/shared/books/test_autocrop/output/"
QA_TYPE: "autocrop"
COMMANDS: ["clear_output", "run_qa", "collate_results"]
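As a rough illustration of how a master script might consume a config like the one above, here is a minimal Python sketch. It is not taken from the qa-workflow repository: the function names (`load_config`, `validate_config`, `run_commands`), the `HANDLERS` registry, and the placeholder handler bodies are all assumptions made for illustration. Only the four config keys and the three command names come from the sample. Loading assumes PyYAML is installed.

```python
from typing import Callable, Dict, List

# Keys taken from the sample config above.
REQUIRED_KEYS = ("BOOKS_DIRECTORY", "OUTPUT_DIRECTORY", "QA_TYPE", "COMMANDS")


def load_config(path: str) -> dict:
    """Read a YAML config file into a dict (assumes PyYAML is available)."""
    import yaml  # third-party: pip install pyyaml

    with open(path, encoding="utf-8") as fh:
        return yaml.safe_load(fh)


def validate_config(config: dict) -> None:
    """Fail early if any expected key is absent."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config is missing keys: {missing}")


# Hypothetical placeholder handlers: each only reports what a real step would do.
def clear_output(config: dict) -> str:
    return f"would clear {config['OUTPUT_DIRECTORY']}"


def run_qa(config: dict) -> str:
    return f"would run {config['QA_TYPE']} QA on {config['BOOKS_DIRECTORY']}"


def collate_results(config: dict) -> str:
    return f"would collate results in {config['OUTPUT_DIRECTORY']}"


# Hypothetical registry mapping command names from COMMANDS to handlers.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "clear_output": clear_output,
    "run_qa": run_qa,
    "collate_results": collate_results,
}


def run_commands(config: dict,
                 handlers: Dict[str, Callable[[dict], str]] = HANDLERS) -> List[str]:
    """Run each command listed in COMMANDS, in order, and return their reports."""
    validate_config(config)
    unknown = [name for name in config["COMMANDS"] if name not in handlers]
    if unknown:
        raise ValueError(f"unknown commands: {unknown}")
    return [handlers[name](config) for name in config["COMMANDS"]]


if __name__ == "__main__":
    sample = {
        "BOOKS_DIRECTORY": "/ocean/projects/hum160002p/shared/books/test_autocrop/",
        "OUTPUT_DIRECTORY": "/ocean/projects/hum160002p/shared/books/test_autocrop/output/",
        "QA_TYPE": "autocrop",
        "COMMANDS": ["clear_output", "run_qa", "collate_results"],
    }
    for report in run_commands(sample):
        print(report)
```

Under this design, a new stage would be added by registering one more handler, and the order of `COMMANDS` in the config controls the order of execution.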
Stable and mostly complete (for this iteration) as of https://github.com/printprobability/qa-workflow/commit/3d0e66813934584d504c3f0f36fa444e05dda9a3
The idea here is to build a master script for QAing our cropping methods that can also be extended to other parts of the book ingestion pipeline, such as line extraction.