RFW0138: Training a T5-like model for post-correction of OCR output
Summary
We need to train a T5-like transformer-based model to post-correct OCR output.
Key Concepts
OCR: Optical Character Recognition
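T5: Text-to-Text Transfer Transformer, a transformer-based encoder-decoder model that treats every task as mapping an input text to an output text.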
Context
After discussing with Sebastain, we learned that a T5 model can be trained for different types of sequence-to-sequence tasks. One option is a model whose input is the incorrect OCR output and whose label is the corrected text. Such a model would be useful for post-correcting our OCR output. It would make inference review faster, because reviewers would receive almost perfect data. In the future, we can also use this model to correct the output of other OCR models.
Outputs
T5 model for post-correcting OCR output
Inputs
OCR output data and manually corrected output
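As a rough illustration of how these inputs could be turned into training pairs, the sketch below aligns raw OCR lines with their manually corrected counterparts in a text-to-text format. The task prefix "correct ocr: ", the CorrectionExample class, and the build_examples helper are all hypothetical names chosen for this example, and the sample lines are made up; it also assumes the OCR and corrected data are already aligned line by line.

```python
from dataclasses import dataclass

# Hypothetical task prefix: T5-style models are usually told the task
# through a short text prefix on the input. The wording is an assumption.
TASK_PREFIX = "correct ocr: "


@dataclass(frozen=True)
class CorrectionExample:
    source: str  # model input: task prefix + raw OCR text
    target: str  # label: manually corrected text


def build_examples(ocr_lines, corrected_lines):
    """Pair each OCR line with its corrected line, skipping empty pairs."""
    if len(ocr_lines) != len(corrected_lines):
        raise ValueError("OCR and corrected line counts differ")
    examples = []
    for ocr, gold in zip(ocr_lines, corrected_lines):
        ocr, gold = ocr.strip(), gold.strip()
        if not ocr or not gold:
            continue  # nothing to learn from an empty input or label
        examples.append(CorrectionExample(TASK_PREFIX + ocr, gold))
    return examples


# Made-up sample data, for illustration only.
examples = build_examples(
    ["Th1s is a samp1e 1ine.", "   ", "0CR output"],
    ["This is a sample line.", "", "OCR output"],
)
for example in examples:
    print(example.source, "->", example.target)
```

Pairs in this shape can then be tokenized and passed to whichever sequence-to-sequence training setup is chosen; the pairing step itself does not depend on a specific library.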
Timeline
Specify the expected delivery date for the project.
References
Include any relevant links or resources for additional context or information.