OpenPecha / Requests

RFWs and RFCs for all OpenPecha repositories
RFW0137: Training data and its collection process documentation #383

Open kaldan007 opened 8 months ago

Summary

We need documentation of our OCR training data and of the process used to collect it.

Key Concepts

Training data: the data used to train machine learning models (here, our OCR models)

Context

Our OCR training data is currently scattered across multiple storage systems, such as S3 and GitHub. We have no single document describing where the data is located. We need documentation so that future ML engineers can find our data and use it to train models with different state-of-the-art approaches. We also want to record the different approaches we have used to collect the data, for future reference.

Outputs

The documentation of the training data needs to cover the following:

Inputs

Training data

Timeline

Specify the expected delivery date for the project.

References

Quarto is a good platform for writing the documentation.