Open zdeveloper opened 6 days ago
Potential datasets:
- medical reports: https://huggingface.co/datasets/AnubhutiBhardwaj/medical-reports-demo
- medical prescriptions: https://huggingface.co/datasets/Technoculture/medical-prescriptions
- fake tax forms: https://huggingface.co/datasets/singhsays/fake-w2-us-tax-form-dataset
Per discussion, the goal of this issue is to build a dataset that tests the OCR, not alignment.
Create a standardized form-testing dataset based on a readily available dataset. It will be used to build a standard benchmark for ReportVision.
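To make the "test the OCR" goal concrete, here is a rough sketch of what one benchmark record and a per-field score could look like. Everything in it is an assumption for discussion, not an agreed design: the `FormSample` record, the field names, the file path, and the choice of character error rate (edit distance divided by the length of the expected text) as the metric are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FormSample:
    """One benchmark item (hypothetical layout): a form image plus expected field text."""
    image_path: str           # path to the form image; the name is illustrative only
    expected: dict[str, str]  # field name -> ground-truth text


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete a character from a
                curr[j - 1] + 1,           # insert a character into a
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(expected: str, actual: str) -> float:
    """Edit distance normalized by the expected length (0.0 means a perfect match)."""
    if not expected:
        return 0.0 if not actual else 1.0
    return edit_distance(expected, actual) / len(expected)


def score_sample(sample: FormSample, ocr_output: dict[str, str]) -> dict[str, float]:
    """Per-field error rate; a field missing from the OCR output counts as empty text."""
    return {
        field: character_error_rate(text, ocr_output.get(field, ""))
        for field, text in sample.expected.items()
    }


if __name__ == "__main__":
    # Made-up values for illustration; not taken from any of the datasets above.
    sample = FormSample(
        image_path="forms/w2_0001.png",
        expected={"employer_name": "Acme Corp", "wages": "52000.00"},
    )
    print(score_sample(sample, {"employer_name": "Acme C0rp", "wages": "52000.00"}))
```

Scoring each field separately keeps the benchmark focused on text recognition, since the expected text is compared directly and no image registration step is involved.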
Acceptance Criteria
reportvision-dataset-1
Additional context
Example datasets can come from NIST or Hugging Face. Please be very communicative; the AC can be adjusted.