zeljebbari / aishore

Repo for Aishore

[Document parsing] Parse document into JSON file using open source models, Azure Document Intelligence model #9

Open zeljebbari opened 3 months ago

zeljebbari commented 3 months ago

Document parsing: we will use Azure Functions to first convert the document into a readable PDF format. Based on the document type, we will establish workflows for document parsing. The output will be a JSON file, from which we will extract raw data into a SQL database.

Compile results and findings by document type in a doc. Also assign a confidence score (H/M/L) to each key-value pair, to help focus the data steward's efforts.

Sample docs are here: https://drive.google.com/drive/folders/1ecbyEqSgZBNvUfMKWwBaLtgGQ7eXH_V6

Doc findings here: https://docs.google.com/document/d/1DI_7p1nszRWzXiymGZS6ej2AWDuUDADYeFCUOS6lHa4/edit
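A rough sketch of the last two steps (JSON output into a SQL table, plus an H/M/L band per key-value pair) might look like the following. Everything specific here is an assumption for illustration, not something decided in this issue: the JSON shape (`pairs` with `key`, `value`, `confidence`), the `raw_fields` table, the 0.90 / 0.70 thresholds, and the use of SQLite as a stand-in for the target SQL database. The real parser output will likely need a mapping step before it fits this shape.

```python
import json
import sqlite3

# Hypothetical thresholds; these would need tuning per document type.
HIGH_THRESHOLD = 0.90
MEDIUM_THRESHOLD = 0.70


def confidence_band(score: float) -> str:
    """Map a numeric confidence in [0, 1] to 'H', 'M', or 'L'."""
    if score >= HIGH_THRESHOLD:
        return "H"
    if score >= MEDIUM_THRESHOLD:
        return "M"
    return "L"


def load_pairs(conn: sqlite3.Connection, doc_type: str, parsed_json: str) -> int:
    """Insert every key-value pair of one parsed document into raw_fields.

    Assumes parsed_json looks like:
    {"pairs": [{"key": ..., "value": ..., "confidence": ...}, ...]}
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_fields ("
        "doc_type TEXT, field_key TEXT, field_value TEXT, "
        "confidence REAL, band TEXT)"
    )
    rows = [
        (doc_type, p["key"], p["value"], p["confidence"],
         confidence_band(p["confidence"]))
        for p in json.loads(parsed_json)["pairs"]
    ]
    conn.executemany("INSERT INTO raw_fields VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def review_queue(conn: sqlite3.Connection) -> list:
    """Return the M and L fields, lowest confidence first, for the data steward."""
    cur = conn.execute(
        "SELECT field_key, field_value, band FROM raw_fields "
        "WHERE band IN ('L', 'M') ORDER BY confidence ASC"
    )
    return cur.fetchall()
```

Storing both the raw confidence and the band keeps the thresholds adjustable later without re-parsing the documents; the data steward would then only need to work through `review_queue`.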