Intern project: Document process mediator

Yasas2000 commented 6 days ago

Purpose

The main purpose of this project is to develop a new mediator for WSO2 Micro Integrator Studio that uses GPT Vision to process documents more accurately. The mediator extracts data from documents such as bank statements and application forms and converts it into JSON, which is easier to handle downstream. This removes the need for manual data extraction and reduces errors in document processing.

Goals

The mediator will provide solutions to:
-- Accurately process and extract data from various document formats, including images and PDFs.
-- Convert extracted data into JSON format for easy handling and manipulation.
-- Allow users to customize the output by providing schema definitions in JSON or XSD formats.
-- Enhance the flexibility and usability of document processing in WSO2 Micro Integrator Studio.

Approach

The solutions will be implemented as follows:

-- Integrating GPT Vision to enhance the accuracy and intelligence of the document processing capabilities.
-- Developing the mediator to support both image and PDF inputs, enabling it to handle a variety of document types.
-- Implementing customizable schema support, allowing users to define the desired output structure using JSON or XSD.
-- Ensuring the mediator can extract and return data in a structured JSON format.
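
The approach above could be sketched roughly as follows. This is a minimal illustration, not the mediator's actual code: the class `DocumentRequestSketch`, its methods, and the prompt wording are all hypothetical, and the sketch assumes the document is handed to the vision model as a base64 data URI together with a text prompt that embeds the user's schema. How PDFs are actually passed to the model is not specified in this PR.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper; names are illustrative and not part of WSO2 MI.
public class DocumentRequestSketch {

    public enum DocType { PDF, PNG, JPEG, UNKNOWN }

    // Detect the input type from well-known file signatures (magic bytes).
    public static DocType detectType(byte[] data) {
        if (startsWith(data, new int[] {0x25, 0x50, 0x44, 0x46})) return DocType.PDF;  // "%PDF"
        if (startsWith(data, new int[] {0x89, 0x50, 0x4E, 0x47})) return DocType.PNG;  // "\x89PNG"
        if (startsWith(data, new int[] {0xFF, 0xD8, 0xFF})) return DocType.JPEG;
        return DocType.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, int[] magic) {
        if (data == null || data.length < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) return false;
        }
        return true;
    }

    // Encode the document as a base64 data URI (assumed transport format).
    public static String toDataUri(byte[] data, DocType type) {
        String mime;
        switch (type) {
            case PDF:  mime = "application/pdf"; break;
            case PNG:  mime = "image/png"; break;
            case JPEG: mime = "image/jpeg"; break;
            default: throw new IllegalArgumentException("Unsupported document type");
        }
        return "data:" + mime + ";base64," + Base64.getEncoder().encodeToString(data);
    }

    // Build the instruction text; the schema (JSON or XSD) is optional.
    public static String buildPrompt(String schema) {
        String base = "Extract the data in the attached document and reply with JSON only.";
        if (schema == null || schema.trim().isEmpty()) return base;
        return base + " The output must conform to this schema:\n" + schema;
    }

    public static void main(String[] args) {
        byte[] sample = "%PDF-1.7 sample".getBytes(StandardCharsets.US_ASCII);
        DocType type = detectType(sample);
        System.out.println(type);
        System.out.println(toDataUri(sample, type));
        System.out.println(buildPrompt("{\"type\":\"object\"}"));
    }
}
```

Detecting the type from the leading bytes rather than the file extension keeps the sketch independent of how the payload reaches the mediator; a real implementation might rely on the message content type instead.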

User stories

As a user, I want to process bank statements and extract the relevant data into JSON format so that I can easily manage and analyze the information.
As a user, I want the mediator to support both image and PDF inputs so that I can process different types of documents.
As a user, I want to customize the output schema to fit my specific needs, ensuring the extracted data is in the desired format.

Release note

This release introduces a new mediator for WSO2 Micro Integrator Studio, available after the MI 4.3.0 release, that processes documents with improved accuracy using GPT Vision. The mediator supports both image and PDF inputs and allows users to customize the output schema using JSON or XSD.

Automation tests

Unit tests

Test environment

MI Debugger, OpenJDK 11

CLAassistant commented 6 days ago

All committers have signed the CLA.

GDLMadushanka commented 5 days ago

duplicate
