Purpose
This project develops a new mediator for WSO2 Micro Integrator Studio that uses GPT Vision to process documents more accurately. The mediator extracts data from documents such as bank statements and application forms and converts it into JSON, which is easier to handle. This removes the need for manual data extraction and reduces the errors caused by inaccurate document processing.
Goals
The mediator aims to:
--Accurately process and extract data from various document formats, including images and PDFs.
--Convert extracted data into JSON format for easy handling and manipulation.
--Allow users to customize the output by providing schema definitions in JSON or XSD formats.
--Enhance the flexibility and usability of document processing in WSO2 Micro Integrator Studio.
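To illustrate the schema-customization goal, here is a minimal Java sketch of how a mediator might tell a JSON schema from an XSD before using it. The class `SchemaKindSketch`, the enum `SchemaKind` and the method `detect` are hypothetical names, not part of any WSO2 API. The check looks only at the first non-whitespace character and does not validate the schema.

```java
/** Hypothetical helper for illustration; not part of WSO2 Micro Integrator. */
final class SchemaKindSketch {

    enum SchemaKind { JSON, XSD, UNKNOWN }

    /** Guesses the schema format from its first non-whitespace character. */
    static SchemaKind detect(String schema) {
        if (schema == null) {
            return SchemaKind.UNKNOWN;
        }
        String trimmed = schema.trim();
        if (trimmed.startsWith("{")) {
            return SchemaKind.JSON;   // a JSON schema is typically a JSON object
        }
        if (trimmed.startsWith("<")) {
            return SchemaKind.XSD;    // XSD is XML: it starts with a tag or an XML declaration
        }
        return SchemaKind.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(detect("{\"type\": \"object\"}"));
        System.out.println(detect("<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\"/>"));
    }
}
```

A real implementation would probably parse the schema as well, so that malformed input is rejected early rather than passed on to the model.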
Approach
The mediator will be implemented by:
--Integrating GPT Vision to enhance the accuracy and intelligence of the document processing capabilities.
--Developing the mediator to support both image and PDF inputs, enabling it to handle a variety of document types.
--Implementing customizable schema support, allowing users to define the desired output structure using JSON or XSD.
--Ensuring the mediator can extract and return data in a structured JSON format.
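The approach above can be sketched as a request builder. This is a sketch under stated assumptions, not the mediator's actual implementation. It assumes an OpenAI-style chat-completions payload in which the image is passed as a base64 data URL. The model name `gpt-4o`, the prompt wording, and the names `VisionRequestSketch`, `jsonEscape` and `buildRequestBody` are illustrative. It only builds the JSON body; sending it, and handling PDF input, are not covered.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical sketch: builds a chat-completions style request body for one image. */
final class VisionRequestSketch {

    /** Escapes a string so it can be embedded inside a JSON string literal. */
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // remaining control characters become 4-digit hex escapes
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Combines the image (as a data URL) and the user's schema into one user message. */
    static String buildRequestBody(String model, byte[] image, String mimeType, String schema) {
        String dataUrl = "data:" + mimeType + ";base64,"
                + Base64.getEncoder().encodeToString(image);
        // Assumed prompt wording; the schema is passed through as plain text.
        String prompt = "Extract the data from this document and return only JSON "
                + "that conforms to this schema:\n" + schema;
        return "{\"model\":\"" + jsonEscape(model) + "\","
                + "\"messages\":[{\"role\":\"user\",\"content\":["
                + "{\"type\":\"text\",\"text\":\"" + jsonEscape(prompt) + "\"},"
                + "{\"type\":\"image_url\",\"image_url\":{\"url\":\"" + jsonEscape(dataUrl) + "\"}}"
                + "]}]}";
    }

    public static void main(String[] args) {
        byte[] placeholder = "not a real image".getBytes(StandardCharsets.UTF_8);
        System.out.println(buildRequestBody("gpt-4o", placeholder, "image/png",
                "{\"type\":\"object\"}"));
    }
}
```

Hand-built JSON keeps the sketch free of dependencies. A real mediator would more likely use a JSON library already available in the runtime.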
User stories
As a user, I want to process bank statements and extract the relevant data into JSON format so that I can easily manage and analyze the information.
As a user, I want the mediator to support both image and PDF inputs so that I can process different types of documents.
As a user, I want to customize the output schema to fit my specific needs so that the extracted data is in the desired format.
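The second story implies that the mediator must tell image input from PDF input. One possible way, sketched below, is to check the file's leading "magic" bytes. The names `InputKindSketch`, `InputKind` and `detect` are hypothetical, and only PDF, PNG and JPEG are covered. How the mediator actually detects the input type is not stated in this proposal.

```java
/** Hypothetical helper for illustration; not part of WSO2 Micro Integrator. */
final class InputKindSketch {

    enum InputKind { PDF, PNG, JPEG, UNKNOWN }

    /** Returns true if data begins with the given unsigned byte values. */
    static boolean startsWith(byte[] data, int... prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Classifies the payload by its file signature. */
    static InputKind detect(byte[] data) {
        if (startsWith(data, 0x25, 0x50, 0x44, 0x46, 0x2D)) {   // "%PDF-"
            return InputKind.PDF;
        }
        if (startsWith(data, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)) {
            return InputKind.PNG;
        }
        if (startsWith(data, 0xFF, 0xD8, 0xFF)) {
            return InputKind.JPEG;
        }
        return InputKind.UNKNOWN;
    }

    public static void main(String[] args) {
        byte[] pdfHeader = {0x25, 0x50, 0x44, 0x46, 0x2D, 0x31, 0x2E, 0x37};
        System.out.println(detect(pdfHeader));
    }
}
```

Checking the content rather than the file extension avoids misrouting a document whose name or declared type is wrong.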
Release note
This release introduces a new mediator for WSO2 Micro Integrator Studio, available after the MI 4.3.0 release, that processes documents with improved accuracy using GPT Vision. The mediator supports both image and PDF inputs and lets users customize the output schema using JSON or XSD.
Automation tests
Test environment