Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
GNU General Public License v3.0
Added an async client, table extraction from markdown to list of list with headers #2
Closed
AswanthManoj closed 3 weeks ago
PR: Add AsyncOmniParse Client, Table Extraction, and Pydantic Models
This PR introduces several enhancements to our OmniParse project:
AsyncOmniParse Client:
- `AsyncOmniParse`: an asynchronous client for interacting with the OmniParse server.

Table Extraction from Markdown:
- `extract_markdown_tables`: Extracts raw markdown tables from a string.
- `markdown_to_tables`: Converts markdown tables to a list of `TableObj` instances.

Pydantic Models:
- `ImageObj`: Represents extracted images with name, binary data, and MIME type.
- `TableObj`: Represents extracted tables with headers and data.
- `MetaData`: Stores metadata about parsed documents.
- `ParsedDocument`: Represents the complete parsed document with markdown content, images, tables, and metadata.

These additions provide a robust, type-safe, and asynchronous interface for interacting with OmniParse.
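To make the table-extraction step concrete, here is a minimal sketch of how the two helpers could fit together. It is not the code from this PR: the function names and the `TableObj` fields (headers and data) come from the description above, but the signatures, the regex, and the parsing rules are assumptions. `TableObj` is written as a plain dataclass stand-in for the Pydantic model so the example runs with the standard library only, and `_split_row` is a hypothetical helper.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class TableObj:
    """Stand-in for the PR's Pydantic TableObj (assumed fields: headers, data)."""
    headers: List[str]
    data: List[List[str]] = field(default_factory=list)


# Matches a separator row such as |---|:---:| (pipes, dashes, colons, spaces).
_SEPARATOR = re.compile(r"^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?\s*$")


def _split_row(line: str) -> List[str]:
    """Hypothetical helper: split one table row into trimmed cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def extract_markdown_tables(text: str) -> List[str]:
    """Return each pipe table found in text as a raw markdown string."""
    tables: List[str] = []
    current: List[str] = []
    # The trailing empty string flushes a table that ends at the last line.
    for line in text.splitlines() + [""]:
        if "|" in line:
            current.append(line.strip())
            continue
        # A block counts as a table only if its second line is a separator row.
        if len(current) >= 2 and _SEPARATOR.match(current[1]):
            tables.append("\n".join(current))
        current = []
    return tables


def markdown_to_tables(text: str) -> List[TableObj]:
    """Convert every markdown table in text to a TableObj (headers + rows)."""
    result: List[TableObj] = []
    for raw in extract_markdown_tables(text):
        lines = raw.splitlines()
        headers = _split_row(lines[0])
        # lines[1] is the separator row, so data starts at index 2.
        rows = [_split_row(line) for line in lines[2:]]
        result.append(TableObj(headers=headers, data=rows))
    return result


sample = """Intro text.

| Name | Qty |
|------|-----|
| Apple | 3 |
| Pear | 5 |

Closing text."""

tables = markdown_to_tables(sample)
print(tables[0].headers)  # ['Name', 'Qty']
print(tables[0].data)     # [['Apple', '3'], ['Pear', '5']]
```

This sketch treats any run of consecutive lines containing `|` as a candidate table and keeps it only when the second line is a separator row. It does not handle escaped pipes or pipes inside inline code, which a production implementation would likely need to address.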