getomni-ai / zerox

PDF to Markdown with vision models
https://getomni.ai/ocr-demo
MIT License
5.85k stars, 309 forks

FEAT: Introducing Python SDK #10

Closed by wizenheimer 3 months ago

wizenheimer commented 3 months ago

Description

This PR introduces the Python SDK for Zerox, aligning with the existing TypeScript SDK.

Changes Made

  1. Directory and API Structure: Reorganized to maintain consistency with the TypeScript SDK. Ensured the external API and types remain unchanged.
  2. Async Execution and Libraries: Implemented async execution with aiofiles and aiohttp, and replaced pdf2pic with pdf2image for PDF processing.
  3. Code Improvements: Decoupled configuration from code. Adopted a class-based structure. Added custom exceptions for user-facing error messages, among other changes.
  4. Model Design: Used multiton and registry patterns for the base model, deriving the OpenAI model from it. The aim is to make the shared model logic reusable.
  5. Build System: Updated the Makefile for the new directory structure. Integrated black and ruff for automated code formatting and linting.
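
To illustrate the model design in item 4, here is a minimal sketch of how a registry and a multiton can be combined on a base class. It is not the code from this PR: the names `BaseVisionModel`, `OpenAIModel`, `register`, `get`, and the placeholder model names `model-a` and `model-b` are all hypothetical. The registry maps a provider name to a class, and the multiton cache returns one shared instance per (class, model name) pair.

```python
from typing import Callable, ClassVar, Dict, Tuple, Type


class BaseVisionModel:
    """Hypothetical base class, shown only to illustrate the two patterns."""

    # Registry: provider name -> model class.
    _registry: ClassVar[Dict[str, Type["BaseVisionModel"]]] = {}
    # Multiton cache: (model class, model name) -> shared instance.
    _instances: ClassVar[Dict[Tuple[type, str], "BaseVisionModel"]] = {}

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name

    @classmethod
    def register(cls, provider: str) -> Callable[[type], type]:
        """Class decorator that records a subclass under a provider name."""

        def decorator(subclass: type) -> type:
            cls._registry[provider] = subclass
            return subclass

        return decorator

    @classmethod
    def get(cls, provider: str, model_name: str) -> "BaseVisionModel":
        """Return the one shared instance for this provider and model name."""
        try:
            model_cls = cls._registry[provider]
        except KeyError:
            raise ValueError(f"unknown provider: {provider!r}") from None
        key = (model_cls, model_name)
        if key not in cls._instances:
            cls._instances[key] = model_cls(model_name)
        return cls._instances[key]


@BaseVisionModel.register("openai")
class OpenAIModel(BaseVisionModel):
    def describe(self) -> str:
        return f"openai:{self.model_name}"


# Placeholder model names; any string key behaves the same way.
a = BaseVisionModel.get("openai", "model-a")
b = BaseVisionModel.get("openai", "model-a")
c = BaseVisionModel.get("openai", "model-b")
print(a is b, a is c)  # True False
```

With this layout, supporting another provider would only need a new subclass and a `register` call, and repeated lookups for the same model reuse one instance instead of constructing a new one.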

Tests

  1. Verified parity with TypeScript SDK using examples from the examples directory.
  2. Ensured the Makefile correctly sets up the development environment and builds the package.

Documentation

  1. No new documentation has been added at this time.

Additional Notes

  1. The changes are intended to validate the proposed implementation.
  2. Usage instructions

    import asyncio

    from py_zerox import zerox

    openai_api_key = "sk-XXXXXXX"
    output_dir = "examples/output"

    async def main():
        # Example 1: convert a local PDF.
        file_path = "examples/cs101.pdf"
        result = await zerox.zerox(file_path=file_path, openai_api_key=openai_api_key, output_dir=output_dir)

        # Example 2: convert a PDF fetched from a URL.
        file_path = "https://omni-demo-data.s3.amazonaws.com/test/cs101.pdf"
        result = await zerox.zerox(file_path=file_path, openai_api_key=openai_api_key, output_dir=output_dir)

    # zerox.zerox is a coroutine, so it has to run inside an event loop.
    asyncio.run(main())

tylermaran commented 3 months ago

Amazing @wizenheimer. Taking a look at this today.

tylermaran commented 3 months ago

Tested locally and it looks good!

I'd like to publish this as a pip package. Seems like all the package config is handled in the setup.cfg. Is there anything else you think we need before publishing?

Also looks like the zerox pip package name is taken. Empty project though so they might be willing to let us have it. Otherwise we can do py_zerox or something.

guici123 commented 2 months ago

Can you write a Python installation tutorial?