Closed saitej123 closed 1 day ago
I'd love to support a Python API and publish a package on pip. Right now neither of the maintainers is a strong Python dev, but if you know anyone who would want to make a contribution, let us know!
The roadmap so far is:
@tylermaran I've been thinking of building a version of this for myself for a while now and I was so excited to see your project on HN so that I didn't have to build it myself haha
Let me look into this. Maybe I can help out with a python package
Hey @batmanscode 🦇
I would love to have some help here. It looks like there is a similar pip package, pdf2image (https://github.com/Belval/pdf2image), that uses poppler under the hood. I wonder if there's a variant that uses ImageMagick like the current node version does. But either way it should be pretty easy to set up. Within the npm setup we have an install-dependencies script to make sure all the prereqs are set up.
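For the Python side, a prerequisite check along the lines of the npm install-dependencies script could look something like the sketch below. It is only an illustration: the function name and the binary list are assumptions, not anything in the repo. It relies on the fact that poppler ships the `pdftoppm` command-line tool, which pdf2image calls.

```python
import shutil

# Assumed prerequisite list for illustration: pdftoppm is installed with poppler.
REQUIRED_BINARIES = ["pdftoppm"]


def find_missing_binaries(binaries):
    """Return the names in `binaries` that cannot be found on PATH."""
    return [name for name in binaries if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_binaries(REQUIRED_BINARIES)
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found.")
```

This only reports what is missing; actually installing poppler would still be left to the system package manager.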
I'd like to keep this as a monorepo if possible. Probably something like:
zerox/
├── .gitignore
├── README.md
├── LICENSE
├── package.json       # npm config
├── setup.py           # pip config
├── node-zerox/        # typescript source
│   ├── src/
│   ├── dist/
│   ├── tests/
│   └── etc/
└── py-zerox/          # python source
    ├── src/
    ├── tests/
    └── etc/
Hey @tylermaran and @batmanscode, this looks interesting and I'd love to collaborate. I have experience with both TypeScript and Python package development.
I have reviewed the zerox source and can assist in porting it to Python. My goal would be to ensure that the API and build process remain consistent across both the TypeScript and Python implementations.
Looking forward to working together!
Hey @tylermaran, Quick update. Prepared a PR #4 which presents the monorepo structure for Zerox. This includes Poetry for dependency management, a Makefile for build automation, and some code quality checks. Current implementations are placeholders. The actual implementation details will be added once the proposed structure gets reviewed and approved :D
Can gpt-4o-mini also provide bounding box details? I'd like to highlight key information in the document.
@saitej123 I've been looking into this as well. It doesn't seem to be immediately available using gpt-4o-mini.
I know it's possible to use a library like YOLOv8 to grab bounding boxes. But that gets a little harder when you have to host an additional model.
I think the general flow would be:
This is a bit separate from the python request, so I added a tracking issue #7
If we use Azure OCR or GCP we could map the bounding boxes, though I'm not sure: the mapping may fail if the OCR splits the text in a different way.
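To make the mapping concern concrete, here is a rough sketch of matching a phrase from the model output against OCR words and merging their boxes. The input shape (a list of `(text, (x0, y0, x1, y1))` tuples) and the function names are assumptions for illustration only; they are not the Azure or GCP response format.

```python
def normalize(token):
    """Lowercase a token and drop everything except letters and digits."""
    return "".join(ch for ch in token.lower() if ch.isalnum())


def union_box(boxes):
    """Smallest box (x0, y0, x1, y1) that covers all the given boxes."""
    xs0, ys0, xs1, ys1 = zip(*boxes)
    return (min(xs0), min(ys0), max(xs1), max(ys1))


def locate_phrase(phrase, ocr_words):
    """Find `phrase` as a run of consecutive OCR words.

    `ocr_words` is a hypothetical list of (text, (x0, y0, x1, y1)) tuples.
    Returns the merged box, or None when no run of words matches.
    """
    target = [t for t in (normalize(w) for w in phrase.split()) if t]
    tokens = [normalize(text) for text, _ in ocr_words]
    n = len(target)
    if n == 0:
        return None
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] == target:
            return union_box([box for _, box in ocr_words[start:start + n]])
    return None


words = [
    ("Invoice", (10, 10, 60, 20)),
    ("Total:", (70, 10, 110, 20)),
    ("$42.00", (120, 10, 160, 20)),
]
print(locate_phrase("Total: $42.00", words))  # (70, 10, 160, 20)

# Same text, but the OCR split "$42.00" into two words: the match is lost.
split_words = words[:2] + [("$42", (120, 10, 140, 20)), (".00", (141, 10, 160, 20))]
print(locate_phrase("Total: $42.00", split_words))  # None
```

The second call shows the failure mode: once the OCR tokenizes differently from the text being looked up, exact word-by-word matching stops working, so a real implementation would probably need fuzzier alignment.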
@wizenheimer merged your repo updates for the python package in #4
Great work. Now we just need to add the core logic.
Hey @tylermaran, Added the PR #10 introducing Python SDK for Zerox. Ensured the external API and types remain consistent across the SDKs.
Could you add a usage section for python in the README?
Could you add a usage section for python in the README? @tylermaran
@guici123, @RazvanMihaiPopa have a look at this PR https://github.com/getomni-ai/zerox/pull/21, should be useful.
Please provide Python API