Description: Enterprise-grade document analytics platform that combines automated PDF parsing, vector embeddings, and LLM integration with manual annotation to support document management and insight extraction at scale. It is Apache-2.0 licensed and provides several key features:
Layout Parser - Automatically extracts layout features from PDFs using the open-source nlm-ingest.
Automatic Vector Embeddings - Generated for uploaded PDFs and extracted layout blocks, and stored via the pgvector Django integration.
Pluggable Microservice Analyzer Architecture - Lets you analyze documents and annotate them automatically.
Human Annotation Interface - A React GUI lets you manually annotate documents, including multi-page annotations.
LlamaIndex Integration - Use our vector stores (powered by pgvector) and any manually or automatically annotated features to let an LLM intelligently answer questions.
Data Extract - Ask multiple questions across hundreds of documents using complex LLM-powered querying behavior. Our sample implementation uses LlamaIndex + Marvin.
Custom Data Extract - Custom extract pipelines can be run from the frontend to query documents in bulk.
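To illustrate what the vector embedding and retrieval features amount to, here is a minimal, dependency-free sketch of similarity search over embedded layout blocks. It is not OpenContracts code: the function names, the sample block labels, and the three-dimensional vectors are all hypothetical, and the real project delegates storage and search to pgvector rather than doing it in Python.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_blocks(query_embedding, blocks, k=2):
    """Return the text of the k blocks whose embeddings are closest to the query."""
    scored = [(cosine_similarity(query_embedding, emb), text) for text, emb in blocks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical layout blocks with toy embeddings (real embeddings have hundreds of dimensions).
blocks = [
    ("Termination clause", [0.9, 0.1, 0.0]),
    ("Payment terms", [0.1, 0.9, 0.0]),
    ("Governing law", [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.1]

print(top_k_blocks(query, blocks, k=2))
```

In a retrieval-augmented setup, the blocks returned this way are what would be handed to the LLM as context for answering a question.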
Additional Information
Reasonably good backend test coverage, with ongoing efforts to raise the coverage score with each PR :-)
Good, sustained community growth
GitHub Actions CI/CD with automated coverage checks, tests, and code styling
name: Add a new project
about: Suggest a new Django project for the Awesome Django list
title: "[NEW] OpenContracts"
Project Information
Project Name: OpenContracts
Project URL: https://github.com/JSv4/OpenContracts
Description: Enterprise-grade document analytics platform that combines automated PDF parsing, vector embeddings, and LLM integration with manual annotation to support document management and insight extraction at scale. Licensed under Apache-2.0.
Criteria
Is the project new?
How long has the project been maintained? 3 years
How many releases has it had if it's a library or package? https://github.com/JSv4/OpenContracts/releases
Are you the author or are you submitting the project on behalf of a company?
What makes it awesome?
Key features:
Additional Information