yousefmrashad / bookipedia

AI Inference Server for our graduation project (Bookipedia App)
GNU General Public License v3.0

BOOKIPEDIA-AI

RAG, OCR, TTS and More

Part of the Bookipedia App, a graduation project at CSED MU. [![Frontend](https://img.shields.io/badge/frontend-02569B?logo=flutter&logoColor=white)](https://github.com/nadahossamismail/Bookipedia) [![Backend](https://img.shields.io/badge/backend-339933?logo=nodedotjs&logoColor=white)](https://github.com/mhmadalaa/bookipedia) [![License: GPL-3.0](https://img.shields.io/badge/license-GPLv3.0-orange.svg)](https://www.gnu.org/licenses/gpl-3.0) [![CodeFactor](https://www.codefactor.io/repository/github/yousefmrashad/bookipedia/badge/main)](https://www.codefactor.io/repository/github/yousefmrashad/bookipedia/overview/main)

Overview

Bookipedia is an online library with an AI-powered reading assistant. Users can ask questions about their books and documents, request summaries, and even run web research, all through a natural-language chat. They can also upload scanned documents to convert them into readable PDFs, and use text-to-speech to sit back and listen to their favourite books.


Features

Core Technologies

Functional Features


Modules

preprocessing

| File | Summary |
| --- | --- |
| [embedding.py](preprocessing/embedding.py) | Interfaces for AnglE and Hugging Face sentence transformers. The interfaces support embedding documents and queries, enabling efficient text representation for downstream tasks. |
| [ocr.py](preprocessing/ocr.py) | Performs optical character recognition on a PDF document, extracting its text and converting it into an editable format. It employs image filtering, skew correction, and OCR techniques to enhance accuracy. The resulting text is exported as an XML file and converted into a PDF with hOCR annotations for easy text retrieval. |
| [document.py](preprocessing/document.py) | Orchestrates the transformation of raw documents into structured data. It runs OCR on scanned documents and converts text-based ones to markdown, then splits them into chunks, generates embeddings, and stores them in the vector database for efficient retrieval. |

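The chunk-splitting step that document.py is described as performing can be sketched as follows. This is a minimal illustration only: the function name `split_into_chunks`, its word-based chunking, and the default sizes are assumptions made for the example and are not taken from the repository, which may well split by tokens or by markdown structure instead.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word-based chunks; each chunk repeats the last
    `overlap` words of the previous one so context is not cut mid-thought.

    Hypothetical helper for illustration, not the project's implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each resulting chunk would then be passed to an embedding model and stored in the vector database together with its source page.
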
rag

| File | Summary |
| --- | --- |
| [web_researcher.py](rag/web_researcher.py) | WebResearchRetriever facilitates web research by using DuckDuckGo's Search API to retrieve relevant webpages. It employs an LLM to generate search queries, searches for URLs, and indexes new webpages into a vector store. The retriever then searches for relevant document splits within the vector store, ensuring unique and pertinent results. |
| [rag_pipeline.py](rag/rag_pipeline.py) | RAGPipeline orchestrates the retrieval and summarization of information from a Weaviate vector database and the web. It generates retrieval queries, combines context from multiple sources, and produces answers to user questions using a large language model. The pipeline also updates the chat summary based on user interactions, ensuring natural conversation flow, and implements page summarization functionality. |
| [weaviate_retriever.py](rag/weaviate_retriever.py) | Weaviate Retriever performs hybrid searches, combining semantic and keyword search over a vector store to retrieve the documents relevant to a given query. It offers advanced features like auto-merging and re-ranking to enhance search accuracy. |
| [web_weaviate.py](rag/web_weaviate.py) | Integrates with the Weaviate vector store, enabling text embedding and similarity search for web retrieval tasks. |

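To make the idea of hybrid search concrete, the sketch below fuses keyword scores and vector-similarity scores into one ranking. It is a generic illustration, not the code in weaviate_retriever.py and not Weaviate's API: the function names, the min-max normalization, and the `alpha` weighting are assumptions chosen for the example, and the fusion that Weaviate itself applies may differ.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to the range [0, 1] so the two score types are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as a full match.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(
    keyword_scores: dict[str, float],
    vector_scores: dict[str, float],
    alpha: float = 0.5,
) -> list[tuple[str, float]]:
    """Rank documents by a weighted sum of normalized scores.

    alpha = 1.0 uses only the vector (semantic) scores,
    alpha = 0.0 uses only the keyword scores.
    Hypothetical helper for illustration only.
    """
    kw = min_max(keyword_scores)
    vec = min_max(vector_scores)
    fused = {
        doc: alpha * vec.get(doc, 0.0) + (1 - alpha) * kw.get(doc, 0.0)
        for doc in set(kw) | set(vec)
    }
    # Highest score first; ties are broken by document id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))
```

A document that only one of the two searches returns still takes part in the ranking, with a score of zero for the search that missed it.
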
utils

| File | Summary |
| --- | --- |
| [font.py](utils/font.py) | Font manipulation lets the hOCR module encode text, estimate its width, and register fonts within a PDF document. It provides a glyphless font for placeholder text and a Courier font for standard text rendering. |
| [hocr.py](utils/hocr.py) | Transforms documents from the hOCR format into PDF files, preserving the original text's position and orientation. It also provides debugging options to visualize the bounding boxes and baselines of text elements, which helps verify the transformation's accuracy. |
| [init.py](utils/init.py) | Centralizes imports for utility modules, facilitating code organization and reusability within the Bookipedia repository. |
| [config.py](utils/config.py) | Configures essential settings and constants for the Bookipedia repository. It establishes root paths, imports necessary modules, defines constants, and sets up models and URLs for various functionalities, including OCR, TTS, document loading, embeddings, the LLM, and backend API calls. |
| [db_config.py](utils/db_config.py) | Configures and manages the Weaviate database connection, ensuring that the database exists and has the proper schema. |
| [functions.py](utils/functions.py) | Provides utility functions for OCR, document loading, retrieval filtering, and text processing to support document processing and retrieval. Key features include image value scaling, token counting, filtering by IDs and page numbers, merging text chunks with overlap handling, and calculating the percentage of document area covered by images. |

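One way to read "merging text chunks with overlap handling" is sketched below: adjacent chunks are joined, and any words that end one chunk and begin the next are kept only once. The function name `merge_chunks` and the word-level comparison are assumptions for this example; the helper in functions.py may work on tokens or characters and may differ in detail.

```python
def merge_chunks(chunks: list[str]) -> str:
    """Join consecutive chunks, dropping the longest run of words that
    both ends the text merged so far and starts the next chunk.

    Hypothetical helper for illustration, not the project's implementation.
    """
    merged: list[str] = []
    for chunk in chunks:
        words = chunk.split()
        overlap = 0
        # Try the longest possible overlap first, then shorter ones.
        for k in range(min(len(merged), len(words)), 0, -1):
            if merged[-k:] == words[:k]:
                overlap = k
                break
        merged.extend(words[overlap:])
    return " ".join(merged)
```

For example, merging `"a b c d"`, `"c d e f"`, and `"f g"` gives `"a b c d e f g"`, which is useful when neighbouring retrieved chunks are shown to the LLM as one passage.
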
api

| File | Summary |
| --- | --- |
| [api.py](api/api.py) | This API serves as the core inference engine for the Bookipedia application, providing AI-powered document processing, chat response generation, text-to-speech synthesis, and page summarization. It integrates seamlessly with the application's architecture, ensuring efficient and scalable AI inference. The API supports background tasks for document and chat processing, enabling asynchronous operations and enhanced performance. |
| [schemas.py](api/schemas.py) | This file establishes the structure of request bodies for the API, ensuring consistent and well-formed data input. It defines schemas for chat parameters and text-to-speech requests, facilitating seamless communication between the API and its clients. |
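As a rough picture of what request schemas for chat parameters and text-to-speech might look like, here is a sketch using only the standard library. Every class and field name below (`ChatParams`, `TTSRequest`, `user_prompt`, `doc_ids`, `speed`, and so on) is hypothetical and chosen for illustration; the actual definitions live in schemas.py and are likely written with the web framework's own schema tooling.

```python
from dataclasses import dataclass, field


@dataclass
class ChatParams:
    """Hypothetical request body for a chat question."""
    user_prompt: str
    chat_summary: str = ""                              # running summary of the conversation
    doc_ids: list[str] = field(default_factory=list)    # documents to search in
    enable_web_retrieving: bool = False                 # whether to also search the web

    def __post_init__(self) -> None:
        if not self.user_prompt.strip():
            raise ValueError("user_prompt must not be empty")


@dataclass
class TTSRequest:
    """Hypothetical request body for text-to-speech synthesis."""
    text: str
    speed: float = 1.0

    def __post_init__(self) -> None:
        if not self.text.strip():
            raise ValueError("text must not be empty")
        if self.speed <= 0:
            raise ValueError("speed must be positive")
```

Validating in the schema keeps malformed requests from reaching the model code, which is the role the summary above assigns to schemas.py.
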

Getting Started

System Requirements:

Installation

From source

  1. Clone the bookipedia repository:

    $ git clone https://github.com/yousefmrashad/bookipedia
  2. Change to the project directory:

    $ cd bookipedia
  3. Install the dependencies:

    $ pip install -r requirements.txt

Usage

From source

Run the server using the command below:

$ python api/api.py

Models and Frameworks

AI Models

Tools and Frameworks


License

This project is licensed under the GPL-3.0 License, unless otherwise specified. Certain files may have separate licenses, which can be found in the LICENSES folder.


Team

This part of the project was made possible with the equal contributions of:

And our front-end and back-end teams!

