AI-Human collaboration platform that accelerates systematic reviews by organizing academic literature, building AI tools, and expanding into new domains to enhance the global knowledge catalog.
Design of Microservices for LlamaIndex-based Application on OpenShift #30
Priority: P0 (Critical)
Status: Open
Objective: Finalize the architecture design for LLM-powered microservices with RAG functionality on OpenShift, ensuring modularity, scalability, and fault tolerance.
Description
This ticket outlines the high-level architecture of the microservices for the LlamaIndex-powered application. The design focuses on breaking down the LLM and RAG components into independent microservices, following best practices for OpenShift's containerized environment.
Microservice Architecture Diagram
The architecture will consist of the following seven services. Each will perform a unique role, with API-based interactions, enabling scalability and isolation.
1. Index Management Service
Functionality: Handles index creation, updates, and deletion.
Components: VectorStoreIndex from LlamaIndex.
API Endpoints:
/create_index
/update_index
/delete_index
Dependencies:
Document Upload Service
TiDB for index storage
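As a starting point for the design discussion, the service contract above can be sketched as a minimal in-memory stub. This is an illustrative sketch only: the class and method names are placeholders, and a real implementation would back the registry with LlamaIndex's `VectorStoreIndex` persisted to TiDB.

```python
# Illustrative in-memory stand-in for the Index Management Service contract
# (/create_index, /update_index, /delete_index). Placeholder names; a real
# service would persist a LlamaIndex VectorStoreIndex to TiDB.

class IndexManager:
    def __init__(self):
        self._indexes = {}  # index_id -> list of document ids

    def create_index(self, index_id, doc_ids=None):
        if index_id in self._indexes:
            raise ValueError(f"index {index_id!r} already exists")
        self._indexes[index_id] = list(doc_ids or [])
        return index_id

    def update_index(self, index_id, doc_ids):
        # Raises KeyError for an unknown index, mirroring a 404 response.
        self._indexes[index_id].extend(doc_ids)

    def delete_index(self, index_id):
        self._indexes.pop(index_id, None)  # idempotent delete

    def documents(self, index_id):
        return list(self._indexes.get(index_id, []))
```

Keeping deletion idempotent and update strict mirrors common REST semantics (DELETE returning success for a missing resource, PUT/PATCH failing), which the Swagger specs in the deliverables can then document explicitly.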
2. Embedding Generator Service
Functionality: Generates embeddings from uploaded documents or queries using SciBERT.
Components: LlamaIndex’s EmbeddingRetriever.
API Endpoints:
/generate_embeddings
Dependencies:
Index Management Service
Document Upload Service
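The `/generate_embeddings` contract can be pinned down with a toy, deterministic embedder. This is only a shape sketch: the hash-based vectors below carry no semantic meaning, and the production service would load SciBERT through a transformer library instead.

```python
# Toy stand-in for /generate_embeddings. Production would use SciBERT;
# this deterministic hash-based embedder only fixes the interface:
# a batch of texts in, one fixed-length vector per text out.
import hashlib

def generate_embeddings(texts, dim=8):
    """Return one fixed-length vector (values in [0, 1]) per input text."""
    vectors = []
    for text in texts:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        vectors.append([b / 255.0 for b in digest[:dim]])
    return vectors
```

Fixing the batch-in/batch-out shape early lets the Vector Search and Ingestion services be developed against the stub before the SciBERT-backed version is ready.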
3. Vector Search Service
Functionality: Executes similarity searches on the vector index.
Components: VectorStoreRetriever.
API Endpoints:
/vector_search
/re_rank
Dependencies:
Embedding Generator Service
TiDB for vector storage
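The `/vector_search` behavior is a cosine-similarity top-k over stored vectors. The sketch below keeps the table in a plain dict for illustration; in the actual design the similarity computation would be pushed down to TiDB's vector storage rather than done in application code.

```python
# Sketch of /vector_search: cosine-similarity top-k over an in-memory
# vector table. Illustrative only; the real service delegates the scan
# to the vector store (TiDB) instead of iterating in Python.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def vector_search(query_vec, table, top_k=3):
    """table: dict of doc_id -> vector. Returns (doc_id, score), best first."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in table.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]
```

The separate `/re_rank` endpoint would take the top-k output of this function and re-score it with a heavier model, which is why the two are distinct endpoints on the same service.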
4. Query Engine Service
Functionality: Orchestrates queries, combining results from vector search and metadata filters.
Components: QueryEngine.
API Endpoints:
/query
/query_with_filters
Dependencies:
Vector Search Service
Metadata Management Service
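The orchestration behind `/query_with_filters` is essentially an intersection of vector-search hits with a metadata predicate. A minimal sketch, with illustrative names, of that combining step:

```python
# Sketch of /query_with_filters: keep only vector-search hits whose
# metadata satisfies every requested filter. In the real service, hits
# come from the Vector Search Service and metadata from the Metadata
# Management Service; here both are passed in directly.

def query_with_filters(hits, metadata, filters):
    """hits: [(doc_id, score)] best-first; metadata: doc_id -> dict;
    filters: required key/value pairs (exact match)."""
    def matches(doc_id):
        doc_meta = metadata.get(doc_id, {})
        return all(doc_meta.get(key) == value for key, value in filters.items())
    return [(doc_id, score) for doc_id, score in hits if matches(doc_id)]
```

Filtering after retrieval is the simplest design; a later optimization could push the metadata filter into the vector query itself to avoid over-fetching.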
5. RAG Response Generator Service
Functionality: Synthesizes context-rich responses from retrieved documents using an LLM.
Components: LLMResponseSynthesizer.
API Endpoints:
/generate_context
/chat_with_documents
Dependencies:
Query Engine Service
Vector Search Service
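The core of `/generate_context` is assembling retrieved passages into a grounded prompt under a length budget. The sketch below covers only that assembly step, with illustrative names; the actual LLM call (LlamaIndex's response synthesis) is out of scope here.

```python
# Sketch of the context-assembly step behind /generate_context: stitch
# retrieved passages into a grounded prompt, stopping at a character
# budget so the prompt fits the model's context window.

def build_context_prompt(question, passages, max_chars=2000):
    """passages: [(doc_id, text)] best-first. Returns the prompt string."""
    context, used = [], 0
    for doc_id, text in passages:
        if used + len(text) > max_chars:
            break  # respect the context budget; passages are best-first
        context.append(f"[{doc_id}] {text}")
        used += len(text)
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}\nAnswer:"
```

Tagging each passage with its `doc_id` lets the generated answer cite sources, which matters for systematic-review traceability.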
6. Metadata Management Service
Functionality: Manages document metadata and ensures accurate tagging (e.g., PICO extraction).
Components: StructuredStore.
API Endpoints:
/update_metadata
/get_metadata
Dependencies:
Index Management Service
RAG Response Generator Service
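The metadata contract, including PICO-style tagging, can be sketched as a small structured store. This is a dict-based illustration of the `/update_metadata` and `/get_metadata` semantics only; production would use a persistent structured store behind the same interface.

```python
# Dict-based sketch of the Metadata Management contract. Illustrative
# only; it fixes /update_metadata and /get_metadata semantics and adds a
# helper for spotting documents with incomplete PICO tagging.

class MetadataStore:
    PICO_FIELDS = ("population", "intervention", "comparison", "outcome")

    def __init__(self):
        self._meta = {}

    def update_metadata(self, doc_id, **fields):
        # Partial update: merge new fields, keep existing ones.
        self._meta.setdefault(doc_id, {}).update(fields)

    def get_metadata(self, doc_id):
        return dict(self._meta.get(doc_id, {}))

    def missing_pico(self, doc_id):
        """PICO fields not yet tagged for this document."""
        meta = self._meta.get(doc_id, {})
        return [field for field in self.PICO_FIELDS if field not in meta]
```

A `missing_pico`-style check gives the review workflow a concrete way to flag documents whose extraction is incomplete before they enter the index.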
7. Document Upload & Ingestion Service
Functionality: Manages document uploads, ingestion, and embedding generation pipelines.
API Endpoints:
/upload_document
/ingest_document
Dependencies:
Embedding Generator Service
Metadata Management Service
Index Management Service
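The ingestion pipeline ties the three dependencies together: embed the document, record its metadata, and add it to the index. The sketch below models each downstream service as an injected callback standing in for an HTTP call; all names are illustrative.

```python
# End-to-end sketch of /ingest_document. Each callback stands in for a
# call to a downstream service (Embedding Generator, Metadata Management,
# Index Management); in production these would be HTTP clients.

def ingest_document(doc_id, text, *, embed, store_metadata, add_to_index):
    """Run the ingestion pipeline and return a small receipt."""
    vector = embed(text)                      # Embedding Generator Service
    store_metadata(doc_id, {"length": len(text)})  # Metadata Management Service
    add_to_index(doc_id, vector)              # Index Management Service
    return {"doc_id": doc_id, "dim": len(vector)}
```

Injecting the service clients keeps the pipeline testable in isolation and makes the dependency order (embed, then metadata, then index) explicit in code.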
Deliverables
API Documentation: Swagger/OpenAPI specs for all services.
Containerization: Dockerfiles for each microservice.
Deployment Manifests: YAML files for OpenShift deployment.
This P0 ticket will act as the foundation for designing and implementing the architecture. Once the architecture is finalized, sub-tasks will be created for the development and deployment of each microservice.
Next Steps:
Confirm the architecture design and dependencies.
Generate API documentation with Swagger.
Begin breaking down each service into individual tasks and development milestones.