SciPhi-AI / R2R

The most advanced Retrieval-Augmented Generation (RAG) system, containerized and RESTful
https://r2r-docs.sciphi.ai/
MIT License
3.65k stars 271 forks

Feature/add document summary to ingestion #1573

Status: Closed (emrgnt-cmplxty closed this 6 days ago)

emrgnt-cmplxty commented 6 days ago

[!IMPORTANT] Adds document summary generation to the ingestion process and refactors the search settings classes into SearchSettings.

  • Behavior:
    • Introduces document summary generation during ingestion in ingestion_service.py and ingestion_workflow.py.
    • Adds augment_document_info() to generate document summaries with an LLM and store their embeddings.
    • Adds an AUGMENTING value to the ingestion status.
  • Search Settings:
    • Renames VectorSearchSettings and DocumentSearchSettings to SearchSettings across multiple files.
    • Updates search methods to use SearchSettings in retrieval_service.py, retrieval_router.py, and vector_search_pipe.py.
  • Database:
    • Modifies PostgresDocumentHandler to include summary and summary_embedding fields.
    • Adds full-text and semantic search capabilities in document.py.
  • Configuration:
    • Updates configuration files to include document summary settings.
    • Adds default_summary.yaml for summary prompt configuration.
  • Misc:
    • Refactors search pipelines and pipes to accommodate new search settings.
    • Updates API models and responses to reflect changes in document search results.
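
The augmentation step described under Behavior can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: only the names augment_document_info, AUGMENTING, summary, and summary_embedding come from the description above. The function signature, the other status values, the DocumentInfo class, the max_chars truncation, and the fake_summarize / fake_embed stand-ins for the LLM and embedding calls are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional, Sequence


class IngestionStatus(str, Enum):
    # Only AUGMENTING is named in the PR description; the others are illustrative.
    PARSING = "parsing"
    AUGMENTING = "augmenting"
    SUCCESS = "success"


@dataclass
class DocumentInfo:
    # Hypothetical record; the PR stores summary and summary_embedding in Postgres.
    id: str
    status: IngestionStatus = IngestionStatus.PARSING
    summary: Optional[str] = None
    summary_embedding: Optional[List[float]] = None
    history: List[IngestionStatus] = field(default_factory=list)

    def set_status(self, status: IngestionStatus) -> None:
        """Record every status transition so the flow can be inspected."""
        self.status = status
        self.history.append(status)


def augment_document_info(
    doc: DocumentInfo,
    chunks: Sequence[str],
    summarize: Callable[[str], str],
    embed: Callable[[str], List[float]],
    max_chars: int = 2000,
) -> DocumentInfo:
    """Hypothetical signature: summarize the chunk text, then embed the summary."""
    doc.set_status(IngestionStatus.AUGMENTING)
    text = "\n".join(chunks)[:max_chars]  # assumed cap on the text sent to the LLM
    doc.summary = summarize(text)
    doc.summary_embedding = embed(doc.summary)
    doc.set_status(IngestionStatus.SUCCESS)
    return doc


def fake_summarize(text: str) -> str:
    """Stand-in for an LLM call: keep only the first sentence."""
    return text.split(".")[0].strip() + "."


def fake_embed(text: str) -> List[float]:
    """Stand-in for an embedding model: two toy features."""
    return [float(len(text)), float(text.count(" "))]


doc = augment_document_info(
    DocumentInfo(id="doc-1"),
    ["R2R ingests documents. It also searches them."],
    fake_summarize,
    fake_embed,
)
```

Passing the summarizer and embedder in as callables keeps the sketch runnable without any provider; in a real service these would be the configured LLM and embedding clients, and the summary prompt would presumably come from default_summary.yaml.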

This description was created by Ellipsis for 39c5faeed6c6eb3073f02f8e51288a7f02c17dd4. It will automatically update as commits are pushed.

vercel[bot] commented 6 days ago

The latest updates on your projects.

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| yc-demo | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Nov 11, 2024 11:24pm |