SciPhi-AI / R2R

The most advanced Retrieval-Augmented Generation (RAG) system, containerized and RESTful
https://r2r-docs.sciphi.ai/
MIT License
3.65k stars 270 forks

cleanup #1512

Status: Closed (emrgnt-cmplxty closed 2 weeks ago)

emrgnt-cmplxty commented 2 weeks ago

[!IMPORTANT] Replaced Zerox parser with vision-based parsers for PDF and image processing, updated ingestion configurations, and refactored related code.

  • Behavior:
    • Removed Zerox parser and related configurations from action.yml, graphrag.mdx, and full.toml.
    • Added vision-based parsers VLMPDFParser and BasicPDFParser for PDF processing in pdf_parser.py.
    • Updated ImageParser in img_parser.py to use vision models for image processing.
    • Updated AudioParser in audio_parser.py to use Whisper transcription.
  • Ingestion and Parsing:
    • Updated IngestionConfig in ingestion.py to include vision model configurations.
    • Refactored R2RIngestionProvider and UnstructuredIngestionProvider to use new parsers.
    • Removed Zerox-related code and dependencies from pyproject.toml.
  • Misc:
    • Added new prompts vision_img.yaml and vision_pdf.yaml for vision-based parsing.
    • Updated R2RAuthProvider in r2r_auth.py to use send_password_reset_email().
    • Minor updates to logging and configuration files.
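
As a rough illustration of the change described above, the sketch below shows one way ingestion settings could decide between a vision-based PDF parser and a basic text parser. It is not the code from this PR: `IngestionSettings`, its `vision_pdf_model` and `vision_img_model` fields, `build_parsers`, and the placeholder parser functions are hypothetical names, and the "use the basic parser when no vision model is configured" rule is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Parser = Callable[[bytes], str]


@dataclass
class IngestionSettings:
    """Hypothetical stand-in for vision-related ingestion settings."""
    vision_pdf_model: Optional[str] = None  # assumed field name
    vision_img_model: Optional[str] = None  # assumed field name


def basic_pdf_parse(data: bytes) -> str:
    """Placeholder for a plain text-extraction PDF parser."""
    return f"basic:{len(data)} bytes"


def make_vlm_pdf_parse(model: str) -> Parser:
    """Placeholder for a parser that would send page images to a vision model."""
    def parse(data: bytes) -> str:
        return f"vlm[{model}]:{len(data)} bytes"
    return parse


def build_parsers(settings: IngestionSettings) -> Dict[str, Parser]:
    """Choose the PDF parser based on whether a vision model is configured."""
    if settings.vision_pdf_model:
        pdf_parser = make_vlm_pdf_parse(settings.vision_pdf_model)
    else:
        # Assumed behavior: use basic extraction when no vision model is set.
        pdf_parser = basic_pdf_parse
    return {"pdf": pdf_parser}


if __name__ == "__main__":
    parsers = build_parsers(IngestionSettings(vision_pdf_model="example-model"))
    print(parsers["pdf"](b"%PDF-1.7"))
```

Keeping the choice in one factory function means the rest of the ingestion pipeline only sees a mapping from document type to parser, so swapping one parser for another stays a configuration change.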

This description was created by Ellipsis for 760606fc549b6f0db58a5576542f0b66a3659020. It will automatically update as commits are pushed.

vercel[bot] commented 2 weeks ago

The latest updates on your projects. Learn more about Vercel for Git.

| Name | Status | Preview | Comments | Updated (UTC) |
| :--- | :----- | :------ | :------- | :------ |
| yc_demo | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Oct 28, 2024 11:55pm |
| yc-demo | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Oct 28, 2024 11:55pm |

1 Skipped Deployment

| Name | Status | Preview | Comments | Updated (UTC) |
| :--- | :----- | :------ | :------- | :------ |
| **recommendation_platform** | ⬜️ Ignored ([Inspect](https://vercel.com/my-team-88dd52c0/recommendation_platform/DdMFLrV5EMzkpWESXnqKdBe1Qz4S)) | | | Oct 28, 2024 11:55pm |