weaviate / Verba

Retrieval Augmented Generation (RAG) chatbot powered by Weaviate
BSD 3-Clause "New" or "Revised" License
6.4k stars 692 forks source link

Add Upstage AI Integration (Document Parse, Embeddings, and Generation) #314

Open hunkim opened 3 weeks ago

hunkim commented 3 weeks ago

This PR introduces comprehensive integration with Upstage AI services, adding three new components.

They have high performance. (For more information, visit https://www.upstage.ai/ and https://www.upstage.ai/products/document-parse)

image

image
  1. Document Processing:

    • Added UpstageDocumentParseReader for converting documents to structured HTML format
    • Integrated with Upstage Document AI API for enhanced document parsing
  2. Embedding Support:

    • Added UpstageEmbedder supporting 'embedding-query' and 'embedding-passage' models
    • Implemented vector generation for both documents and queries
  3. Text Generation:

    • Added UpstageGenerator supporting Solar models (solar-pro, solar-mini)
    • Implemented streaming response generation with context handling

Configuration Updates:

The integration is available in both production and development environments, maintaining consistent API patterns with existing integrators.

Testing:

Dependencies:

Documentation:

This integration enhances Verba's capabilities with state-of-the-art document processing, embedding, and generation features from Upstage AI.

weaviate-git-bot commented 3 weeks ago

To avoid any confusion in the future about your contribution to Weaviate, we work with a Contributor License Agreement. If you agree, you can simply add a comment to this PR that you agree with the CLA so that we can merge.

beep boop - the Weaviate bot 👋🤖

PS:
Are you already a member of the Weaviate Slack channel?

hunkim commented 3 weeks ago

If you agree, you can simply add a comment to this PR that you agree with the CLA so that we can merge.

I agree. Thanks!!