open-webui / open-webui

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
https://openwebui.com
MIT License

enh: Notification and Handling for Failed Embeddings #6311

Open bgeneto opened 1 month ago

bgeneto commented 1 month ago

Feature Request: Notification and Handling for Failed Embeddings in Open-WebUI

Description:

Currently, Open-WebUI displays a spinning progress indicator during document uploads (the embedding process). However, if an embedding request fails for any reason (for example HTTP 408, 429, 502, or 504), the document is still uploaded, resulting in an empty context (empty vector database) without any user notification. This lack of feedback creates a frustrating user experience, as uploaded documents become unusable for RAG queries without any indication of the underlying issue.

Proposed Solution:

Implement a system to handle failed embeddings more robustly, including:

  1. Clear Error Notification: Display an error notification to the user if an embedding fails. This message should clearly indicate that the embedding process was unsuccessful and the document's context remains empty.
  2. Prevent Upload on Failure: Optionally, provide a setting to prevent document uploads if the embedding process fails. This would ensure that only documents with successfully generated embeddings are added to the database.
  3. Detailed Error Logging: Log detailed information about the embedding failure, including the specific error encountered, the document being processed, and any relevant parameters. This information can be invaluable for troubleshooting and identifying the root cause of the issue.
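
To make the three points above concrete, here is a minimal sketch in Python. It is not Open-WebUI's actual code: `process_upload`, `UploadResult`, `EmbeddingError`, and the `reject_on_failure` flag are hypothetical names used only for illustration, and `embed` stands in for whatever embedding function the backend is configured with.

```python
import logging
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

log = logging.getLogger("embedding_sketch")

Vector = List[float]


class EmbeddingError(Exception):
    """Raised when a document could not be embedded (hypothetical)."""


@dataclass
class UploadResult:
    filename: str
    stored: bool          # True if the document was kept despite any failure
    error: Optional[str]  # message the UI could show as a notification


def process_upload(
    filename: str,
    chunks: Sequence[str],
    embed: Callable[[Sequence[str]], Optional[List[Vector]]],
    reject_on_failure: bool = True,
) -> UploadResult:
    """Embed the chunks and report failure instead of failing silently."""
    try:
        vectors = embed(chunks)
        # Treat a missing or incomplete result as a failure (point 1).
        if vectors is None or len(vectors) != len(chunks):
            got = "none" if vectors is None else str(len(vectors))
            raise EmbeddingError(f"expected {len(chunks)} embeddings, got {got}")
    except Exception as exc:
        # Point 3: log the error, the document, and the chunk count.
        log.exception("Embedding failed for %r (%d chunks)", filename, len(chunks))
        # Point 2: optionally refuse to keep the document.
        return UploadResult(
            filename, stored=not reject_on_failure, error=f"Embedding failed: {exc}"
        )
    return UploadResult(filename, stored=True, error=None)
```

With something like this, the frontend could show `result.error` as a notification whenever it is not `None`, and the setting from point 2 maps onto `reject_on_failure`.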

Benefits:

Additional Considerations:

This feature is crucial for ensuring a smooth and reliable user experience with Open-WebUI, particularly when relying on RAG functionality. By addressing failed embeddings proactively, we can prevent silent errors and empower users to manage their documents effectively.

Is your feature request related to a problem? Please describe.

I'm always frustrated when I use RAG to query a document and the context is empty because the embedding process failed without any log entry or notification.

tjbck commented 1 month ago

This might be addressed with 0.3.35, testing wanted here!

bgeneto commented 1 month ago

> This might be addressed with 0.3.35, testing wanted here!

Did a quick test with v0.3.35 and I'm still getting the error below in the logs, but nothing appears in the web UI. The document appears to have been uploaded successfully (but it hasn't been). Steps to reproduce:

ERROR [open_webui.apps.retrieval.main] 'NoneType' object is not iterable
Traceback (most recent call last):
  File "/app/backend/open_webui/apps/retrieval/main.py", line 734, in save_docs_to_vector_db
    embeddings = embedding_function(
                 ^^^^^^^^^^^^^^^^^^^
  File "/app/backend/open_webui/apps/retrieval/utils.py", line 308, in <lambda>
    return lambda query: generate_multiple(query, func)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/backend/open_webui/apps/retrieval/utils.py", line 303, in generate_multiple
    embeddings.extend(func(query[i : i + embedding_batch_size]))
TypeError: 'NoneType' object is not iterable
Collection file-a58fa935-a11f-45ee-a331-24cece1541bf does not exist.
400 Client Error: Bad Request for url: http://litellm:4000/v1/embeddings
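
Reading the traceback, the `TypeError` seems to come from `embeddings.extend(...)` receiving `None` after the embeddings request failed with the 400. A guard along these lines would surface the real cause instead. This is only a sketch under that assumption: `generate_multiple_checked` and its `batch_size` parameter are hypothetical names, not the actual code in `utils.py`.

```python
from typing import Callable, List, Optional, Sequence

Vector = List[float]


def generate_multiple_checked(
    texts: Sequence[str],
    func: Callable[[Sequence[str]], Optional[List[Vector]]],
    batch_size: int = 8,
) -> List[Vector]:
    """Embed texts in batches, failing loudly if a batch returns nothing."""
    embeddings: List[Vector] = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i : i + batch_size]
        result = func(batch)
        if result is None:
            # Without this check, extend(None) raises the opaque
            # "'NoneType' object is not iterable" seen in the log.
            raise RuntimeError(
                f"embedding backend returned no data for batch starting at index {i}"
            )
        embeddings.extend(result)
    return embeddings
```

The caller could then catch the `RuntimeError` and report it to the UI, rather than leaving an empty collection behind.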