open-webui / pipelines

Pipelines: Versatile, UI-Agnostic OpenAI-Compatible Plugin Framework
MIT License

bug: Chat Hangs Indefinitely When Using Any Filter #100

Open Highsight opened 3 months ago

Highsight commented 3 months ago

Whenever I attempt to add any sort of filter to any model in Open WebUI, chatting with the model results in an indefinite hang. Pipelines and manifolds do not appear to have this issue, only filters. The hang lasts until I shut down the Pipelines Docker container, at which point the model responds without the filter applied. I have tried this with multiple filters, including detoxify_filter_pipeline.py, llm_translate_filter_pipeline.py and home_assistant_filter.py.
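For context on what a filter does at chat time, here is a minimal sketch of the filter shape used by the example files named above (hedged: real filters also declare a pydantic `Valves` model for their settings, which I've omitted to keep this dependency-free):

```python
import asyncio

class Pipeline:
    """Minimal sketch of a Pipelines filter (illustrative only)."""

    def __init__(self):
        # "filter" pipelines intercept requests and responses rather
        # than serving a model themselves
        self.type = "filter"
        self.name = "Passthrough Filter"

    async def inlet(self, body: dict, user=None) -> dict:
        # Runs BEFORE the request reaches the model; if a call made
        # here never returns, the whole chat hangs
        return body

    async def outlet(self, body: dict, user=None) -> dict:
        # Runs AFTER the model responds
        return body
```

Both hooks receive and must return the request/response `body`, so a passthrough call like `asyncio.run(Pipeline().inlet({"messages": []}))` simply returns the body unchanged; a hang in `inlet` blocks the model from ever being called.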

I am running pipelines:latest-cuda and open-webui:dev-cuda; I have also tried open-webui:cuda and pipelines:latest with no difference. My machine uses an NVIDIA RTX 2060, and I am using the llama3:8b model.

For the Home Assistant Filter I have the following values set:
  • Openai Api Base Url: http://host.docker.internal:9099
  • Openai Api Key: 0p3n-w3bu!
  • Task Model: llama3:8b

I'd appreciate any assistance in figuring this out; the Filter technology seems very interesting and I'd like to get to know it better.

pressdarling commented 3 months ago

@Highsight I'm not an expert at this but if you've defined the filters' OpenAI API URL as the Pipelines base URL, then you're creating a loop. You're not running llama3:8b within Pipelines!

Try adding another OpenAI API URL/Key pair, and set the second OpenAI Base URL to the value of OLLAMA_BASE_URL (e.g. http://host.docker.internal:11434, http://ollama:11434, or http://localhost:11434, per the troubleshooting guide, depending on how and where you're hosting Ollama). This connection is separate from both the Pipelines one (which you have there now) and the Ollama one.
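To illustrate the suspected loop (a purely hypothetical simulation, not actual Pipelines code; all names and the depth cap are made up): if the filter's task model is reached through the Pipelines base URL, each chat request re-enters Pipelines, which re-runs the filter, and no response is ever produced.

```python
PIPELINES_URL = "http://host.docker.internal:9099"
OLLAMA_URL = "http://host.docker.internal:11434"

def handle_chat(body: dict, task_model_url: str, depth: int = 0) -> str:
    """Toy model of Pipelines applying a filter whose task model
    lives at task_model_url (hypothetical simulation)."""
    if depth > 10:
        # the real setup has no depth limit: the chat simply hangs
        raise RecursionError("filter request loop: chat hangs")
    if task_model_url == PIPELINES_URL:
        # the filter's task-model call re-enters Pipelines,
        # which applies the filter again, and so on
        return handle_chat(body, task_model_url, depth + 1)
    # pointing the task model at Ollama instead breaks the cycle
    return f"response from {task_model_url}"
```

In this toy model, `handle_chat({}, OLLAMA_URL)` returns immediately, while `handle_chat({}, PIPELINES_URL)` never produces a response, which matches the "hangs until the Pipelines container is stopped" symptom.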

While I don't know if this will actually fix it (I've had more pressing things to do than play with the pipeline filter configurations), it should at least stop the whole thing from hanging. I think there are enough breadcrumbs in the code of those files if you want to figure out a more robust solution.