h2oai / h2ogpt

Private chat with local GPT with documents, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
http://h2o.ai
Apache License 2.0

Enable Vision Models window in the UI via docker #1420

Closed: vitalyshalumov closed this issue 6 months ago

vitalyshalumov commented 6 months ago

Please provide guidelines for enabling Vision Models window in the UI for image upload.

pseudotensor commented 6 months ago

The FAQ instructions for launching llava and the arguments to pass to h2ogpt are the same here:

https://github.com/h2oai/h2ogpt/blob/main/docs/FAQ.md#llava-vision-models

The docker arguments are nothing special, but make sure to pass through the port the llava gradio server uses, e.g. 7861 in the FAQ example above. Options like --llava_model also need to be passed along.

If these two things (port and llava_model) are not clear, let me know.
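As a rough sketch (not a tested command), a docker invocation could look like the following. The image name, entrypoint path, and model names are placeholders, and it assumes the llava gradio server from the FAQ is already running on port 7861:

```bash
# Sketch only: <h2ogpt-image>, <your_llm>, and <llava-host> are placeholders.
# Assumes the llava gradio server from the FAQ is already listening on
# port 7861; 7860 is the usual gradio port for the h2ogpt UI itself.
# If llava runs on the host rather than inside the container, use the host's
# IP (or --network=host) instead of localhost in --llava_model.
docker run --gpus all --shm-size=2g \
  -p 7860:7860 \
  -p 7861:7861 \
  -v "$HOME/.cache:/workspace/.cache" \
  <h2ogpt-image> /workspace/generate.py \
    --base_model=<your_llm> \
    --llava_model=http://<llava-host>:7861
```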

vitalyshalumov commented 6 months ago

If I use TGI, how do I specify the inputs?

pseudotensor commented 6 months ago

Did you try the two things I mentioned? The llava setup is unrelated to TGI.

vitalyshalumov commented 6 months ago

I'm trying to use TGI for llava inference, as I would for any other model. Besides the --llava_model option, how is this different from LLM use cases?

pseudotensor commented 6 months ago

I don't think TGI, vLLM, etc. support llava. I know ollama does, which may be better, especially for smaller systems.
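As a minimal sketch of the ollama route (assuming ollama is installed locally; the prompt and image file name are just examples):

```bash
# Sketch only, assuming ollama is installed and its server is reachable
# at the default port 11434.
ollama pull llava
ollama serve &   # skip if ollama is already running as a service

# Query ollama's native generate endpoint with a base64-encoded image.
# "base64 -w0" is GNU coreutils; on macOS, plain "base64 < example.jpg" works.
curl http://localhost:11434/api/generate -d '{
  "model": "llava",
  "prompt": "Describe this image.",
  "images": ["'"$(base64 -w0 example.jpg)"'"]
}'
```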