h2oai / h2o-llmstudio

H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/
https://h2o.ai
Apache License 2.0

Deploying LLM Studio with the Docker image on RunPod fails because the web server starts on localhost instead of 0.0.0.0 - should be configurable #557

Closed fbellame closed 7 months ago

fbellame commented 10 months ago

🐛 Bug

RunPod is a popular, inexpensive cloud GPU provider. I want to build a tutorial on how to easily fine-tune a small LLM like Mistral 7B without owning a GPU. I love LLM Studio because you can do it pretty easily. I own a pretty good GPU myself, but most folks (developers not in data science) don't own one. So I decided to try to deploy LLM Studio on RunPod. It didn't work, so I reached out to RunPod support, who told me the server is required to listen on 0.0.0.0, not localhost.

I went through the LLM Studio documentation and skimmed the open-source code a bit, but didn't manage to find a way to configure that.

Here is the scenario:

Deploying LLM Studio with the Docker image on RunPod fails because the web server starts on localhost instead of 0.0.0.0 - should be configurable.

To Reproduce

Deploy a RunPod container with the Docker image.

Put your RunPod API key in RUNPOD_KEY (you need a RunPod account).
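For instance, in a POSIX shell (the key value below is a placeholder; substitute your own):

export RUNPOD_KEY=<your-runpod-api-key>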

curl --request POST \
  --header 'content-type: application/json' \
  --url "https://api.runpod.io/graphql?api_key=${RUNPOD_KEY}" \
  --data '{"query": "mutation { podFindAndDeployOnDemand( input: { cloudType: ALL, gpuCount: 1, volumeInGb: 50, containerDiskInGb: 40, gpuTypeId: \"NVIDIA GeForce RTX 3080\", name: \"h2o-llmstudio\", imageName: \"gcr.io/vorvan/h2oai/h2o-llmstudio:nightly\", dockerArgs: \"\", ports: \"10101/http\", volumeMountPath: \"/data\" } ) { id imageName env machineId machine { podHostId } } }"}'

The deployment looks successful, but accessing the app at:

https://[pod-id]-10101.proxy.runpod.net (replace pod-id with the id of the pod you just deployed)

produces this error in the browser:

Disconnected. Reconnecting in 16s Make sure your wave server is running and the environment network policies allow websocket connections
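As a quick sanity check, you can confirm the proxy endpoint answers plain HTTP before digging into the websocket error (substitute the pod id returned by the mutation above):

export POD_ID=<pod-id>
curl -I "https://${POD_ID}-10101.proxy.runpod.net"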

LLM Studio version

Any recent version; I use the nightly Docker image build: gcr.io/vorvan/h2oai/h2o-llmstudio:nightly

pascal-pfeiffer commented 10 months ago

Thank you for reporting this. Without having tested it yet, I'll quickly mention the env vars H2O_WAVE_ADDRESS (plus H2O_WAVE_LISTEN); setting them inside the Docker container may already unblock you.

https://wave.h2o.ai/docs/configuration#h2o_wave_address
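A minimal sketch of that idea for a local run (assuming the app is served on port 10101 as in the deployment above): H2O_WAVE_LISTEN sets the host:port the Wave server binds to (an empty host means all interfaces), while H2O_WAVE_ADDRESS is the address the app itself uses to reach the server.

# bind the Wave server to all interfaces instead of localhost
docker run -p 10101:10101 \
  -e H2O_WAVE_LISTEN=":10101" \
  -e H2O_WAVE_ADDRESS="http://127.0.0.1:10101" \
  gcr.io/vorvan/h2oai/h2o-llmstudio:nightly

On RunPod, the same variables would go into the pod's environment rather than a local docker run.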

fbellame commented 10 months ago

Thanks a lot, it helped a little, but now I have a new error message in the log on the container:

{"err":"websocket: request origin not allowed by Upgrader.CheckOrigin","t":"socket_upgrade"}

Looks like another config option is required to allow the websocket to work!

pascal-pfeiffer commented 7 months ago

H2O Wave recently added a feature that allows configuration of the websocket origins. https://github.com/h2oai/wave/pull/2279

Please check the latest H2O LLM Studio version, which includes this H2O Wave feature. The new env variable for the setting is H2O_WAVE_ALLOWED_ORIGINS.
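For the RunPod setup above, a sketch of how this could look (assuming the variable accepts the full origin, scheme plus host, of the proxy URL; <pod-id> is a placeholder):

# allow websocket upgrades from the RunPod proxy origin
docker run -p 10101:10101 \
  -e H2O_WAVE_LISTEN=":10101" \
  -e H2O_WAVE_ALLOWED_ORIGINS="https://<pod-id>-10101.proxy.runpod.net" \
  gcr.io/vorvan/h2oai/h2o-llmstudio:nightly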

Please reopen if that didn't resolve your issue.