Closed. fbellame closed this issue 7 months ago.
Thank you for reporting this. Without having tested it yet, I'll quickly mention the env vars H2O_WAVE_ADDRESS (and H2O_WAVE_LISTEN); setting these inside the Docker container may already unblock you.
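An untested sketch of what that could look like when starting the container (if I recall the semantics correctly, H2O_WAVE_LISTEN is the address the Wave server binds to and H2O_WAVE_ADDRESS is the address the app uses to reach it; please verify against the Wave docs):

docker run \
  -e H2O_WAVE_LISTEN=":10101" \
  -e H2O_WAVE_ADDRESS="http://127.0.0.1:10101" \
  -p 10101:10101 \
  gcr.io/vorvan/h2oai/h2o-llmstudio:nightly

With H2O_WAVE_LISTEN=":10101" the server should bind to 0.0.0.0:10101 instead of localhost only, which is what RunPod requires.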
Thanks a lot, it helped a little bit, but now I have a new error message in the container log:
{"err":"websocket: request origin not allowed by Upgrader.CheckOrigin","t":"socket_upgrade"}
Looks like another config is required to allow websockets to work!
H2O Wave recently added a feature that allows configuration of the websocket origins. https://github.com/h2oai/wave/pull/2279
Please check the latest H2O LLM Studio version, which includes this feature of H2O Wave. The new env variable for the setting is H2O_WAVE_ALLOWED_ORIGINS.
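For the RunPod proxy URL from this issue, that would look something like this, combined with the listen settings mentioned earlier (a sketch; I'm assuming the variable takes a comma-separated list of allowed origins, so please verify the exact format in the Wave docs):

docker run \
  -e H2O_WAVE_LISTEN=":10101" \
  -e H2O_WAVE_ALLOWED_ORIGINS="https://[pod-id]-10101.proxy.runpod.net" \
  -p 10101:10101 \
  gcr.io/vorvan/h2oai/h2o-llmstudio:nightly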
Please reopen if that didn't resolve your issue.
🐛 Bug
RunPod is a popular, inexpensive cloud GPU provider (see the current RunPod docs). I want to build a tutorial on how to easily fine-tune a small LLM like Mistral 7B without owning a GPU. I love LLM Studio because you can do it pretty easily. I own a pretty good GPU myself, but most folks (developers not in data science) don't own one. So I decided to try to deploy LLM Studio with RunPod. It didn't work, so I reached out to RunPod support, who told me that it is a requirement to start the server on 0.0.0.0 and not localhost.
I went through the LLM Studio documentation and also skimmed the open-source code a bit, but didn't manage to find a way to configure that.
Here is the scenario:
Deploying LLM Studio on RunPod with the Docker image fails because the web server starts on localhost instead of 0.0.0.0; this should be configurable.
To Reproduce
Deploy a RunPod container with the Docker image.
Put your RunPod key in RUNPOD_KEY (you need a RunPod account):
curl --request POST \
  --header 'content-type: application/json' \
  --url "https://api.runpod.io/graphql?api_key=${RUNPOD_KEY}" \
  --data '{"query": "mutation { podFindAndDeployOnDemand( input: { cloudType: ALL, gpuCount: 1, volumeInGb: 50, containerDiskInGb: 40, gpuTypeId: \"NVIDIA GeForce RTX 3080\", name: \"h2o-llmstudio\", imageName: \"gcr.io/vorvan/h2oai/h2o-llmstudio:nightly\", dockerArgs: \"\", ports: \"10101/http\", volumeMountPath: \"/data\" } ) { id imageName env machineId machine { podHostId } } }"}'
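If you want to capture the new pod ID directly for the URL in the next step, you can parse the response with jq. A small sketch, assuming the response follows the usual GraphQL envelope (.data.podFindAndDeployOnDemand); double-check against the actual payload:

# Same mutation as above, with the response captured into a variable.
RESPONSE=$(curl -s --request POST \
  --header 'content-type: application/json' \
  --url "https://api.runpod.io/graphql?api_key=${RUNPOD_KEY}" \
  --data '{"query": "mutation { podFindAndDeployOnDemand( input: { cloudType: ALL, gpuCount: 1, volumeInGb: 50, containerDiskInGb: 40, gpuTypeId: \"NVIDIA GeForce RTX 3080\", name: \"h2o-llmstudio\", imageName: \"gcr.io/vorvan/h2oai/h2o-llmstudio:nightly\", dockerArgs: \"\", ports: \"10101/http\", volumeMountPath: \"/data\" } ) { id imageName env machineId machine { podHostId } } }"}')
# Assumes the standard GraphQL response shape; adjust the path if RunPod wraps it differently.
POD_ID=$(echo "$RESPONSE" | jq -r '.data.podFindAndDeployOnDemand.id')
echo "App URL: https://${POD_ID}-10101.proxy.runpod.net"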
The deployment looks successful, but accessing the app at this URL:
https://[pod-id]-10101.proxy.runpod.net (replace [pod-id] with the ID of the pod you just deployed)
generates this error in the browser:
Disconnected. Reconnecting in 16s Make sure your wave server is running and the environment network policies allow websocket connections
LLM Studio version
Any recent version; I use the nightly Docker image build: gcr.io/vorvan/h2oai/h2o-llmstudio:nightly