Closed g-eoj closed 2 months ago
Duplicate of https://github.com/h2oai/wave/issues/2043. We had a Slack discussion about this and learned that MLOps is deployed in such a way that the Wave app pod is starved of resources (due to DAI, IIRC). Since waved sees the Wave app as unreachable (it gets an RST packet), it considers the app dead and drops it.
This behavior can be altered by setting the H2O_WAVE_KEEP_APP_LIVE env var, but the Wave team's recommendation is to fix the deployment so that the pod is guaranteed at least minimal resources, so that the process is not suspended (and does not send an RST packet).
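For anyone who wants to try the env var workaround, here is a minimal sketch of starting waved with H2O_WAVE_KEEP_APP_LIVE set. The value "true", the helper name waved_env, and the ./waved path are assumptions, not something confirmed in this thread; check the Wave configuration docs for the accepted values.

```python
import os


def waved_env(base=None):
    """Return a copy of the environment with H2O_WAVE_KEEP_APP_LIVE set.

    waved_env is a hypothetical helper name used only for illustration.
    """
    env = dict(os.environ if base is None else base)
    # Assumption: waved treats this variable as a boolean switch and
    # "true" turns it on. Verify against the Wave configuration docs.
    env["H2O_WAVE_KEEP_APP_LIVE"] = "true"
    return env


# Usage (hypothetical binary path, adjust for the actual deployment):
#   import subprocess
#   subprocess.run(["./waved"], env=waved_env(), check=True)
```

This only changes how waved reacts to an unreachable app; it does not address the pod being starved of resources.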
Wave SDK Version, OS
Wave 1.1.1, H2O Cloud app store
Actual behavior
If a message like
2024/03/28 14:19:14 # {"error":"request failed: Post \"http://127.0.0.1:8000\": read tcp 127.0.0.1:35236-\u003e127.0.0.1:8000: read: connection reset by peer","host":"http://127.0.0.1:8000","route":"/","t":"app"}
appears in the logs, the app gets stuck. Refreshing the page or opening a new window does not fix it.
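To check whether a deployment is hitting this, a small script can scan the waved logs for the message above. This is only a sketch: it assumes the relevant lines have the shape "date time # {json}" as in the sample, and the function name is made up for illustration.

```python
import json

# The log line quoted above, kept verbatim as a raw string.
SAMPLE = r'2024/03/28 14:19:14 # {"error":"request failed: Post \"http://127.0.0.1:8000\": read tcp 127.0.0.1:35236-\u003e127.0.0.1:8000: read: connection reset by peer","host":"http://127.0.0.1:8000","route":"/","t":"app"}'


def is_app_connection_reset(line):
    """Return True if a log line reports a connection reset from the app."""
    # Assumption: the JSON payload follows the first "# " on the line.
    _, marker, payload = line.partition("# ")
    if not marker:
        return False
    try:
        record = json.loads(payload)
    except json.JSONDecodeError:
        return False
    if not isinstance(record, dict):
        return False
    # Match on the record type and the error text seen in this issue.
    error = str(record.get("error", ""))
    return record.get("t") == "app" and "connection reset by peer" in error
```

Calling is_app_connection_reset(SAMPLE) on the line from this report flags it, while lines without the "# " marker or without a JSON payload are ignored.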
I do not have a simple repro. I can share that continually clicking a button that triggers the following code, where q.client.deployment is an H2O MLOps Python client ref, will cause the error: