Closed consideRatio closed 8 months ago
I had a look at this, and it seems it isn't the liveness/readiness probe after all. Not sure what can be done besides looking into why this happens, and whether it's a problem for everyone or one caused specifically by a mistake in my deployment of dask-gateway.
It also happens in my deployment, and I agree it clutters the logs and it's unclear whether its presence is a good or bad sign.
Ah okay, that's good to know @sebastian-luna-valero. Is your deployment made side by side with a JupyterHub installation like mine?
At the moment my deployment is running daskhub-2023.1.0
If the hub pod isn't running, we don't see the warnings any more. So, it's JupyterHub sending these out regularly.
And why? Because of a service health check: https://github.com/jupyterhub/jupyterhub/blob/29bb4b80329636b3a8aba22c9d0401dbca5be3cb/jupyterhub/app.py#L3490-L3496
I'll keep thinking a bit.
So JupyterHub is configured to proxy traffic via /services/dask-gateway to the dask-gateway server, and that is relevant in order to ensure you can access dashboards currently from a browser.
At the same time, JupyterHub automatically runs a service health check against that specific destination. By doing so, it causes the 404 responses.
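To make the mechanism concrete, here is a minimal sketch using only the Python standard library. It is not dask-gateway or JupyterHub code: StandInHandler is a hypothetical stand-in server that I assume answers 200 on /api/health and 404 everywhere else, and probe is a hypothetical checker. It only illustrates that a check against / gets a 404 from a server that is otherwise up.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StandInHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server: 200 on /api/health, 404 elsewhere."""

    def do_GET(self):
        status = 200 if self.path == "/api/health" else 404
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        # Keep the demo quiet; a real server would log the 404 here.
        pass


def probe(url):
    """Return the HTTP status code of a GET request, including error codes."""
    # Bypass any proxy settings from the environment for this local request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=2) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a success code.
        return err.code


def run_demo():
    """Start the stand-in server on a free port and probe both paths."""
    server = HTTPServer(("127.0.0.1", 0), StandInHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    base = f"http://127.0.0.1:{server.server_port}"
    try:
        return probe(base + "/"), probe(base + "/api/health")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    root_status, health_status = run_demo()
    print("GET /           ->", root_status)
    print("GET /api/health ->", health_status)
```

Both requests get an answer, so both show the server is alive, but only the one against / produces a 404 for the server to log.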
One coarse workaround is to set the daskhub chart config jupyterhub.hub.config.JupyterHub.service_check_interval=0; by doing so the service health check is disabled. At the same time, maybe that check is relevant in other situations for other JupyterHub registered services. Possibly not.
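As a rough sketch of what that dotted config path means in a values file, the snippet below expands it into the nested structure a Helm chart would read. The nest helper is hypothetical and only for illustration. Since JSON is a subset of YAML, the printed output should be usable as a values file, but treat that as an assumption to verify against your own deployment.

```python
import json


def nest(dotted_key, value):
    """Expand 'a.b.c' plus a value into {'a': {'b': {'c': value}}}."""
    result = value
    for part in reversed(dotted_key.split(".")):
        result = {part: result}
    return result


# The config path mentioned above; 0 disables the periodic service check.
values = nest("jupyterhub.hub.config.JupyterHub.service_check_interval", 0)
print(json.dumps(values, indent=2))
```

Note that this turns the check off for every service registered with JupyterHub, not only dask-gateway.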
I think it's out of scope for dask-gateway to adjust to this. So, these are the paths I see reasonable to take:
- [...] /api/health instead
- [...] jupyterhub.hub.config.JupyterHub.service_check_interval=0 if they are using the daskhub helm chart that deploys the jupyterhub helm chart and the dask-gateway helm chart side by side.

@sebastian-luna-valero I'll go for a close on this. I think it's out of scope for dask-gateway to adjust to this; it makes sense to log a 404 arriving at /, I think.
https://github.com/jupyterhub/jupyterhub/issues/4637 is opened now though.
Sure, thanks for the insights!
I think maybe we have a health check running for the helm chart's deployment of the dask-gateway server that goes to / and reports 404? From a readiness / health perspective, this can actually be seen as an "all good" response, because being able to respond with 404 is a sign of life after all.
At the same time, this clutters the logs. So, can we avoid it somehow? Should we change where the health check is done, or should we point it at another endpoint that doesn't lead to logging a 404, which is otherwise reasonably relevant to log?