allegroai / clearml-server

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs

clearml-webserver crashes when IPv6 is disabled on a k8s node #220

Open stephanbertl opened 10 months ago

stephanbertl commented 10 months ago

We have Ubuntu 22.04 k8s nodes with IPv6 disabled in the kernel options.

ClearML cannot start on these nodes. The webserver pods are crashing:

kubectl logs clearml-webserver-75bbc647d-zkdlf -n clearml
Defaulted container "clearml-webserver" out of: clearml-webserver, init-webserver (init)
2023/11/13 16:47:03 [emerg] 50#50: socket() [::]:80 failed (97: Address family not supported by protocol)
nginx: [emerg] socket() [::]:80 failed (97: Address family not supported by protocol)

We are using chart version 7.4.0.

amishurov commented 8 months ago

Found this PR: https://github.com/allegroai/clearml-server/pull/165/files

    environment:
      DISABLE_NGINX_IPV6: "true"

This works for docker-compose, so setting the same variable should work for Kubernetes too.
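
For Kubernetes, the same variable would go into the env list of the webserver container. A minimal sketch of what the relevant part of the Deployment could look like, assuming the container is named clearml-webserver (as in the kubectl logs output above) and that the image in use already includes the change from the linked PR. How to pass this through the Helm chart values depends on the chart version and is not shown here.

```yaml
# Hypothetical excerpt of the webserver Deployment's pod template.
# Only the env entry is the point; the surrounding fields are abbreviated.
spec:
  template:
    spec:
      containers:
        - name: clearml-webserver      # container name taken from the kubectl logs output
          env:
            - name: DISABLE_NGINX_IPV6 # variable from the docker-compose snippet above
              value: "true"            # Kubernetes env values must be strings, so keep the quotes
```

If the variable is honored, nginx should no longer try to bind [::]:80 and the "Address family not supported by protocol" error should disappear from the pod logs.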