allegroai / clearml-agent

ClearML Agent - ML-Ops made easy. ML-Ops scheduler & orchestration solution
https://clear.ml/docs/
Apache License 2.0

Agent doesn't work with https for the api, reverse proxy #145

Open JeremyMahieu opened 1 year ago

JeremyMahieu commented 1 year ago

When using the following config, where both the API and the web server are on https, the agent gives no output at all: no errors, nothing, even when running with --foreground and verbose output. At most it will say Using environment access key..... The same happens in daemon mode and in execute mode.

    api {
        api_server: https://example-api.com:443
        web_server: https://example.com:443
        files_server: https://example-files.com:443
    }

However, when we put the API server on http, it seems to work: it finishes experiments from the queue, but reports back the wrong URL. For example, ClearML results page: http://example.com/projects/<stuff>/experiments/<stuff>/output/log. Notice this is on http while the web server is on https. So this URL is wrong, but otherwise it's working.

    api {
        api_server: http://example-api.com:80
        web_server: https://example.com:443
        files_server: https://example-files.com:443
    }
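
A quick way to see how each configured URL is interpreted is to parse it and compare schemes and ports. The following is only a minimal sketch and is not part of ClearML: the function names `summarize_endpoints` and `mixed_schemes` and the dictionary layout are assumptions made for illustration, and the hostnames are the placeholders from the config above.

```python
from urllib.parse import urlsplit

# Default ports per scheme, used when a URL carries no explicit port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def summarize_endpoints(endpoints):
    """Return {name: (scheme, host, port)} for each configured URL."""
    summary = {}
    for name, url in endpoints.items():
        parts = urlsplit(url)
        port = parts.port or DEFAULT_PORTS.get(parts.scheme)
        summary[name] = (parts.scheme, parts.hostname, port)
    return summary


def mixed_schemes(endpoints):
    """Return the set of schemes in use; more than one means http and https are mixed."""
    return {scheme for scheme, _, _ in summarize_endpoints(endpoints).values()}


if __name__ == "__main__":
    # Hypothetical dict mirroring the api section above.
    config = {
        "api_server": "http://example-api.com:80",
        "web_server": "https://example.com:443",
        "files_server": "https://example-files.com:443",
    }
    for name, info in summarize_endpoints(config).items():
        print(name, info)
    print("schemes in use:", sorted(mixed_schemes(config)))
```

With the second config, this reports both http and https in use, which matches the mixed setup described here.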

The clearml-server is set up as http; none of the SSL functions are used. Instead, a reverse proxy handles SSL. Here is the nginx config. The same kind of nginx config is used for all 3 subdomains.

    server {
        listen  443 ssl http2;
        server_name  example-api.com;
        ssl_certificate     /etc/nginx/tls/live/stuff/fullchain.pem;
        ssl_certificate_key /etc/nginx/tls/live/stuff/privkey.pem;
        ssl_protocols       TLSv1.2;
        ssl_prefer_server_ciphers off;
        root /usr/share/nginx/html;

        location / {
            proxy_pass http://1.2.3.4:5678;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "Upgrade";
            proxy_set_header Host $host;
        }
    }

JeremyMahieu commented 1 year ago

The problem boils down to this: the server thinks it's on http, but it's really on https via a proxy. Therefore the agent also tries to reach the API on http.

The protocol part of the URL in the config isn't being taken into account.
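
To illustrate the general mechanism, here is a minimal sketch of how a backend behind a TLS-terminating proxy can end up advertising an http URL. It is not ClearML's code. The function `external_url` is hypothetical, and whether clearml-server reads an `X-Forwarded-Proto` header at all is an assumption that is not confirmed in this thread; the nginx config above does not send that header.

```python
from urllib.parse import urlunsplit


def external_url(request_scheme, host, path, forwarded_proto=None):
    """Build the URL a backend would advertise for itself.

    request_scheme is what the backend sees on its own socket (http here,
    because the proxy terminates TLS). forwarded_proto stands for the value
    of an X-Forwarded-Proto header, if the proxy sends one.
    """
    # Fall back to the backend's own scheme when the proxy passes no hint.
    scheme = forwarded_proto or request_scheme
    return urlunsplit((scheme, host, path, "", ""))


if __name__ == "__main__":
    # Without a forwarded scheme, the backend only knows about plain http.
    print(external_url("http", "example.com", "/projects"))
    # With a forwarded scheme, the advertised URL matches what clients use.
    print(external_url("http", "example.com", "/projects", forwarded_proto="https"))
```

In this sketch the first call yields an http URL even though clients connect over https, which is the same symptom as the results page link reported above.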