Securing Aim Remote Tracking server using SSL key and certificate
Hi, first of all I appreciate all the work you've put into making Aim!
I am having some trouble securing the connection to the Aim Remote Tracking (RT) Server, and was wondering if you could help me out.
I recently set up a virtual machine on Azure, which is running both the Aim RT Server and the Aim UI. To do this, I have used a docker-compose.yml, which brings up both the server and the UI. This works properly: I can log runs from another machine and see them appear in the UI, great.
However, now I want to secure the connection to the remote tracking server using SSL, as described here. I've created a self-signed key and certificate file using openssl, as described here.
When I bring up the server using this command, everything seems to be in working order and I do not get any errors:
But then when I try to log a run from another machine, I get the following error on the client:
azureuser@ml-ci-jvranken-prd:~/cloudfiles/code/Users/jvranken/aim-tracking-server$ python aim_test.py
Failed to connect to Aim Server. Have you forgot to run `aim server` command?
Traceback (most recent call last):
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/urllib3/connectionpool.py", line 715, in urlopen
httplib_response = self._make_request(
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/urllib3/connectionpool.py", line 467, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/urllib3/connectionpool.py", line 462, in _make_request
httplib_response = conn.getresponse()
File "/anaconda/envs/verhuiskans/lib/python3.10/http/client.py", line 1375, in getresponse
response.begin()
File "/anaconda/envs/verhuiskans/lib/python3.10/http/client.py", line 318, in begin
version, status, reason = self._read_status()
File "/anaconda/envs/verhuiskans/lib/python3.10/http/client.py", line 287, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/requests/adapters.py", line 667, in send
resp = conn.urlopen(
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/urllib3/connectionpool.py", line 799, in urlopen
retries = retries.increment(
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/urllib3/util/retry.py", line 550, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/urllib3/packages/six.py", line 769, in reraise
raise value.with_traceback(tb)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/urllib3/connectionpool.py", line 715, in urlopen
httplib_response = self._make_request(
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/urllib3/connectionpool.py", line 467, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/urllib3/connectionpool.py", line 462, in _make_request
httplib_response = conn.getresponse()
File "/anaconda/envs/verhuiskans/lib/python3.10/http/client.py", line 1375, in getresponse
response.begin()
File "/anaconda/envs/verhuiskans/lib/python3.10/http/client.py", line 318, in begin
version, status, reason = self._read_status()
File "/anaconda/envs/verhuiskans/lib/python3.10/http/client.py", line 287, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/aim/ext/transport/utils.py", line 14, in wrapper
return func(*args, **kwargs)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/aim/ext/transport/client.py", line 138, in connect
response = requests.get(endpoint, headers=self.request_headers)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/requests/api.py", line 73, in get
return request("get", url, params=params, **kwargs)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/requests/adapters.py", line 682, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/ml-ci-jvranken-prd/code/Users/jvranken/aim-tracking-server/aim_test.py", line 7, in <module>
run = Run(
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/aim/ext/exception_resistant.py", line 70, in wrapper
_SafeModeConfig.exception_callback(e, func)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/aim/ext/exception_resistant.py", line 47, in reraise_exception
raise e
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/aim/ext/exception_resistant.py", line 68, in wrapper
return func(*args, **kwargs)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/aim/sdk/run.py", line 859, in __init__
super().__init__(run_hash, repo=repo, read_only=read_only, experiment=experiment, force_resume=force_resume)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/aim/sdk/run.py", line 272, in __init__
super().__init__(run_hash, repo=repo, read_only=read_only, force_resume=force_resume)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/aim/sdk/base_run.py", line 34, in __init__
self.repo = get_repo(repo)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/aim/sdk/repo_utils.py", line 26, in get_repo
repo = Repo.from_path(repo)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/aim/sdk/repo.py", line 210, in from_path
repo = Repo(path, read_only=read_only, init=init)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/aim/sdk/repo.py", line 121, in __init__
self._client = Client(remote_path)
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/aim/ext/transport/client.py", line 50, in __init__
self.connect()
File "/anaconda/envs/verhuiskans/lib/python3.10/site-packages/aim/ext/transport/utils.py", line 18, in wrapper
raise RuntimeError(error_message)
RuntimeError: Failed to connect to Aim Server. Have you forgot to run `aim server` command?
Do you have any clue as to why this is not working? Here is the docker-compose.yaml and the python file I'm using:
from aim import Run

# AIM_REPO='/home/azureuser/mycontainer/aim'
AIM_REPO = 'aim://REDACTED:53800'
AIM_EXPERIMENT = 'SSL-server'

run = Run(
    repo=AIM_REPO,
    experiment=AIM_EXPERIMENT
)

hparams_dict = {
    'learning_rate': 0.001,
    'batch_size': 32,
}
run['hparams'] = hparams_dict

# log metric
for i in range(30):
    if i % 5 == 0:
        i = i * 0.347
    run.track(float(i), name='numbers')
@JeroenVranken thanks for the issue. This could be related to the auth token things we have added recently. @mihran113 @alberttorosyan what do you guys think?
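While that is being looked into, one thing that may be worth ruling out: a "Remote end closed connection without response" error is what you would typically see if the client sends plain HTTP to a port that now expects TLS, although that is only one possible cause here. Below is a minimal sketch, using only the Python standard library, that checks whether a host and port complete a TLS handshake at all. The function name speaks_tls and the hostname in the usage comment are hypothetical, and the probe says nothing about how Aim itself negotiates the connection.

```python
import socket
import ssl


def speaks_tls(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if host:port completes a TLS handshake, else False."""
    # Certificate verification is disabled on purpose: a self-signed
    # certificate would otherwise fail the handshake, and this probe only
    # asks "is the port serving TLS at all?". Do not reuse this context
    # for real traffic.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except OSError:
        # Covers refused connections, timeouts, and ssl.SSLError alike.
        return False


# Usage (hypothetical hostname; 53800 is the port from the script above):
#   print(speaks_tls("my-aim-server.example.com", 53800))
```

If this returns True while the client still fails, the server side is probably serving TLS as intended and the mismatch would be on the client side (for example, the client not being told to use SSL or not trusting the self-signed certificate). If it returns False, the server or the port mapping in the compose file would be the first place to look.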