vscode-clearml-session-manager
_The easiest way to connect to ClearML Sessions--one of the best remote workstation offerings in the MLOps space. (comparison of ClearML vs others here)_
💬 We're looking for contributors! See the contributing section below.
ClearML is self-hostable without Kubernetes and has a free SaaS-hosted plan, meaning you can get a world-class data science development environment for free.
Items marked with ✨ are high-impact, and important for our first release
Features:
Exploring sessions
- Session status: queued, stopped, in_progress, etc.
- clearml-session args used to create the session including
  - --queue default
  - --docker python3.9
  - --init-script (the contents)
  - --user
  - --password (we should think about how to treat this)
Connecting to sessions
- settings.json settings including
  - clearml-session-manager.clearmlConfigFilePath (string), defaults to ~/clearml.conf
  - If clearml-session-manager.clearmlConfigFilePath is not set, and ~/clearml.conf does not exist, prompt the user with instructions to start their own ClearML backend server and run clearml-init
  - clearml.sessionPresets (array of objects), lets you save your favorite sets of arguments to the clearml-session CLI
- Low priority: it may be of interest to support connecting to sessions by starting an "interactive session" via clearml-session --attach <task id>, which creates an SSH tunnel to localhost. This is a different way of connecting to sessions, so it is redundant with the above.
  - clearml-session as a subprocess
  - clearml-session command somewhere that the user can see (useful for debugging and learning)
  - clearml-session logs
  - ssh root@localhost -p 8022 and the password, e.g. [password: pass]
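The config-file fallback described in the settings items above (use the clearmlConfigFilePath setting if present, otherwise ~/clearml.conf, otherwise prompt the user) could be sketched as a small pure function. This is only an illustration: `resolveClearmlConfig`, `expandHome`, and the `ConfigResolution` type are hypothetical names, not part of the extension, and treating a configured-but-missing path the same as a missing default is an assumption.

```typescript
// Hypothetical helper: none of these names come from the extension's source.
type ConfigResolution =
  | { kind: "found"; path: string }
  | { kind: "missing"; triedPath: string; hint: string };

// Expand a leading "~" using the caller-supplied home directory.
function expandHome(p: string, homeDir: string): string {
  if (p === "~") return homeDir;
  return p.startsWith("~/") ? homeDir + p.slice(1) : p;
}

function resolveClearmlConfig(
  settingValue: string | undefined,
  homeDir: string,
  fileExists: (path: string) => boolean
): ConfigResolution {
  // An unset or blank setting falls back to the documented default.
  const configured = settingValue?.trim();
  const candidate = expandHome(configured ? configured : "~/clearml.conf", homeDir);
  if (fileExists(candidate)) {
    return { kind: "found", path: candidate };
  }
  // Assumption: the caller shows this hint to the user as a prompt.
  return {
    kind: "missing",
    triedPath: candidate,
    hint: "Start a ClearML backend server, then run `clearml-init` to generate the config file.",
  };
}

// Example with a fake filesystem check instead of real disk access.
const example = resolveClearmlConfig(
  undefined,
  "/home/dev",
  (p) => p === "/home/dev/clearml.conf"
);
console.log(example.kind); // prints: found
```

Passing `homeDir` and `fileExists` in as arguments keeps the function free of VS Code and filesystem dependencies, so it can be unit-tested without launching the extension host.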
Creating sessions
- + button that allows you to create a ClearML session
- clearml-sessions. Ideas:
  - launch.json, basically, have users define presets in a JSON file at .vscode/clearml.json
  - clearml-session arguments, e.g. by using a Webview. Do API calls to provide the user with autocompletion on anything we can, e.g. for which queues are available
DevOps
- docker-compose.yaml and instructions for hosting ClearML locally for development. Here's their official reference compose file.
- docker-compose.yaml and make API calls to it
- package.json, pushing tags on merge
VS Code makes it really easy to run extensions and try out code changes:
- src/extension.ts file and press F5 to start a debugging session
Here are a few videos with progress updates. Watching these will step you through how we learned about authoring VS Code extensions and how we got to where we are now.
- vscode-black-formatter extension
- ~/clearml.conf file with TypeScript
📌 Note: As a first contribution, it'd be great if you submitted a PR to this README if you get stuck during setup.
- ~/clearml.conf file by running the clearml-init command and pasting them in.
- Settings > Workspace > + Create new credentials in the UI to create a key pair. But it should automatically do this for you when you first log in.
If you choose an EC2 instance or Amazon Lightsail (cheaper than EC2), you can do something like this to install and start up the clearml-agent daemon, which will allow you to run ClearML Session on it:
First, connect to it via SSH
ssh ec2-user@<public ip of instance>
Then install and start the clearml-agent daemon
sudo su
yum update -y && yum install -y docker docker-compose python3-pip
service docker start
python -m pip install clearml clearml-agent
clearml-agent init # paste in your API keys (you can also make a new pair for this)
# example daemon command, you can set these however you like
clearml-agent daemon --queue default --docker --cpu-only --log-level debug
On your laptop, you'd run something like
clearml-session --queue default --docker python:3.9
Note that this will open an SSH tunnel to your instance, so your instance needs to accept incoming traffic on port 22 and probably others like 10022.
If this is an EC2 instance, this means adjusting your security group. Other clouds have a similar concept. They often call this a "firewall".
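The clearml-session invocation shown above could also be assembled from a saved preset, along the lines of the clearml.sessionPresets idea. The following is a minimal sketch under stated assumptions: the `SessionPreset` shape and `buildSessionArgs` are hypothetical, and only flags that this README already mentions (--queue, --docker, --init-script) are modeled.

```typescript
// Hypothetical preset shape; the extension may define presets differently.
interface SessionPreset {
  label: string;
  queue?: string;
  docker?: string;
  initScript?: string;
}

// Turn a preset into an argument array for the clearml-session CLI.
// Unset fields are skipped so the CLI falls back to its own defaults.
function buildSessionArgs(preset: SessionPreset): string[] {
  const args: string[] = [];
  if (preset.queue) args.push("--queue", preset.queue);
  if (preset.docker) args.push("--docker", preset.docker);
  if (preset.initScript) args.push("--init-script", preset.initScript);
  return args;
}

const cpuPreset: SessionPreset = {
  label: "cpu-default",
  queue: "default",
  docker: "python:3.9",
};
console.log(["clearml-session", ...buildSessionArgs(cpuPreset)].join(" "));
// prints: clearml-session --queue default --docker python:3.9
```

Returning an array rather than a single string means the arguments can be handed to a subprocess API as-is, which avoids shell-quoting problems when a value contains spaces.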
Follow steps 1, 2, 3, and 6 below.
⚠️ Disclaimer: while getting this running locally is a good onboarding exercise, the last step currently fails (for Eric, at least) mid-way through creating the session. If you want to pursue troubleshooting this, by all means!
Unfortunately, for the sake of continuing development, the most reliable course is to provision your own VM in the cloud, e.g. an EC2 instance, Digital Ocean droplet, Linode server, etc.
⚠️ Disclaimer: expect problems if you try to run this project directly on Windows.
Install the Windows Subsystem for Linux 2 (WSL2) and develop from there if you are running Windows.
The free videos in the Environment Setup section of this course walk you through how to do this, as well as most of step [1] below.
- docker: on macOS and Windows (including WSL2), get Docker Desktop
- docker-compose, e.g. brew install docker-compose
- brew install nodejs
- pyenv is a good way to install Python.
# cd into the cloned repo and install the NodeJS dependencies
cd ./vscode-clearml-session-manager/
npm install
./dev-utils/volumes/opt/clearml/config/clearml.conf
npm run create-clearml-credentials
npm run start-clearml-server
- ./src/extension.ts and pressing F5 on your keyboard
The extension should load successfully, but it won't have any sessions. To start a session, run
# install the clearml-session CLI into a Python virtual environment
python -m venv ./venv/
source ./venv/bin/activate
npm run install-python-deps
# execute the clearml-session CLI to start a session
npm run start-clearml-session
This will take some time to run. While it loads, you should be able to open http://localhost:8080 and browse to the DevOps folder in ClearML after logging in with username: test, password: test.
graph LR
subgraph backend_network["Backend Network"]
redis[("Redis<br>Port: N/A")]
mongo[("MongoDB<br>Port: N/A")]
elasticsearch[("Elasticsearch<br>Port: N/A")]
fileserver[("Fileserver<br>Port: 8081")]
minio[("MinIO<br>Port: 9000")]
apiserver[("API Server<br>Port: 8008")]
async_delete[("Async Delete<br>Port: N/A")]
agent_services[("Agent Services<br>Port: N/A")]
end
subgraph frontend_network["Frontend Network"]
webserver[("Webserver<br>Port: 80<br>URL: app.clearml.localhost:8080")]
end
user -->|HTTP:8080| webserver
webserver -->|HTTP:8008| apiserver
apiserver -->|Internal| mongo
apiserver -->|Internal| redis
apiserver -->|Internal| elasticsearch
apiserver -->|HTTP:8081| fileserver
apiserver -->|HTTP:9000| minio
fileserver -->|HTTP:9000| minio
async_delete -->|Internal| apiserver
agent_services -->|Internal| apiserver
style webserver fill:#f9f,stroke:#333,stroke-width:2px
Service | URL | Notes |
---|---|---|
UI | http://app.clearml.localhost:8080 | user: test pass: test |
API Server | http://api.clearml.localhost:8008 | |
Fileserver | http://files.clearml.localhost:8081 | |
MinIO UI | http://minio.clearml.localhost:9001 | user: minioadmin pass: minioadmin |
- API Server (apiserver): api.clearml.localhost:8008 Interacts with MongoDB, Redis, Elasticsearch, Fileserver, and MinIO. It can be accessed internally by other services within the backend_network.
- Webserver (webserver): app.clearml.localhost:8080 The main entry point for users in the browser. It can be visited at http://app.clearml.localhost:8080.
- Fileserver (fileserver): files.clearml.localhost:8081 Serves files and communicates with MinIO on port 9000. It's reachable internally by the API server at port 8081.
- MinIO (minio): The file storage server replacing the built-in fileserver. It listens on port 9000 and can be accessed internally by both the Fileserver and the API server.
- Elasticsearch (elasticsearch), MongoDB (mongo), and Redis (redis): These services do not have ports exposed outside but can be accessed by the API server and other services within the backend_network.
- Async Delete (async_delete): It's an internal service that connects to the API server and depends on MongoDB, Redis, and Elasticsearch.
- Agent Services (agent_services): Interacts with the API server and does not expose any ports outside.
Ports mentioned as "N/A" are not directly exposed to the host machine but are used internally within the Docker network for service communication.
Remember to replace the example URL with the actual domain and port you will use in your production or development environment. The above diagram assumes that services like MongoDB, Redis, and Elasticsearch do not need to be accessed directly through a browser and therefore do not have a URL associated with them for external access.
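Swapping the example domain for your own can be kept in one place with a small helper. This is a sketch, not code from the repository: `buildServiceUrls` and `ServiceUrls` are hypothetical names, and it assumes your environment keeps the same subdomain and port layout as the table above (app:8080, api:8008, files:8081, minio:9001).

```typescript
// Hypothetical helper; subdomains and ports mirror the service table above.
interface ServiceUrls {
  ui: string;
  api: string;
  files: string;
  minioUi: string;
}

function buildServiceUrls(
  baseDomain: string,
  scheme: "http" | "https" = "http"
): ServiceUrls {
  // Compose scheme://subdomain.baseDomain:port for one service.
  const url = (sub: string, port: number): string =>
    `${scheme}://${sub}.${baseDomain}:${port}`;
  return {
    ui: url("app", 8080),
    api: url("api", 8008),
    files: url("files", 8081),
    minioUi: url("minio", 9001),
  };
}

console.log(buildServiceUrls("clearml.localhost").api);
// prints: http://api.clearml.localhost:8008
```

If your deployment puts services behind a reverse proxy on standard ports, the port numbers here would need to change accordingly.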