allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs
Apache License 2.0
5.69k stars 655 forks

Can't add non-AWS s3 bucket as remote storage #970

Open ernygati opened 1 year ago

ernygati commented 1 year ago

Describe the bug

I want my project's dataset to be tracked by ClearML and stored in a non-AWS S3 bucket. Simply repeating the steps in the docs, I do the following:

What I want is a way to customize the address so that it is no longer Amazon AWS specific and looks like https://<bucket>.my_endpoint/<key>?<parameters>, with no Amazon AWS domain in it. Is there a way to achieve this?

Environment

jkhenning commented 1 year ago

Hi @ernygati,

For clarity, this is what was answered in the Slack discussion: if this is a non-AWS S3 service, the output_uri should look something like: output_uri = "s3://server-ip-here:port/bucket". Then make sure you have the correct credentials in your clearml.conf file: https://github.com/allegroai/clearml/blob/a794d0d8abd0763aedce9b4ffb8c787404612f6c/docs/clearml.conf#L113
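As a minimal sketch of the URI format described above, the snippet below assembles an `output_uri` of the form `s3://<host>:<port>/<bucket>`. The helper name `build_s3_output_uri` and the host, port, and bucket values are hypothetical and only for illustration; the commented-out `Task.init` call assumes ClearML is installed and that matching credentials are already set in clearml.conf.

```python
def build_s3_output_uri(host: str, port: int, bucket: str) -> str:
    """Return an output_uri of the form s3://<host>:<port>/<bucket>.

    Hypothetical helper, not part of ClearML; it only formats the string.
    """
    if not host or not bucket:
        raise ValueError("host and bucket must be non-empty")
    return f"s3://{host}:{port}/{bucket.strip('/')}"


# Hypothetical values; replace them with your own storage server and bucket.
output_uri = build_s3_output_uri("storage.example.com", 9000, "my-bucket")
print(output_uri)

# With ClearML installed and credentials for this host in clearml.conf:
# from clearml import Task
# task = Task.init(
#     project_name="examples",
#     task_name="non-AWS S3 upload",
#     output_uri=output_uri,
# )
```

Including the port explicitly is what distinguishes this form from a plain AWS-style `s3://bucket/path` address.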

klekass commented 1 year ago

Hi, I noticed that when adding the output_uri like this, the ClearML UI also renders the output with the IP address of the S3 server, as in the following screenshot:

image

This becomes problematic once the IP address of the storage server changes, because the path becomes invalid. Is there a way to disable rendering the server IP in the UI when using a self-managed S3 storage server?

jkhenning commented 1 year ago

Hi @klekass, ClearML stores the server address (IP or other) as part of the URL, since it stores a complete URL. If the address changes, that's indeed an issue. I suggest using a fixed domain name that will mask any IP change.
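To illustrate why the server address matters, here is a small sketch using only Python's standard `urllib.parse`. It shows that the host is embedded in a complete URL, and how the same path looks with a fixed domain name instead of an IP. The URLs and the `swap_host` helper are hypothetical; this is plain string handling, not a ClearML feature for rewriting stored URLs.

```python
from urllib.parse import urlsplit, urlunsplit


def swap_host(url: str, new_netloc: str) -> str:
    """Return `url` with its host:port part replaced by `new_netloc`."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=new_netloc))


# Hypothetical stored URL that embeds the storage server's IP address.
stored = "s3://10.0.0.5:9000/my-bucket/models/model.pkl"
parts = urlsplit(stored)
print(parts.hostname, parts.port, parts.path)

# The same object addressed through a fixed (hypothetical) domain name,
# which stays valid even if the underlying IP changes.
print(swap_host(stored, "storage.example.com:9000"))
```

If the URI is written with the domain name from the start, the stored URL never contains the IP, so a later IP change only requires a DNS update.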