allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs
Apache License 2.0

Error accessing GCP artifacts when using special characters in task name #1051

Closed materight closed 1 year ago

materight commented 1 year ago

Describe the bug

When uploading to GCP with special characters in the task name (e.g. `:`), the artifact can't be retrieved correctly. The returned artifact URL is quoted (percent-encoded), but GCP requires the URL to be unquoted. Adding a call to urllib.parse.unquote(artifact._url) fixes the issue.

To reproduce

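import urllib.parse  # the submodule must be imported explicitly to use urllib.parse.unquote
from clearml import Task

# The ':' in the task name triggers the problem
task = Task.init(project_name='TEST', task_name='test:1')
task.upload_artifact('test_artifact', {'a': 1}, wait_on_upload=True)
artifact = task.artifacts['test_artifact']
# Workaround: uncomment the next line to unquote the stored URL before downloading
# artifact._url = urllib.parse.unquote(artifact._url)
val = artifact.get(force_download=True)
print(val)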

Expected behaviour

The artifact should be downloaded correctly. Without the fix, it tries to access:

gs://cerrion-clearml/TEST/test%253A1.7c3dc2f40455456e88cf4c154edb965c/artifacts/test_artifact/test_artifact.json

The correct URL is the same path without the %25:

gs://cerrion-clearml/TEST/test%3A1.7c3dc2f40455456e88cf4c154edb965c/artifacts/test_artifact/test_artifact.json
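The two URLs above differ in the way you would expect if the task name had been percent-encoded twice. Here is a minimal sketch using only the standard library. It is not ClearML's actual code path, and the variable names are purely illustrative:

```python
from urllib.parse import quote, unquote

task_name = "test:1"

once = quote(task_name)    # ':' is percent-encoded as '%3A'
twice = quote(once)        # the '%' itself is encoded as '%25'
restored = unquote(twice)  # undoes exactly one level of quoting

print(once)      # test%3A1
print(twice)     # test%253A1
print(restored)  # test%3A1
```

A single unquote turns test%253A1 back into test%3A1, which matches the path that works on GCP. That is why the one-line workaround in the reproduction script is enough.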

Environment

ainoam commented 1 year ago

Thanks for reporting @materight.

Hope to have a fix out soon.

alex-burlacu-clear-ml commented 1 year ago

Hey @materight, we'll release a fix for this issue in the next few days, but it will require some adjustments to the objects stored on the server. We'll follow up with more details.

materight commented 1 year ago

Great! Thanks for the update

materight commented 1 year ago

Fixed in v1.12.0