allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs
Apache License 2.0

Dataset uploading error #1261

Open dmg-ai opened 6 months ago

dmg-ai commented 6 months ago

Describe the bug

I'm trying to upload a new version of a dataset (the first version has already been downloaded), but I get the following error, which has never happened before:

2024-05-06 14:21:37,529 - clearml.Task - ERROR - Action failed <400/801: projects.create/v1.0 (Value combination already exists (unique field already contains this value): name=new_project/ocr/.datasets/aocr_dataset_v2, company=***)> (name=/new_project/ocr/.datasets/aocr_dataset_v2, description=)
2024-05-06 14:21:37,622 - clearml.Task - ERROR - Action failed <400/801: projects.create/v1.0 (Value combination already exists (unique field already contains this value): name=new_project/ocr/.datasets/aocr_dataset_v2, company=***)> (name=/new_project/ocr/.datasets/aocr_dataset_v2, description=)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/clearml/backend_interface/util.py", line 89, in get_or_create_project
    return _get_or_create_project(session, project_name, description=description, system_tags=system_tags, project_id=project_id)
  File "/usr/local/lib/python3.8/site-packages/clearml/backend_interface/util.py", line 130, in _get_or_create_project
    res = session.send(
  File "/usr/local/lib/python3.8/site-packages/clearml/backend_interface/base.py", line 113, in send
    return self._send(session=self.session, req=req, ignore_errors=ignore_errors, raise_on_errors=raise_on_errors,
  File "/usr/local/lib/python3.8/site-packages/clearml/backend_interface/base.py", line 107, in _send
    raise SendError(res, error_msg)
clearml.backend_interface.session.SendError: Action failed <400/801: projects.create/v1.0 (Value combination already exists (unique field already contains this value): name=new_project/ocr/.datasets/aocr_dataset_v2, company=***)> (name=/new_project/ocr/.datasets/aocr_dataset_v2, description=)

To reproduce

from clearml import Dataset

if __name__ == "__main__":
    dataset = Dataset.create(
        dataset_name="aocr_dataset_v2",
        dataset_project="/new_project/ocr",
        description="The 3/5 parts of the source aocr_dataset_v2 dataset.",
    )
    dataset.add_files("/project/ocr/ocr_ready_dataset")
    dataset.upload()
    dataset.finalize()

Expected behaviour

Uploaded dataset.

Environment

jkhenning commented 4 months ago

Hi @dmg-ai,

It looks like you're giving the new dataset the same name as an existing one, which causes the conflict. Try changing the value of the dataset_name parameter.
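A minimal sketch of that suggestion, assuming the caller already knows which dataset names are taken in the project. `next_free_name`, `create_renamed_dataset`, and the `taken` argument are hypothetical helpers for illustration, not part of ClearML. Only the `Dataset` calls from the reproduction script above are used.

```python
from typing import Iterable


def next_free_name(base_name: str, taken: Iterable[str]) -> str:
    """Return base_name if unused, otherwise base_name with the first free numeric suffix.

    Hypothetical helper: `taken` is a caller-supplied collection of names
    that already exist in the target project.
    """
    taken = set(taken)
    if base_name not in taken:
        return base_name
    suffix = 2
    while f"{base_name}_{suffix}" in taken:
        suffix += 1
    return f"{base_name}_{suffix}"


def create_renamed_dataset(source_dir: str, project: str, base_name: str, taken: Iterable[str]):
    """Create, upload, and finalize a dataset under a name that is not in `taken`."""
    # Imported lazily so next_free_name can be used without clearml installed.
    from clearml import Dataset

    dataset = Dataset.create(
        dataset_name=next_free_name(base_name, taken),
        dataset_project=project,
    )
    dataset.add_files(source_dir)
    dataset.upload()
    dataset.finalize()
    return dataset
```

For example, `next_free_name("aocr_dataset_v2", ["aocr_dataset_v2"])` gives `"aocr_dataset_v2_2"`, which could then be passed as dataset_name in place of the conflicting value.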