allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs
Apache License 2.0
5.6k stars 651 forks

`import clearml` fails on Compute Canada clusters #1057

Open PandaGab opened 1 year ago

PandaGab commented 1 year ago

Describe the bug

Running `import clearml` raises an `IndexError` at import time.

import clearml

The error is:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/project/166726223/galec39/test-template/.venv/lib/python3.10/site-packages/clearml/__init__.py", line 5, in <module>
    from .task import Task
  File "/project/166726223/galec39/test-template/.venv/lib/python3.10/site-packages/clearml/task.py", line 45, in <module>
    from .backend_interface.metrics import Metrics
  File "/project/166726223/galec39/test-template/.venv/lib/python3.10/site-packages/clearml/backend_interface/__init__.py", line 2, in <module>
    from .task import Task
  File "/project/166726223/galec39/test-template/.venv/lib/python3.10/site-packages/clearml/backend_interface/task/__init__.py", line 1, in <module>
    from .task import Task
  File "/project/166726223/galec39/test-template/.venv/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 32, in <module>
    from ...binding.artifacts import Artifacts
  File "/project/166726223/galec39/test-template/.venv/lib/python3.10/site-packages/clearml/binding/artifacts.py", line 23, in <module>
    from ..backend_interface.metrics.events import UploadEvent
  File "/project/166726223/galec39/test-template/.venv/lib/python3.10/site-packages/clearml/backend_interface/metrics/__init__.py", line 2, in <module>
    from .interface import Metrics
  File "/project/166726223/galec39/test-template/.venv/lib/python3.10/site-packages/clearml/backend_interface/metrics/interface.py", line 17, in <module>
    from .events import MetricsEventAdapter
  File "/project/166726223/galec39/test-template/.venv/lib/python3.10/site-packages/clearml/backend_interface/metrics/events.py", line 23, in <module>
    class MetricsEventAdapter(object):
  File "/project/166726223/galec39/test-template/.venv/lib/python3.10/site-packages/clearml/backend_interface/metrics/events.py", line 37, in MetricsEventAdapter
    @attrs(cmp=False, slots=True)
  File "/project/166726223/galec39/test-template/.venv/lib/python3.10/site-packages/clearml/utilities/attrs.py", line 10, in __init__
    if Version(attr.__version__) >= Version("19.2"):
  File "/project/166726223/galec39/test-template/.venv/lib/python3.10/site-packages/clearml/utilities/version.py", line 116, in __init__
    key = self._cmpkey(
  File "/project/166726223/galec39/test-template/.venv/lib/python3.10/site-packages/clearml/utilities/version.py", line 334, in _cmpkey
    local = local[1]
IndexError: tuple index out of range

I added the following print at line 305 in clearml/utilities/version.py and got the following:

>>> print(f"{epoch=}, {release=}, {pre=}, {post=}, {dev=}, {local=}")
epoch=0, release=(23, 1, 0), pre=None, post=None, dev=None, local=('computecanada',)

`clearml/utilities/version.py` seems to be parsing the version of the attrs package currently installed on the cluster, which is:

Package            Version
------------------ -------------------------
attrs              23.1.0+computecanada
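To illustrate the debug output above: under PEP 440, everything after `+` is the local version label, and it is split into segments on `.`, `-` or `_`. The following is a minimal sketch, not ClearML's actual code; `split_local` is a hypothetical helper written only for this illustration. It shows that `23.1.0+computecanada` yields a one-element local tuple, so indexing position 1 raises the same `IndexError` as in the traceback.

```python
import re


def split_local(version: str):
    """Split a version string into (public, local_segments).

    Hypothetical helper for illustration only. It follows the PEP 440
    rule that the local label comes after '+' and that its segments
    are separated by '.', '-' or '_'.
    """
    public, _, local = version.partition("+")
    if not local:
        return public, None
    segments = tuple(
        int(part) if part.isdigit() else part.lower()
        for part in re.split(r"[._-]", local)
    )
    return public, segments


public, local = split_local("23.1.0+computecanada")
print(public, local)

# A single-segment local label has no element at index 1.
try:
    local[1]
except IndexError as exc:
    print("IndexError:", exc)
```

A label such as `+ubuntu.1` would produce two segments, which may be why the problem only shows up with single-word suffixes like `+computecanada`.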

Environment

Thank you for your help

AlexandruBurlacu commented 1 year ago

Hey @PandaGab, I can't reproduce this issue. Can you try almost the same setup (Linux, Py3.10, ClearML 1.11.1) on a different machine, with attrs installed from PyPI rather than a private package index/registry? I suspect the issue is caused by the +computecanada suffix, but I'm not sure.

PandaGab commented 1 year ago

Hello, I also think it comes from there, but it should not affect (and crash) the usage of ClearML, right?

Sorry, I was not able to try what you suggested. However, the behavior of attrs seems to have changed between versions 22.2.0 and 23.1.0 (both installed from the private package registry). To be clear, `pip list` reports 22.2.0+computecanada and 23.1.0+computecanada respectively, but `attr.__version__` went from:

>>> import attr
>>> attr.__version__
'22.2.0'

to

>>> import attr
>>> attr.__version__
<stdin>:1: DeprecationWarning: Accessing attr.__version__ is deprecated and will be removed in a future release. Use importlib.metadata directly to query for attrs's packaging metadata.
'23.1.0+computecanada'

This could explain why my ClearML setup worked fine before and stopped working after an update.
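For reference, the deprecation warning above recommends `importlib.metadata` instead of `attr.__version__`. A minimal sketch of that query, assuming Python 3.8 or newer; `installed_version` is a hypothetical wrapper name, and what it returns for each of the two attrs builds on the cluster has not been checked in this thread:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version string of a distribution, or None.

    Hypothetical wrapper; importlib.metadata.version() reads the
    packaging metadata rather than a module-level __version__ attribute.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


print(installed_version("attrs"))
```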


EDIT: Downgrading attrs did solve the problem.

python -m pip install -U attrs==22.2.0+computecanada
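Where downgrading is not an option, a version check that ignores the local suffix would avoid the crash. The sketch below is only an illustration of that idea, not a patch to ClearML; `release_tuple` and `at_least` are hypothetical names, and the comparison deliberately ignores pre-release, post-release and dev markers.

```python
import re


def release_tuple(version: str):
    """Return the numeric release part of a version as a tuple of ints.

    Hypothetical helper: drops the local label after '+', keeps the
    leading digits of each dot-separated piece, and strips trailing
    zeros so that '19.2' and '19.2.0' compare equal.
    """
    public = version.split("+", 1)[0]
    numbers = []
    for piece in public.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    while numbers and numbers[-1] == 0:
        numbers.pop()
    return tuple(numbers)


def at_least(version: str, minimum: str) -> bool:
    """True if the release part of `version` is >= that of `minimum`."""
    return release_tuple(version) >= release_tuple(minimum)


print(at_least("23.1.0+computecanada", "19.2"))
```

With this approach, `23.1.0+computecanada` and `23.1.0` are treated the same for the purpose of the `>= 19.2` check seen in the traceback.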