Describe the bug
It looks like the server tries to calculate the number of completed tasks in _report_completed_status and hits an exception.
After that, the update process stops.
Exception in thread Thread-10 (_report_daemon):
Traceback (most recent call last):
  File "/home/mvatkin/projects/ai-doc-analyst/ai_doc_analyst/lib/python3.10/site-packages/clearml/automation/optimization.py", line 1878, in _report_completed_status
    values = [float(v) for v in col[1:]]
  File "/home/mvatkin/projects/ai-doc-analyst/ai_doc_analyst/lib/python3.10/site-packages/clearml/automation/optimization.py", line 1878, in <listcomp>
    values = [float(v) for v in col[1:]]
TypeError: float() argument must be a string or a real number, not 'list'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mvatkin/projects/ai-doc-analyst/ai_doc_analyst/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
    self.run()
  File "/home/mvatkin/projects/ai-doc-analyst/ai_doc_analyst/lib/python3.10/threading.py", line 946, in run
    self._target(*self._args, **self._kwargs)
  File "/home/mvatkin/projects/ai-doc-analyst/ai_doc_analyst/lib/python3.10/site-packages/clearml/automation/optimization.py", line 1766, in _report_daemon
    self._report_completed_status(completed_jobs, cur_completed_jobs, task_logger, title)
  File "/home/mvatkin/projects/ai-doc-analyst/ai_doc_analyst/lib/python3.10/site-packages/clearml/automation/optimization.py", line 1883, in _report_completed_status
    unique_ticks = list(set(ticks))
TypeError: unhashable type: 'list'
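The chained traceback suggests that a table cell holds a list: float() rejects it, and the fallback then fails because lists cannot be placed in a set. The sketch below reproduces that pattern in plain Python, without ClearML. The column contents and the helper names (`col`, `summarize`, `run_demo`) are assumptions made for illustration, not the library's actual code.

```python
# Hypothetical column: a header followed by one value per task.
# Assumption: each value is itself a list, as the traceback implies.
col = ["param", [1, 2], [3, 4], [1, 2]]


def summarize(col):
    try:
        # Numeric path: raises TypeError when a cell is a list.
        return [float(v) for v in col[1:]]
    except (ValueError, TypeError):
        # Fallback path: treat the cells as categorical ticks.
        ticks = col[1:]
        # Lists are unhashable, so building a set raises a second TypeError.
        return list(set(ticks))


def run_demo(col):
    """Return (first error message, second error message), or None."""
    try:
        summarize(col)
    except TypeError as exc:
        # __context__ holds the exception that was being handled.
        return str(exc.__context__), str(exc)
    return None


messages = run_demo(col)
print(messages)
```

If this matches what happens in the optimizer, any hyperparameter or metric whose value is a list would be enough to trigger the failure.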
To reproduce
I have the following setup.
Server on machine No1
4 Agents on machine No2
from clearml.automation import GridSearch, HyperParameterOptimizer

# template_task_id and hyper_parameters are defined elsewhere (not shown in this report)
an_optimizer = HyperParameterOptimizer(
    # This is the experiment we want to optimize
    base_task_id=template_task_id,
    hyper_parameters=hyper_parameters,
    objective_metric_title='Summary',
    objective_metric_series='train_auc',
    objective_metric_sign='max',
    max_number_of_concurrent_tasks=4,
    optimizer_class=GridSearch,
    execution_queue='default',
    pool_period_min=0.1,
    auto_connect_task=True,  # Store optimization arguments and configuration in the Task
    save_top_k_tasks_only=5,
    always_create_task=True,
)
Expected behaviour
No exception
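One way the report could avoid raising is to fall back to a hashable representation of each cell when the numeric conversion fails. This is only a sketch under that assumption; `summarize_column` is a hypothetical helper, and using repr() as the fallback is a choice made here, not ClearML's fix.

```python
def summarize_column(col):
    """Summarize a table column given as [header, value, value, ...].

    Returns ("numeric", floats) when every cell converts to float,
    otherwise ("categorical", sorted unique string ticks).
    """
    cells = col[1:]
    try:
        return "numeric", [float(v) for v in cells]
    except (ValueError, TypeError):
        # repr() gives a hashable string for any cell, including lists.
        ticks = sorted({repr(v) for v in cells})
        return "categorical", ticks


print(summarize_column(["train_auc", "0.5", 0.7]))
print(summarize_column(["param", [1, 2], [3, 4], [1, 2]]))
```

With this fallback, list-valued cells are grouped by their string form instead of stopping the reporting thread.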
Environment