STATUS MESSAGE:
Forced stop (non-responsive)
STATUS REASON:
Forced stop (non-responsive)
The LibriTTS dataset is around 100 GB.
The machine dedicated as the ClearML server is not the strongest, but I expected it to be good enough:
Intel(R) Xeon(R) CPU E5-1620 v3 @ 3.50GHz, 8 cores
15 GB RAM
Is there a setting to increase the timeout or give the service more resources? Why is copying data such a heavy job that it gets killed?
Environment
Server type: self-hosted
ClearML SDK Version: 1.7.0
ClearML Server Version (Only for self hosted): WebApp: 1.9.2-317 • Server: 1.9.2-317 • API: 2.23
It's probably caused by the compression step: the program is not really hanging, it is just slowly processing the large number of files. I have a similar problem. Could we solve it by disabling compression?
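For what it's worth, recent ClearML SDK versions reportedly let you pass a `compression` argument to `Dataset.upload()` (e.g. `zipfile.ZIP_STORED` to pack files without compressing them) — treat that call as an assumption about the API, not a confirmed fix. The stdlib sketch below only illustrates the store-vs-deflate trade-off the zipping step makes: storing skips the CPU-heavy deflate pass at the cost of archive size.

```python
import os
import tempfile
import zipfile

# Create a sample file to archive (stands in for one dataset file).
tmpdir = tempfile.mkdtemp()
sample = os.path.join(tmpdir, "sample.txt")
with open(sample, "w") as f:
    f.write("0123456789" * 10_000)  # highly compressible payload

def archive(method):
    """Zip the sample file with the given method; return archive size in bytes."""
    out = os.path.join(tmpdir, f"out_{method}.zip")
    with zipfile.ZipFile(out, "w", compression=method) as zf:
        zf.write(sample, arcname="sample.txt")
    return os.path.getsize(out)

stored = archive(zipfile.ZIP_STORED)      # no compression: fast, large archive
deflated = archive(zipfile.ZIP_DEFLATED)  # default: small, but CPU-bound on 100 GB
print(stored > deflated)
```

If the SDK version in use supports it, the equivalent on the dataset side would be something like `ds.upload(compression=zipfile.ZIP_STORED)` (hypothetical usage — check your SDK's `Dataset.upload` signature first).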
Describe the bug
I am trying to upload the dataset to a self-hosted ClearML server:
On the client, the commands hang. In the web interface, the dataset creation is marked as "Aborted". These are the last messages I see in the console:
And under "Info" tab, I see: