allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs
Apache License 2.0

Using execute_remotely in a notebook #1094

Open hanshupe007 opened 1 year ago

hanshupe007 commented 1 year ago

I want to use the execute_remotely feature from a Jupyter notebook. The first error I got was ModuleNotFoundError: No module named 'ipython_genutils', which I resolved by adding !pip install jupyter.

Now I no longer get any error, but the ClearML task hangs without reporting one. My assumption is that, instead of executing my notebook code, it starts a Jupyter notebook kernel in ClearML and never finishes.

How can I tell the task to just execute the code of my notebook?
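One way to check that assumption would be to print, from the first cell of the notebook, whether the interpreter has loaded the ipykernel package. This is only a rough heuristic sketch: the helper name running_under_ipykernel is made up for illustration, and it relies on the fact that a Jupyter kernel process imports ipykernel, not on anything ClearML-specific.

```python
import sys


def running_under_ipykernel():
    """Heuristic: True if the ipykernel package is loaded in this process.

    A Jupyter kernel imports ipykernel at startup, so its presence in
    sys.modules suggests the code runs inside a kernel rather than as a
    plain Python script. This is an assumption-based check, not a
    ClearML API.
    """
    return "ipykernel" in sys.modules


# Printing the result early makes it show up near the top of the task log.
print("running under ipykernel:", running_under_ipykernel())
```

If this line never appears in the remote task log, that would suggest the notebook code is not being reached at all.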

jkhenning commented 1 year ago

@hanshupe007 you'll need to share the entire task log. When you use it that way, the code executed in the remote run is the notebook, and a Jupyter notebook kernel should not be started. If you can, please share your entire flow (i.e. what you run, which ClearML SDK calls you are making, and the complete log).

hanshupe007 commented 1 year ago

I've attached the log file. The task doesn't fail; it just keeps running until it is manually aborted. I always get the "GPU monitoring failed" message, but it's not a problem.

I found this log entry suspicious, as it looks like it starts the IPython kernel backend.

NOTE: When using the ipython kernel entry point, Ctrl-C will not work.
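To locate that entry in a long downloaded log, a small helper along these lines could be used. It is a minimal sketch: the function name find_kernel_startup_lines is hypothetical, and sample_log is a made-up two-line stand-in built only from the two messages quoted in this thread, not the real task log.

```python
import re

# Fragment of the message quoted above; matched case-insensitively.
KERNEL_NOTE = re.compile(r"ipython kernel entry point", re.IGNORECASE)


def find_kernel_startup_lines(log_text):
    """Return (line_number, stripped_line) pairs that mention the kernel entry point."""
    return [
        (number, line.strip())
        for number, line in enumerate(log_text.splitlines(), start=1)
        if KERNEL_NOTE.search(line)
    ]


# Illustrative stand-in for a log file (assumption, not real output).
sample_log = (
    "GPU monitoring failed\n"
    "NOTE: When using the ipython kernel entry point, Ctrl-C will not work.\n"
)

hits = find_kernel_startup_lines(sample_log)
for number, line in hits:
    print(f"line {number}: {line}")
```

With the real log, the text would be read from the downloaded file instead of sample_log; the lines just before and after each hit are usually the interesting ones.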

The code I use in the notebook is simply:

!pip install clearml
!pip install jupyter
from clearml import Task
task = Task.init(project_name='Test', task_name="TEST NOTEBOOK")
task.execute_remotely(queue_name="GPUXXX")
print("HELLO")

task.log