Open Vadim2S opened 1 year ago
Hi @Vadim2S,
What is in_cloud? When you call Task.connect(), is your code being run by a ClearML Agent?
Sorry, my bad. I just copied my code without comments. Here are more details:
in_cloud is set to True whenever ClearML is used, i.e. always.
My workflow is as follows:
1) I try new code on Windows. The code runs locally without a ClearML Agent, but uses ClearML Task.init. The dataset is loaded from the ClearML server. The config dictionary after "config = task.connect(config)" is mutable.
2) I clone the task from 1), change the configs and send it to a Linux queue. The code runs on a remote computer under a ClearML Agent. The dataset is loaded from the ClearML server. The config dictionary after "config = task.connect(config)" is READ-ONLY.
So the minimal reproduction code is something like:
config = unsafe_load(open(args.config, "rb")) # config.yaml loaded here
config = task.connect(config)
#config = config.copy()
# not sure whether Dataset loading is vital for this case
config['val_name'] = 'new_val'
print(config['val_name'])
The expected result is "new_val" output.
In case 1) I get "new_val" output. In case 2) I get "old_val" output without any error.
If I uncomment the "config = config.copy()" line, I get "new_val" output in case 2) too.
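The copy workaround can be illustrated without ClearML installed. This is only a sketch: `SilentlyIgnoringDict` is a hypothetical stand-in that mimics the symptom reported here (writes dropped without an error) and is not ClearML's actual implementation of the object returned by `task.connect()`.

```python
class SilentlyIgnoringDict(dict):
    """Hypothetical stand-in: a dict that drops item assignments without raising."""

    def __setitem__(self, key, value):
        # Mimics the reported symptom only; real ClearML internals may differ.
        pass


# dict's constructor does not go through __setitem__, so the initial value is stored.
connected = SilentlyIgnoringDict(val_name="old_val")

# Writing to the connected object: the assignment is dropped silently.
connected["val_name"] = "new_val"
after_direct_write = connected["val_name"]

# Workaround from this thread: copy into a plain dict, then modify the copy.
local_config = dict(connected)
local_config["val_name"] = "new_val"
after_copy_write = local_config["val_name"]

print(after_direct_write, after_copy_write)
```

Note that changes made to the copy stay local to the script; they are not expected to be recorded back on the task.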
@Vadim2S this is the intended behavior, and it does not actually depend on the operating system, but on the fact that you are running the first flow without an agent (i.e. in what we call "local" or "development" mode). In that case the ClearML SDK is designed to "record" what you do and build a reproducible environment on the server for your task. This includes what you set in that dictionary, even after you connect it to the task. In the second flow, you are running in what we call "remote" mode, i.e. using an agent to execute your task. In that case, the agent is designed to reproduce the environment (including whatever you set in the dictionary in the local run) and make sure that's what your code gets (which is why the dictionary is read-only). If you clone the task and change things in it before enqueuing, what your code gets in the remote run is whatever was stored on the server, including any changes you made (for example, to the connected configuration).
I presumed (from the documentation) that "the agent reproduces the environment" happens at this line: "config = task.connect(config)", and that is all. After this I can change the environment as I wish. Am I wrong?
I am OK with read-only if it is intended. But this raises two new bugs or suggestions, whichever you prefer to call them. 1) The behaviour must be the same, i.e. always read-only or always mutable. 2) ClearML must raise an error when I attempt to modify my config.
I wasted a whole day of work running remote model training with the wrong dataset, because NOTHING told me that my config was not changed the way it was in the locally tested code.
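Until such an error exists, a user-side guard can make a dropped write fail loudly. The following is a minimal sketch, not part of ClearML: `set_checked` and `SilentlyIgnoringDict` are hypothetical names, the paths are placeholders, and the stand-in class only imitates the silent behaviour described above.

```python
_MISSING = object()  # sentinel for "key not present"


class SilentlyIgnoringDict(dict):
    """Hypothetical stand-in: a dict that drops item assignments without raising."""

    def __setitem__(self, key, value):
        pass


def set_checked(config, key, value):
    """Assign config[key] = value, then raise if the value did not stick."""
    config[key] = value
    if config.get(key, _MISSING) != value:
        raise RuntimeError(f"write to config[{key!r}] did not take effect")


# A plain dict accepts the write, so the guard stays quiet.
mutable = {"train_data_dir": "old/path"}
set_checked(mutable, "train_data_dir", "cache/training.csv")

# The stand-in drops the write, so the guard raises instead of continuing silently.
ignoring = SilentlyIgnoringDict(train_data_dir="old/path")
try:
    set_checked(ignoring, "train_data_dir", "cache/training.csv")
    guard_fired = False
except RuntimeError:
    guard_fired = True
```

Routing every post-connect assignment through a helper like this would have stopped the remote run at the first ignored write rather than training on the wrong dataset.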
Describe the bug
After task.connect(dictionary) I cannot change dictionary values. Linux only.
To reproduce
I use code like this:
Expected behaviour
All cases: config['train_data_dir'] contains a path like clearml_cache_dataset_dir/training.csv
Actual behaviour
Linux (Ubuntu 20.04): config['train_data_dir'] contains the old, unchanged value (error)
Windows 10: config['train_data_dir'] contains the new path, as expected
Environment
Related Discussion
Very strange Linux-only error. Uncomment the 'config = config.copy()' line as a workaround. The new config instance can be modified as expected.