Describe the issue:
I'm trying to learn how to implement NAS using NNI. However, I get two errors, `ImportError: Cannot use a path to identify something from __main__.` and `TypeError: cannot pickle 'CudnnModule' object`, both shown in the log below.
log:
[2024-04-26 15:45:28] Config is not provided. Will try to infer.
[2024-04-26 15:45:28] Using execution engine based on training service. Trial concurrency is set to 1.
[2024-04-26 15:45:28] Using simplified model format.
[2024-04-26 15:45:28] Using local training service.
[2024-04-26 15:45:28] WARNING: GPU found but will not be used. Please set `experiment.config.trial_gpu_number` to the number of GPUs you want to use for each trial.
[2024-04-26 15:45:30] Creating experiment, Experiment ID: lyjc7okv
[2024-04-26 15:45:30] Starting web server...
[2024-04-26 15:45:30] Setting up...
[2024-04-26 15:45:30] Web portal URLs: http://172.22.9.46:8081 http://127.0.0.1:8081
[2024-04-26 15:45:30] Successfully update searchSpace.
[2024-04-26 15:45:30] Checkpoint saved to C:\Users\Lab-d\nni-experiments\lyjc7okv\checkpoint.
[2024-04-26 15:45:30] Experiment initialized successfully. Starting exploration strategy...
[2024-04-26 15:45:30] ERROR: Strategy failed to execute.
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\nni\common\serializer.py", line 831, in get_hybrid_cls_or_func_name
name = _get_cls_or_func_name(cls_or_func)
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\nni\common\serializer.py", line 810, in _get_cls_or_func_name
raise ImportError('Cannot use a path to identify something from __main__.')
ImportError: Cannot use a path to identify something from __main__.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 103, in <module>
exp.run(port=8081)
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\nni\experiment\experiment.py", line 236, in run
return self._run_impl(port, wait_completion, debug)
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\nni\experiment\experiment.py", line 205, in _run_impl
self.start(port, debug)
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\nni\nas\experiment\experiment.py", line 270, in start
self._start_engine_and_strategy()
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\nni\nas\experiment\experiment.py", line 230, in _start_engine_and_strategy
self.strategy.run()
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\nni\nas\strategy\base.py", line 170, in run
self._run()
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\nni\nas\strategy\bruteforce.py", line 223, in _run
self.engine.submit_models(model)
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\nni\nas\execution\training_service.py", line 172, in submit_models
self._channel.send_trial(
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\nni\runtime\tuner_command_channel\channel.py", line 144, in send_trial
send_payload = dump(trial_dict, pickle_size_limit=int(os.getenv('PICKLE_SIZE_LIMIT', 64 * 1024)))
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\nni\common\serializer.py", line 372, in dump
result = _dump(
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\nni\common\serializer.py", line 424, in _dump
return json_tricks.dumps(obj, obj_encoders=encoders, **json_tricks_kwargs)
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\json_tricks\nonp.py", line 125, in dumps
txt = combined_encoder.encode(obj)
File "C:\ProgramData\Anaconda3\envs\proje\lib\json\encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "C:\ProgramData\Anaconda3\envs\proje\lib\json\encoder.py", line 257, in iterencode
return _iterencode(o, 0)
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\json_tricks\encoders.py", line 76, in default
obj = encoder(obj, primitives=self.primitives, is_changed=id(obj) != prev_id, properties=self.properties)
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\json_tricks\utils.py", line 66, in wrapper
return encoder(*args, **{k: v for k, v in kwargs.items() if k in names})
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\nni\common\serializer.py", line 858, in _json_tricks_func_or_cls_encode
'__nni_type__': get_hybrid_cls_or_func_name(cls_or_func, pickle_size_limit)
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\nni\common\serializer.py", line 835, in get_hybrid_cls_or_func_name
b = cloudpickle.dumps(cls_or_func)
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\cloudpickle\cloudpickle.py", line 1479, in dumps
cp.dump(obj)
File "C:\ProgramData\Anaconda3\envs\proje\lib\site-packages\cloudpickle\cloudpickle.py", line 1245, in dump
return super().dump(obj)
TypeError: cannot pickle 'CudnnModule' object
[2024-04-26 15:45:30] Stopping experiment, please wait...
[2024-04-26 15:45:30] Checkpoint saved to C:\Users\Lab-d\nni-experiments\lyjc7okv\checkpoint.
[2024-04-26 15:45:30] Experiment stopped
my code: https://github.com/ktunlab/nas-resnet-demo
Environment:
Configuration:
Log message:
How to reproduce it?: python train.py
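The two errors in the log appear to be linked: the traceback shows `get_hybrid_cls_or_func_name` first trying to identify an object by its import path, failing with the `ImportError` because the object lives in `__main__`, and then falling back to `cloudpickle.dumps`, which raises the `TypeError`. The sketch below mimics that two-step pattern with the standard library only. It is not NNI's code: `path_of`, `encode`, `referenced_state`, `FakeBackendModule` and `evaluate_model` are hypothetical names, plain `pickle` stands in for cloudpickle, and a `types.ModuleType` subclass stands in for the `CudnnModule` object, on the assumption that it is a module-like object reached during by-value pickling.

```python
import json
import pickle
import types


def path_of(obj):
    """Return 'module.qualname', refusing anything defined in __main__."""
    module = getattr(obj, "__module__", None)
    if module is None or module == "__main__":
        raise ImportError("Cannot use a path to identify something from __main__.")
    return f"{module}.{obj.__qualname__}"


def encode(obj, referenced_state):
    """Prefer an import path; otherwise fall back to pickling by value.

    `referenced_state` is a hypothetical stand-in for whatever a by-value
    pickler would have to serialize along with `obj` (for example, the
    globals that a function uses).
    """
    try:
        return {"path": path_of(obj)}
    except ImportError:
        return {"bytes": pickle.dumps(referenced_state)}


class FakeBackendModule(types.ModuleType):
    """Stand-in for a module-like object such as the 'CudnnModule' in the log."""


def evaluate_model():
    """Placeholder for a function defined in the launch script."""


# Simulate a definition inside a script that was started as `python train.py`.
evaluate_model.__module__ = "__main__"

# An importable object is identified by path, so nothing is pickled.
importable = encode(json.dumps, referenced_state={})

# A __main__ object triggers the fallback, which fails on the module-like value.
try:
    encode(evaluate_model, referenced_state={"backend": FakeBackendModule("backend")})
    fallback_error = None
except TypeError as exc:
    fallback_error = exc
```

Here `fallback_error` is a `TypeError` whose `__context__` is the earlier `ImportError`, which matches the "During handling of the above exception, another exception occurred" shape of the log. If the same mechanism is at work in `train.py`, defining the model space and evaluator in a separate importable module instead of the launch script might avoid the fallback altogether, but that is an untested assumption.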