When running autogluon on some benchmarks, the following error occurs at the end of the optimization procedure (unfortunately before the trajectory is rewritten):
@PhMueller: Can we fix this or safely try/except this error since the optimization completed?
[INFO] autogluon.core.searcher.bayesopt.tuning_algorithms.bo_algorithm at 2021-03-21 16:57:35,315 --- BO Algorithm: Selecting final set of candidates.
Exception ignored in: <function Bookkeeper.__del__ at 0x7fb3a8c30dd0>
Traceback (most recent call last):
File "/home/eggenspk/2020_Hpolib2/HPOBenchExperimentUtils/HPOBenchExperimentUtils/core/bookkeeper.py", line 328, in __del__
shutil.rmtree(self.lock_dir)
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/shutil.py", line 494, in rmtree
_rmtree_safe_fd(fd, path, onerror)
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/shutil.py", line 436, in _rmtree_safe_fd
onerror(os.rmdir, fullname, sys.exc_info())
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/shutil.py", line 434, in _rmtree_safe_fd
os.rmdir(entry.name, dir_fd=topfd)
OSError: [Errno 39] Directory not empty: 'attribute_lock'
Exception ignored in: <function Bookkeeper.__del__ at 0x7fb3a8c30dd0>
Traceback (most recent call last):
File "/home/eggenspk/2020_Hpolib2/HPOBenchExperimentUtils/HPOBenchExperimentUtils/core/bookkeeper.py", line 328, in __del__
shutil.rmtree(self.lock_dir)
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/shutil.py", line 498, in rmtree
onerror(os.rmdir, path, sys.exc_info())
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/shutil.py", line 496, in rmtree
os.rmdir(path)
OSError: [Errno 39] Directory not empty: '/home/eggenspk/2020_Hpolib2/HPOBenchExperimentUtils/exp_outputs/NASBench1shot1SearchSpace1Benchmark/autogluon/run-1/lock_dir'
[ERROR] autogluon.core.scheduler.hyperband at 2021-03-21 16:57:58,288 --- Traceback (most recent call last):
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/multiprocessing/managers.py", line 811, in _callmethod
conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/site-packages/autogluon/core/utils/custom_process.py", line 16, in run
mp.Process.run(self)
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/site-packages/autogluon/core/scheduler/scheduler.py", line 157, in _worker
ret = fn(**args)
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/site-packages/autogluon/core/decorator.py", line 60, in __call__
output = self.f(args, **kwargs)
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/site-packages/autogluon/core/decorator.py", line 143, in wrapper_call
return func(*args, **kwargs)
File "/home/eggenspk/2020_Hpolib2/HPOBenchExperimentUtils/HPOBenchExperimentUtils/optimizer/autogluon_optimizer.py", line 150, in objective_function
**self.settings_for_sending)
File "/home/eggenspk/2020_Hpolib2/HPOBenchExperimentUtils/HPOBenchExperimentUtils/core/bookkeeper.py", line 40, in wrapped
self.increase_total_tae_used(1)
File "/home/eggenspk/2020_Hpolib2/HPOBenchExperimentUtils/HPOBenchExperimentUtils/core/bookkeeper.py", line 290, in increase_total_tae_used
self.total_tae_calls_proxy.value = self.total_tae_calls_proxy.value + total_tae_used
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/multiprocessing/managers.py", line 1138, in get
return self._callmethod('get')
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/multiprocessing/managers.py", line 815, in _callmethod
self._connect()
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/multiprocessing/managers.py", line 802, in _connect
conn = self._Client(self._token.address, authkey=self._authkey)
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/multiprocessing/connection.py", line 492, in Client
c = SocketClient(address)
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/multiprocessing/connection.py", line 620, in SocketClient
s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory
NoneType: None
Traceback (most recent call last):
File ".//HPOBenchExperimentUtils/run_benchmark.py", line 195, in <module>
run_benchmark(**vars(args), **benchmark_params)
File ".//HPOBenchExperimentUtils/run_benchmark.py", line 157, in run_benchmark
and not tae_exceeds_limit(benchmark.get_total_tae_used(), settings['tae_limit']) \
File "/home/eggenspk/2020_Hpolib2/HPOBenchExperimentUtils/HPOBenchExperimentUtils/core/bookkeeper.py", line 251, in get_total_tae_used
with lock:
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/contextlib.py", line 112, in __enter__
return next(self.gen)
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/site-packages/oslo_concurrency/lockutils.py", line 270, in lock
ext_lock.acquire(delay=delay)
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/site-packages/fasteners/process_lock.py", line 156, in acquire
self._do_open()
File "/home/eggenspk/miniconda3CLUSTER/envs/hpobench_37/lib/python3.7/site-packages/fasteners/process_lock.py", line 128, in _do_open
self.lockfile = open(self.path, 'a')
FileNotFoundError: [Errno 2] No such file or directory: b'/home/eggenspk/2020_Hpolib2/HPOBenchExperimentUtils/exp_outputs/NASBench1shot1SearchSpace1Benchmark/autogluon/run-1/lock_dir/attribute_lock/attribute_lock'
See also the complete log here: run_NAS1SHOT1_autogluon_32_errlog.txt run_NAS1SHOT1_autogluon_32.cmd_out.txt
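On the try/except question, here is a minimal sketch of what a more tolerant cleanup could look like. It assumes (not confirmed from the log) that forked worker processes hold a copy of the Bookkeeper and that their `__del__` removes the shared `lock_dir` while the main process still needs it. `BookkeeperSketch`, `cleanup`, and `_owner_pid` are hypothetical names for illustration, not the actual `bookkeeper.py` code.

```python
import os
import shutil
from pathlib import Path


class BookkeeperSketch:
    """Hypothetical stand-in for Bookkeeper; only the cleanup logic is shown."""

    def __init__(self, lock_dir):
        self.lock_dir = Path(lock_dir)
        self.lock_dir.mkdir(parents=True, exist_ok=True)
        # Assumption: remember which process created the lock directory.
        self._owner_pid = os.getpid()

    def cleanup(self):
        # Forked copies of this object must not delete the shared directory.
        if os.getpid() != self._owner_pid:
            return
        try:
            shutil.rmtree(self.lock_dir)
        except OSError as e:
            # Covers "Directory not empty" (Errno 39) and a directory that is
            # already gone (FileNotFoundError is a subclass of OSError).
            print(f"Could not remove {self.lock_dir}: {e}")

    def __del__(self):
        self.cleanup()
```

The pid guard targets the later `FileNotFoundError` in `get_total_tae_used`, while the try/except only silences the `rmtree` failures, which Python already reports as "Exception ignored" in `__del__`.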