mattiadg (issue closed 4 years ago)
The demo crashes with the following error.
Unhandled exception <class 'AssertionError'> in thread <_MainThread(MainThread, started 140030923687680)>, proc 13433.
Thread current, main, <_MainThread(MainThread, started 140030923687680)>: (Excluded thread.)
That were all threads.
EXCEPTION
Traceback (most recent call last):
  File "rnn.py", line 11, in <module>
    line: main()
    locals:
      main = <local> <function main at 0x7f5b717b5048>
  File "/home/mdigangi/bin/returnn/returnn/__main__.py", line 642, in main
    line: execute_main_task()
    locals:
      execute_main_task = <global> <function execute_main_task at 0x7f5b717abea0>
  File "/home/mdigangi/bin/returnn/returnn/__main__.py", line 535, in execute_main_task
    line: tuner.work()
    locals:
      tuner = <local> <returnn.tf.hyper_param_tuning.Optimization object at 0x7f5a700f4390>
      tuner.work = <local> <bound method Optimization.work of <returnn.tf.hyper_param_tuning.Optimization object at 0x7f5a700f4390>>
  File "/home/mdigangi/bin/returnn/returnn/tf/hyper_param_tuning.py", line 553, in work
    line: _IndividualTrainer(optim=self, individual=population[0], gpu_ids={0}).run()
    locals:
      _IndividualTrainer = <global> <class 'returnn.tf.hyper_param_tuning._IndividualTrainer'>
      optim = <not found>
      self = <local> <returnn.tf.hyper_param_tuning.Optimization object at 0x7f5a700f4390>
      individual = <not found>
      population = <local> [<returnn.tf.hyper_param_tuning.Individual object at 0x7f5a70119ac8>, <returnn.tf.hyper_param_tuning.Individual object at 0x7f5a701199b0>, <returnn.tf.hyper_param_tuning.Individual object at 0x7f5a70119978>, <returnn.tf.hyper_param_tuning.Individual object at 0x7f5a701195f8>, <returnn.tf.hyper_pa..., len = 30
      gpu_ids = <not found>
      run = <not found>
  File "/home/mdigangi/bin/returnn/returnn/tf/hyper_param_tuning.py", line 640, in run
    line: engine.init_train_from_config(config=config, train_data=train_data)
    locals:
      engine = <local> <returnn.tf.engine.Engine object at 0x7f5b5f3b74e0>
      engine.init_train_from_config = <local> <bound method Engine.init_train_from_config of <returnn.tf.engine.Engine object at 0x7f5b5f3b74e0>>
      config = <local> <returnn.config.Config object at 0x7f5b5f3b1780>
      train_data = <local> <StaticDataset 'dataset_id140030416483608' epoch=None>
  File "/home/mdigangi/bin/returnn/returnn/tf/engine.py", line 1036, in init_train_from_config
    line: self.init_network_from_config(config)
    locals:
      self = <local> <returnn.tf.engine.Engine object at 0x7f5b5f3b74e0>
      self.init_network_from_config = <local> <bound method Engine.init_network_from_config of <returnn.tf.engine.Engine object at 0x7f5b5f3b74e0>>
      config = <local> <returnn.config.Config object at 0x7f5b5f3b1780>
  File "/home/mdigangi/bin/returnn/returnn/tf/engine.py", line 1094, in init_network_from_config
    line: assert self.epoch, "task %r" % config.value("task", "train")
    locals:
      self = <local> <returnn.tf.engine.Engine object at 0x7f5b5f3b74e0>
      self.epoch = <local> None
      config = <local> <returnn.config.Config object at 0x7f5b5f3b1780>
      config.value = <local> <bound method Config.value of <returnn.config.Config object at 0x7f5b5f3b1780>>
AssertionError: task 'hyper_param_tuning'
I think the problem is that the config sets `task = "hyper_param_tuning"`, so the `epoch` variable is never initialized before the assertion in `init_network_from_config`: https://github.com/rwth-i6/returnn/blob/021171b7fa97a6b32e9a28d3919b74de1c6d46ef/returnn/tf/engine.py#L1062-L1094
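The failure mode can be sketched outside of RETURNN. The classes below are simplified stand-ins, not RETURNN's actual implementation: the point is only that `epoch` stays `None` unless the training path sets it, so any other task value trips the same `assert self.epoch` check seen in the traceback.

```python
class Engine:
    """Simplified stand-in for returnn.tf.engine.Engine (illustration only)."""

    def __init__(self):
        self.epoch = None  # stays None unless training initialization sets it

    def init_train_from_config(self, task):
        # In this sketch, only the "train" task ever sets an epoch.
        if task == "train":
            self.epoch = 1

    def init_network_from_config(self, task):
        # Mirrors the failing line:
        # assert self.epoch, "task %r" % config.value("task", "train")
        assert self.epoch, "task %r" % task


engine = Engine()
engine.init_train_from_config("hyper_param_tuning")
try:
    engine.init_network_from_config("hyper_param_tuning")
except AssertionError as exc:
    print(exc)  # task 'hyper_param_tuning'
```

With `task = "train"` the same sequence runs through without raising, which is why only the hyper-parameter-tuning demo hits this.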
You could have just posted this error in the PR. There is no need to make this a separate issue. This just makes it more complicated to follow and understand.