aleksglushko closed this issue 3 years ago
... I can't manage the initialization of the layers in such a case ...
Why? What is the problem? Please post the full error, including the stack trace and the log (with debug_print_layer_output_template enabled).
This is the log file with the backtrace, and with debug_print_layer_output_template enabled:
backtrace_with_shapes.log
As I understand it, the problem is that the initializer somehow doesn't connect the reuse_layer with its parameters when there is more than one 'output' layer with a loss. But I might be wrong, because as far as I can see, the ReuseParams object initialization seems to be correct, since it has both the reuse_layer and the map.
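For context, this is roughly the configuration in question: a second linear layer that shares its W and b with FF_0 via a per-parameter reuse_params map. A minimal sketch of the relevant net-dict fragment — layer names and n_out are taken from the log above, the source lists and everything else are illustrative, not the exact config:

```python
# Sketch of the relevant part of the RETURNN net dict (inside the rec
# subnetwork): 'iLMT_0_FF_0' reuses the W and b parameters of 'FF_0'
# via a per-parameter reuse_params map. Illustrative, not the real config.
subnet = {
    "FF_0": {
        "class": "linear", "activation": "tanh", "n_out": 1024,
        "from": ["prev:target_embed", "prev:prev_1_target_embed",
                 "prev:prev_2_target_embed", "att"],
    },
    "iLMT_0_FF_0": {
        "class": "linear", "activation": "tanh", "n_out": 1024,
        "from": ["prev:target_embed", "prev:prev_1_target_embed",
                 "prev:prev_2_target_embed", "zero_att"],
        # per-parameter reuse: both W and b are taken from layer FF_0,
        # which is what shows up as the ReuseParams map in the log
        "reuse_params": {
            "map": {
                "W": {"reuse_layer": "FF_0"},
                "b": {"reuse_layer": "FF_0"},
            }
        },
    },
}
```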
Where?
Is it readable, or should I just copy-paste the log output here?
It's readable, but next time, please copy the relevant parts directly here.
For reference, here:
RETURNN starting up, version 1.20210701.004738+git.cb87521, date/time 2021-07-01-23-25-20 (UTC+0200), pid 2876, cwd /work/asr3/zeineldeen/hiwis/glushko/setups-data/switchboard/2021-06-21--ilmt-att-sis/work/crnn/training/CRNNTrainingJob.ACreMrgOrckx/work, Python /work/tools/asr/python/3.8.0_tf_2.3-v1-generic+cuda10.1/bin/python3
RETURNN command line options: ['/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/work/crnn/training/CRNNTrainingJob.ACreMrgOrckx/output/crnn.config']
Hostname: cluster-cn-253
2021-07-01 23:25:22.931917: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
[2021-07-01 23:25:24,437] INFO: Run time: 0:00:05 CPU: 0.40% RSS: 258MB VMS: 2.18GB
TensorFlow: 2.3.0 (v2.3.0-2-gee598066c4) (<site-package> in /work/tools/asr/python/3.8.0_tf_2.3-v1-generic+cuda10.1/lib/python3.8/site-packages/tensorflow)
Use num_threads=1 (but min 2) via OMP_NUM_THREADS.
Setup TF inter and intra global thread pools, num_threads 2, session opts {'log_device_placement': False, 'device_count': {'GPU': 0}, 'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2}.
2021-07-01 23:25:28.807619: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-07-01 23:25:28.837037: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2099815000 Hz
2021-07-01 23:25:28.837557: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x461d890 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-07-01 23:25:28.837596: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-07-01 23:25:28.842414: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
[2021-07-01 23:25:29,456] INFO: Run time: 0:00:10 CPU: 0.40% RSS: 347MB VMS: 2.99GB
2021-07-01 23:25:29.531428: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-01 23:25:29.531479: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]
CUDA_VISIBLE_DEVICES is set to '0'.
Collecting TensorFlow device list...
2021-07-01 23:25:29.739672: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x46bbb40 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-07-01 23:25:29.739779: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 1080 Ti, Compute Capability 6.1
2021-07-01 23:25:29.745034: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:02:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.582GHz coreCount: 28 deviceMemorySize: 10.92GiB deviceMemoryBandwidth: 451.17GiB/s
2021-07-01 23:25:29.745150: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-07-01 23:25:29.749937: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2021-07-01 23:25:29.753337: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2021-07-01 23:25:29.754998: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2021-07-01 23:25:29.761654: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2021-07-01 23:25:29.788539: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2021-07-01 23:25:29.819354: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2021-07-01 23:25:29.832494: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2021-07-01 23:25:29.832626: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
[2021-07-01 23:25:34,474] INFO: Run time: 0:00:15 CPU: 0.20% RSS: 705MB VMS: 17.18GB
2021-07-01 23:25:37.085265: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-01 23:25:37.085323: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2021-07-01 23:25:37.085332: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2021-07-01 23:25:37.088809: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/device:GPU:0 with 10266 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
Local devices available to TensorFlow:
1/4: name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 17821591032498612287
2/4: name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 7550349090206536510
physical_device_desc: "device: XLA_CPU device"
3/4: name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 823384536220778428
physical_device_desc: "device: XLA_GPU device"
4/4: name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 10764901440
locality {
bus_id: 1
links {
}
}
incarnation: 17495936249025657957
physical_device_desc: "device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1"
Using gpu device 0: GeForce GTX 1080 Ti
<ExternSprintDataset 'dev' epoch=None>: epoch None exec ['/u/zhou/rasr-dev/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard', '--config=/u/zeineldeen/setups/switchboard/2020-07-16--phon-att-sis/dependencies/rasr_configs/training.config', '--*.corpus.file=`cf /work/asr3/irie/data/switchboard/corpora/train.corpus.gz`', '--*.corpus.segments.file=`cf /u/zeineldeen/setups/switchboard/2020-01-21--att-phon/dependencies/seg_cv_head3000`', '--*.corpus.segment-order-shuffle=true', '--*.segment-order-sort-by-time-length=true', '--*.segment-order-sort-by-time-length-chunk-size=-1', '--*.feature-cache-path=`cf /u/tuske/work/ASR/switchboard/feature.extraction/gt40_40/data/gt.train.bundle`', '--*.log-channel.file=cv.sprint.log', '--*.window-size=1', '--*.seed=0', '--*.python-segment-order=true', '--*.python-segment-order-pymod-path=/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn', '--*.python-segment-order-pymod-name=returnn.sprint.extern_interface', '--*.use-data-source=false', '--*.trainer=python-trainer', '--*.pymod-path=/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn', '--*.pymod-name=returnn.sprint.extern_interface', '--*.pymod-config=action:ExternSprintDataset,c2p_fd:25,p2c_fd:26']
...
<ExternSprintDataset 'dev' epoch=None>: interrupt child proc 3526
<ExternSprintDataset 'dev' epoch=1>: epoch 1 exec ['/u/zhou/rasr-dev/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard', '--config=/u/zeineldeen/setups/switchboard/2020-07-16--phon-att-sis/dependencies/rasr_configs/training.config', '--*.corpus.file=`cf /work/asr3/irie/data/switchboard/corpora/train.corpus.gz`', '--*.corpus.segments.file=`cf /u/zeineldeen/setups/switchboard/2020-01-21--att-phon/dependencies/seg_cv_head3000`', '--*.corpus.segment-order-shuffle=true', '--*.segment-order-sort-by-time-length=true', '--*.segment-order-sort-by-time-length-chunk-size=-1', '--*.feature-cache-path=`cf /u/tuske/work/ASR/switchboard/feature.extraction/gt40_40/data/gt.train.bundle`', '--*.log-channel.file=cv.sprint.log', '--*.window-size=1', '--*.seed=0', '--*.python-segment-order=true', '--*.python-segment-order-pymod-path=/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn', '--*.python-segment-order-pymod-name=returnn.sprint.extern_interface', '--*.use-data-source=false', '--*.trainer=python-trainer', '--*.pymod-path=/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn', '--*.pymod-name=returnn.sprint.extern_interface', '--*.pymod-config=action:ExternSprintDataset,c2p_fd:25,p2c_fd:26']
<ExternSprintDataset 'train' epoch=None>: epoch None exec ['/u/zhou/rasr-dev/arch/linux-x86_64-standard/nn-trainer.linux-x86_64-standard', '--config=/u/zeineldeen/setups/switchboard/2020-07-16--phon-att-sis/dependencies/rasr_configs/training.config', '--*.corpus.file=`cf /work/asr3/irie/data/switchboard/corpora/train.corpus.gz`', '--*.corpus.segments.file=`cf /u/tuske/work/ASR/switchboard/corpus/train.segments`', '--*.corpus.segment-order-shuffle=true', '--*.segment-order-sort-by-time-length=true', '--*.segment-order-sort-by-time-length-chunk-size=6000', '--*.feature-cache-path=`cf /u/tuske/work/ASR/switchboard/feature.extraction/gt40_40/data/gt.train.bundle`', '--*.log-channel.file=train.sprint.log', '--*.window-size=1', '--*.seed=0', '--*.corpus.partition=6', '--*.corpus.select-partition=0', '--*.python-segment-order=true', '--*.python-segment-order-pymod-path=/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn', '--*.python-segment-order-pymod-name=returnn.sprint.extern_interface', '--*.use-data-source=false', '--*.trainer=python-trainer', '--*.pymod-path=/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn', '--*.pymod-name=returnn.sprint.extern_interface', '--*.pymod-config=action:ExternSprintDataset,c2p_fd:26,p2c_fd:28']
...
Learning-rate-control: file learning_rates does not exist yet
Update config key 'batch_size' for epoch 1: 10000 -> 15000
Setup TF session with options {'log_device_placement': False, 'device_count': {'GPU': 1}} ...
2021-07-01 23:27:06.373029: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:02:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.582GHz coreCount: 28 deviceMemorySize: 10.92GiB deviceMemoryBandwidth: 451.17GiB/s
2021-07-01 23:27:06.373115: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-07-01 23:27:06.373156: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2021-07-01 23:27:06.373175: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2021-07-01 23:27:06.373192: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2021-07-01 23:27:06.373209: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2021-07-01 23:27:06.373225: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2021-07-01 23:27:06.373242: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2021-07-01 23:27:06.377129: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2021-07-01 23:27:06.377177: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-01 23:27:06.377186: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2021-07-01 23:27:06.377192: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2021-07-01 23:27:06.381401: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10266 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
WARNING:tensorflow:From /u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py:435: calling Zeros.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From /u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/util/basic.py:1285: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
layer root/'data' output: Data(name='data', batch_shape_meta=[B,T|'time:var:extern_data:data',F|40])
layer root/'source' output: Data(name='source_output', batch_shape_meta=[B,T|'time:var:extern_data:data',F|40])
layer root/'source0' output: Data(name='source0_output', batch_shape_meta=[B,T|'time:var:extern_data:data',40,F|1])
layer root/'conv0' output: Data(name='conv0_output', batch_shape_meta=[B,T|'time:var:extern_data:data',40,F|32])
layer root/'conv0p' output: Data(name='conv0p_output', batch_shape_meta=[B,T|?,20,F|32])
layer root/'conv1' output: Data(name='conv1_output', batch_shape_meta=[B,T|'time:var:extern_data:data',20,F|32])
layer root/'conv1p' output: Data(name='conv1p_output', batch_shape_meta=[B,T|?,10,F|32])
layer root/'conv_merged' output: Data(name='conv_merged_output', batch_shape_meta=[B,T|'time:var:extern_data:data',F|320])
layer root/'lstm0_fw' output: Data(name='lstm0_fw_output', batch_shape_meta=[T|'time:var:extern_data:data',B,F|512])
<ExternSprintDataset 'devtrain' epoch=1> add_new_data: seq=200, len=128. Cache filled, waiting to get loaded...
OpCodeCompiler call: /usr/local/cuda-10.1/bin/nvcc -shared -O2 -std=c++11 -I /work/tools/asr/python/3.8.0_tf_2.3-v1-generic+cuda10.1/lib/python3.8/site-packages/tensorflow/include -I /work/tools/asr/python/3.8.0_tf_2.3-v1-generic+cuda10.1/lib/python3.8/site-packages/tensorflow/include/external/nsync/public -ccbin /usr/bin/gcc-5 -I /usr/local/cuda-10.1/include -L /usr/local/cuda-10.1/lib64 -x cu -v -DGOOGLE_CUDA=1 -Xcompiler -fPIC -Xcompiler -v -arch compute_61 -D_GLIBCXX_USE_CXX11_ABI=1 -DNDEBUG=1 -g /var/tmp/3426359.1.4-GPU-1080/glushko/returnn_tf_cache/ops/NativeLstm2/dbd9d53df5/NativeLstm2.cc -o /var/tmp/3426359.1.4-GPU-1080/glushko/returnn_tf_cache/ops/NativeLstm2/dbd9d53df5/NativeLstm2.so -L/work/tools/asr/python/3.8.0_tf_2.3-v1-generic+cuda10.1/lib/python3.8/site-packages/numpy.libs -l:libopenblasp-r0-34a18dc3.3.7.so -L/work/tools/asr/python/3.8.0_tf_2.3-v1-generic+cuda10.1/lib/python3.8/site-packages/tensorflow -l:libtensorflow_framework.so.2
[2021-07-01 23:27:25,116] INFO: Run time: 0:02:05 CPU: 0.20% RSS: 3.12GB VMS: 21.66GB
[2021-07-01 23:27:30,137] INFO: Run time: 0:02:10 CPU: 0.60% RSS: 2.55GB VMS: 21.07GB
[2021-07-01 23:27:35,160] INFO: Run time: 0:02:15 CPU: 0.40% RSS: 3.11GB VMS: 21.63GB
OpCodeCompiler call: /usr/local/cuda-10.1/bin/nvcc -shared -O2 -std=c++11 -I /work/tools/asr/python/3.8.0_tf_2.3-v1-generic+cuda10.1/lib/python3.8/site-packages/tensorflow/include -I /work/tools/asr/python/3.8.0_tf_2.3-v1-generic+cuda10.1/lib/python3.8/site-packages/tensorflow/include/external/nsync/public -ccbin /usr/bin/gcc-5 -I /usr/local/cuda-10.1/include -L /usr/local/cuda-10.1/lib64 -x cu -v -DGOOGLE_CUDA=1 -Xcompiler -fPIC -Xcompiler -v -arch compute_61 -D_GLIBCXX_USE_CXX11_ABI=1 -DNDEBUG=1 -g /var/tmp/3426359.1.4-GPU-1080/glushko/returnn_tf_cache/ops/GradOfNativeLstm2/e1228a5e61/GradOfNativeLstm2.cc -o /var/tmp/3426359.1.4-GPU-1080/glushko/returnn_tf_cache/ops/GradOfNativeLstm2/e1228a5e61/GradOfNativeLstm2.so -L/work/tools/asr/python/3.8.0_tf_2.3-v1-generic+cuda10.1/lib/python3.8/site-packages/numpy.libs -l:libopenblasp-r0-34a18dc3.3.7.so -L/work/tools/asr/python/3.8.0_tf_2.3-v1-generic+cuda10.1/lib/python3.8/site-packages/tensorflow -l:libtensorflow_framework.so.2
[2021-07-01 23:27:55,231] INFO: Run time: 0:02:35 CPU: 0.20% RSS: 2.46GB VMS: 21.04GB
[2021-07-01 23:28:00,251] INFO: Run time: 0:02:40 CPU: 0.40% RSS: 3.06GB VMS: 21.58GB
[2021-07-01 23:28:10,287] INFO: Run time: 0:02:50 CPU: 0.40% RSS: 2.47GB VMS: 21.01GB
layer root/'lstm0_bw' output: Data(name='lstm0_bw_output', batch_shape_meta=[T|'time:var:extern_data:data',B,F|512])
layer root/'lstm0_pool' output: Data(name='lstm0_pool_output', batch_shape_meta=[B,T|?,F|1024])
layer root/'lstm1_fw' output: Data(name='lstm1_fw_output', batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|512])
layer root/'lstm1_bw' output: Data(name='lstm1_bw_output', batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|512])
layer root/'encoder' output: Data(name='encoder_output', batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024])
layer root/'data:bpe' output: Data(name='bpe', dtype='int32', sparse=True, dim=534, available_for_inference=False, batch_shape_meta=[B,T|'time:var:extern_data:bpe'])
layer root/'ctc' output: Data(name='ctc_output', batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|535])
layer root/'enc_value' output: Data(name='enc_value_output', batch_shape_meta=[T|'spatial:0:lstm0_pool',B,1,F|1024])
layer root/'enc_ctx' output: Data(name='enc_ctx_output', batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024])
layer root/'inv_fertility' output: Data(name='inv_fertility_output', batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1])
layer root/'output' output: Data(name='output_output', dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])
Rec layer 'output' (search False, train 'globals/train_flag:0') sub net:
Input layers moved out of loop: (#: 4)
output
target_embed
prev_1_target_embed
prev_2_target_embed
Output layers moved out of loop: (#: 8)
output_prob
readout
readout_in
iLMT_0_output_prob
iLMT_0_readout
iLMT_0_readout_in
iLMT_0_FF_0
zero_att
Layers in loop: (#: 9)
FF_0
att
att0
att_weights
energy
energy_tanh
energy_in
weight_feedback
accum_att_weights
Unused layers: (#: 1)
end
WARNING:tensorflow:From /u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/util/basic.py:5317: _colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
layer root/output(rec-subnet-input)/'data:bpe' output: Data(name='bpe', dtype='int32', sparse=True, dim=534, available_for_inference=False, batch_shape_meta=[B,T|'time:var:extern_data:bpe'])
layer root/output(rec-subnet-input)/'output' output: Data(name='output_output', dtype='int32', sparse=True, dim=534, batch_shape_meta=[B,T|'time:var:extern_data:bpe'])
layer root/output(rec-subnet-input)/'target_embed' output: Data(name='target_embed_output', batch_shape_meta=[B,T|'time:var:extern_data:bpe',F|621])
layer root/output(rec-subnet-input)/'prev:target_embed' output: Data(name='target_embed_output', batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|621])
layer root/output(rec-subnet-input)/'prev_1_target_embed' output: Data(name='prev_1_target_embed_output', batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|621])
layer root/output(rec-subnet-input)/'prev:prev_1_target_embed' output: Data(name='prev_1_target_embed_output', batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|621])
layer root/output(rec-subnet-input)/'prev_2_target_embed' output: Data(name='prev_2_target_embed_output', batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|621])
layer root/output(rec-subnet)/'prev:target_embed' output: Data(name='target_embed_output', batch_shape_meta=[B,F|621])
layer root/output(rec-subnet)/'prev:prev_1_target_embed' output: Data(name='prev_1_target_embed_output', batch_shape_meta=[B,F|621])
layer root/output(rec-subnet)/'prev:prev_2_target_embed' output: Data(name='prev_2_target_embed_output', batch_shape_meta=[B,F|621])
layer root/output(rec-subnet)/'FF_0' output: Data(name='FF_0_output', batch_shape_meta=[B,F|1024])
layer root/output(rec-subnet)/'weight_feedback' output: Data(name='weight_feedback_output', batch_shape_meta=[T|?,B,F|1024])
layer root/output(rec-subnet)/'energy_in' output: Data(name='energy_in_output', batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024])
layer root/output(rec-subnet)/'energy_tanh' output: Data(name='energy_tanh_output', batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024])
layer root/output(rec-subnet)/'energy' output: Data(name='energy_output', batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1])
layer root/output(rec-subnet)/'att_weights' output: Data(name='att_weights_output', batch_shape_meta=[B,F|1,T|'spatial:0:lstm0_pool'])
layer root/output(rec-subnet)/'att0' output: Data(name='att0_output', batch_shape_meta=[B,1,F|1024])
layer root/output(rec-subnet)/'att' output: Data(name='att_output', batch_shape_meta=[B,F|1024])
layer root/output(rec-subnet)/'accum_att_weights' output: Data(name='accum_att_weights_output', batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1])
layer root/output(rec-subnet-output)/'FF_0' output: Data(name='FF_0_output', batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])
layer root/output(rec-subnet-output)/'prev:target_embed' output: Data(name='target_embed_output', batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|621])
layer root/output(rec-subnet-output)/'att' output: Data(name='att_output', batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])
layer root/output(rec-subnet-output)/'readout_in' output: Data(name='readout_in_output', batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1000])
layer root/output(rec-subnet-output)/'readout' output: Data(name='readout_output', batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|500])
layer root/output(rec-subnet-output)/'data:bpe' output: Data(name='bpe', dtype='int32', sparse=True, dim=534, available_for_inference=False, batch_shape_meta=[B,T|'time:var:extern_data:bpe'])
layer root/output(rec-subnet-output)/'output_prob' output: Data(name='output_prob_output', batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|534])
layer root/output(rec-subnet-output)/'prev:prev_1_target_embed' output: Data(name='prev_1_target_embed_output', batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|621])
layer root/output(rec-subnet-output)/'prev:prev_2_target_embed' output: Data(name='prev_2_target_embed_output', batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|621])
layer root/output(rec-subnet-output)/'zero_att' output: Data(name='zero_att_output', batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])
layer root/output(rec-subnet-output)/'iLMT_0_FF_0' output: Data(name='iLMT_0_FF_0_output', batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])
Exception creating layer root/output(rec-subnet-output)/'iLMT_0_FF_0' of class LinearLayer with opts:
{'L2': 0.0005,
'_name': 'iLMT_0_FF_0',
'_network': <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>,
'activation': 'tanh',
'n_out': 1024,
'name': 'iLMT_0_FF_0',
'network': <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>,
'output': Data(name='iLMT_0_FF_0_output', batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024]),
'reuse_params': <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>, map None>}>,
'sources': [<InternalLayer output/'prev:target_embed' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|621])>,
<InternalLayer output/'prev:prev_1_target_embed' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|621])>,
<InternalLayer output/'prev:prev_2_target_embed' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|621])>,
<EvalLayer output/'zero_att' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>],
'with_bias': True}
Exception occurred during output-net construction of layer 'iLMT_0_FF_0'.
We had previous exceptions at template construction, which got resolved, but maybe sth is wrong.
Template network (check out types / shapes):
{'FF_0': <_TemplateLayer(LinearLayer)(:template:linear) output/'FF_0' out_type=Data(batch_shape_meta=[B?,F|1024]) (construction stack 'energy_in')>,
'accum_att_weights': <_TemplateLayer(EvalLayer)(:template:eval) output/'accum_att_weights' out_type=Data(batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1]) (construction stack 'weight_feedback')>,
'att': <_TemplateLayer(MergeDimsLayer)(:template:merge_dims) output/'att' out_type=Data(batch_shape_meta=[B?,F|1024]) (construction stack 'zero_att')>,
'att0': <_TemplateLayer(GenericAttentionLayer)(:template:generic_attention) output/'att0' out_type=Data(batch_shape_meta=[B?,1,F|1024]) (construction stack 'att')>,
'att_weights': <_TemplateLayer(SoftmaxOverSpatialLayer)(:template:softmax_over_spatial) output/'att_weights' out_type=Data(batch_shape_meta=[B?,F|1,T|'spatial:0:lstm0_pool']) (construction stack 'att0')>,
'data:bpe': <_TemplateLayer(SourceLayer)(:template:source) output/'data:bpe' out_type=Data(dtype='int32', sparse=True, dim=534, available_for_inference=False, batch_shape_meta=[B]) (construction stack 'output')>,
'end': <_TemplateLayer(CompareLayer)(:template:compare) output/'end' out_type=Data(dtype='bool', sparse=True, dim=2, batch_shape_meta=[B]) (construction stack None)>,
'energy': <_TemplateLayer(LinearLayer)(:template:linear) output/'energy' out_type=Data(batch_shape_meta=[T|'spatial:0:lstm0_pool',B?,F|1]) (construction stack 'att_weights')>,
'energy_in': <_TemplateLayer(CombineLayer)(:template:combine) output/'energy_in' out_type=Data(batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024]) (construction stack 'energy_tanh')>,
'energy_tanh': <_TemplateLayer(ActivationLayer)(:template:activation) output/'energy_tanh' out_type=Data(batch_shape_meta=[T|'spatial:0:lstm0_pool',B,F|1024]) (construction stack 'energy')>,
'iLMT_0_FF_0': <_TemplateLayer(LinearLayer)(:template:linear) output/'iLMT_0_FF_0' out_type=Data(batch_shape_meta=[B?,F|1024]) (construction stack 'iLMT_0_readout_in')>,
'iLMT_0_output_prob': <_TemplateLayer(SoftmaxLayer)(:template:softmax) output/'iLMT_0_output_prob' out_type=Data(batch_shape_meta=[B?,F|534]) (construction stack None)>,
'iLMT_0_readout': <_TemplateLayer(ReduceOutLayer)(:template:reduce_out) output/'iLMT_0_readout' out_type=Data(batch_shape_meta=[B?,F|500]) (construction stack 'iLMT_0_output_prob')>,
'iLMT_0_readout_in': <_TemplateLayer(LinearLayer)(:template:linear) output/'iLMT_0_readout_in' out_type=Data(batch_shape_meta=[B?,F|1000]) (construction stack 'iLMT_0_readout')>,
'output': <_TemplateLayer(ChoiceLayer)(:template:choice) output/'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[B]) (construction stack None)>,
'output_prob': <_TemplateLayer(SoftmaxLayer)(:template:softmax) output/'output_prob' out_type=Data(batch_shape_meta=[B?,F|534]) (construction stack None)>,
'prev_1_target_embed': <_TemplateLayer(CopyLayer)(:template:copy) output/'prev_1_target_embed' out_type=Data(batch_shape_meta=[B?,F|621]) (construction stack 'iLMT_0_FF_0')>,
'prev_2_target_embed': <_TemplateLayer(CopyLayer)(:template:copy) output/'prev_2_target_embed' out_type=Data(batch_shape_meta=[B?,F|621]) (construction stack 'iLMT_0_FF_0')>,
'readout': <_TemplateLayer(ReduceOutLayer)(:template:reduce_out) output/'readout' out_type=Data(batch_shape_meta=[B?,F|500]) (construction stack 'output_prob')>,
'readout_in': <_TemplateLayer(LinearLayer)(:template:linear) output/'readout_in' out_type=Data(batch_shape_meta=[B?,F|1000]) (construction stack 'readout')>,
'target_embed': <_TemplateLayer(LinearLayer)(:template:linear) output/'target_embed' out_type=Data(batch_shape_meta=[B?,F|621]) (construction stack 'iLMT_0_FF_0')>,
'weight_feedback': <_TemplateLayer(LinearLayer)(:template:linear) output/'weight_feedback' out_type=Data(batch_shape_meta=[T|'spatial:0:lstm0_pool',B?,F|1024]) (construction stack 'energy_in')>,
'zero_att': <_TemplateLayer(EvalLayer)(:template:eval) output/'zero_att' out_type=Data(batch_shape_meta=[B?,F|1024]) (construction stack 'iLMT_0_FF_0')>}
Collected (unique) exceptions during template construction:
(Note that many of these can be ignored, or are expected.)
EXCEPTION while constructing layer 'accum_att_weights'
NetworkConstructionDependencyLoopException: <TFNetwork 'root/output(rec-subnet)' parent_net=<TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>: Error: There is a dependency loop on layer 'accum_att_weights'.
Construction stack (most recent first):
accum_att_weights
weight_feedback
energy_in
energy_tanh
energy
att_weights
att0
att
zero_att
iLMT_0_FF_0
iLMT_0_readout_in
iLMT_0_readout
iLMT_0_output_prob
Exception occurred during output-net construction of layer 'iLMT_0_readout_in'.
Exception occurred during output-net construction of layer 'iLMT_0_readout'.
Exception occurred during output-net construction of layer 'iLMT_0_output_prob'.
Exception creating layer root/'output' of class RecLayer with opts:
{'_name': 'output',
'_network': <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>,
'_target_layers': {'bpe': <SourceLayer 'data:bpe' out_type=Data(dtype='int32', sparse=True, dim=534, available_for_inference=False, batch_shape_meta=[B,T|'time:var:extern_data:bpe'])>},
'_time_dim_tag': DimensionTag(kind='spatial', description='time:var:extern_data:bpe', id=23324237015024),
'max_seq_len': <tf.Tensor 'max_seq_len_encoder:0' shape=() dtype=int32>,
'n_out': <class 'returnn.util.basic.NotSpecified'>,
'name': 'output',
'network': <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>,
'output': Data(name='output_output', dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B]),
'sources': [],
'target': 'bpe',
'unit': <_SubnetworkRecCell 'root/output(rec-subnet)'>}
Unhandled exception <class 'AssertionError'> in thread <_MainThread(MainThread, started 23325610391296)>, proc 2876.
...
EXCEPTION
Traceback (most recent call last):
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/rnn.py", line 11, in <module>
line: main()
locals:
main = <local> <function main at 0x1536dcbee310>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/__main__.py", line 659, in main
line: execute_main_task()
locals:
execute_main_task = <global> <function execute_main_task at 0x1536dcbee1f0>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/__main__.py", line 457, in execute_main_task
line: engine.init_train_from_config(config, train_data, dev_data, eval_data)
locals:
engine = <global> <returnn.tf.engine.Engine object at 0x153698aa2ac0>
engine.init_train_from_config = <global> <bound method Engine.init_train_from_config of <returnn.tf.engine.Engine object at 0x153698aa2ac0>>
config = <global> <returnn.config.Config object at 0x1536ea9d3280>
train_data = <global> <ExternSprintDataset 'train' epoch=1>
dev_data = <global> <ExternSprintDataset 'dev' epoch=1>
eval_data = <global> None
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/engine.py", line 1031, in Engine.init_train_from_config
line: self.init_network_from_config(config)
locals:
self = <local> <returnn.tf.engine.Engine object at 0x153698aa2ac0>
self.init_network_from_config = <local> <bound method Engine.init_network_from_config of <returnn.tf.engine.Engine object at 0x153698aa2ac0>>
config = <local> <returnn.config.Config object at 0x1536ea9d3280>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/engine.py", line 1096, in Engine.init_network_from_config
line: self._init_network(net_desc=net_dict, epoch=self.epoch)
locals:
self = <local> <returnn.tf.engine.Engine object at 0x153698aa2ac0>
self._init_network = <local> <bound method Engine._init_network of <returnn.tf.engine.Engine object at 0x153698aa2ac0>>
net_desc = <not found>
net_dict = <local> {'conv0': {'L2': 0.0005, 'activation': None, 'class': 'conv', 'filter_size': (3, 3), 'from': 'source0', 'n_out': 32, 'padding': 'same', 'with_bias': True}, 'conv0p': {'class': 'pool', 'from': 'conv0', 'mode': 'max', 'padding': 'same', 'pool_size': (1, 2), 'trainable': False}, 'conv1': {'L2': 0.00..., len = 20
epoch = <local> None
self.epoch = <local> 1
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/engine.py", line 1275, in Engine._init_network
line: self.network, self.updater = self.create_network(
config=self.config,
extern_data=extern_data,
rnd_seed=net_random_seed,
train_flag=train_flag, eval_flag=self.use_eval_flag, search_flag=self.use_search_flag,
initial_learning_rate=getattr(self, "initial_learning_rate", None),
net_dict=net_desc)
locals:
self = <local> <returnn.tf.engine.Engine object at 0x153698aa2ac0>
self.network = <local> None
self.updater = <local> None
self.create_network = <local> <bound method Engine.create_network of <class 'returnn.tf.engine.Engine'>>
config = <not found>
self.config = <local> <returnn.config.Config object at 0x1536ea9d3280>
extern_data = <local> <ExternData data={'bpe': Data(name='bpe', dtype='int32', sparse=True, dim=534, available_for_inference=False, batch_shape_meta=[B,T|'time:var:extern_data:bpe']), 'data': Data(name='data', batch_shape_meta=[B,T|'time:var:extern_data:data',F|40])}>
rnd_seed = <not found>
net_random_seed = <local> 1
train_flag = <local> <tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>
eval_flag = <not found>
self.use_eval_flag = <local> True
search_flag = <not found>
self.use_search_flag = <local> False
initial_learning_rate = <not found>
getattr = <builtin> <built-in function getattr>
net_dict = <not found>
net_desc = <local> {'conv0': {'L2': 0.0005, 'activation': None, 'class': 'conv', 'filter_size': (3, 3), 'from': 'source0', 'n_out': 32, 'padding': 'same', 'with_bias': True}, 'conv0p': {'class': 'pool', 'from': 'conv0', 'mode': 'max', 'padding': 'same', 'pool_size': (1, 2), 'trainable': False}, 'conv1': {'L2': 0.00..., len = 20
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/engine.py", line 1316, in Engine.create_network
line: network.construct_from_dict(net_dict)
locals:
network = <local> <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
network.construct_from_dict = <local> <bound method TFNetwork.construct_from_dict of <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
net_dict = <local> {'conv0': {'L2': 0.0005, 'activation': None, 'class': 'conv', 'filter_size': (3, 3), 'from': 'source0', 'n_out': 32, 'padding': 'same', 'with_bias': True}, 'conv0p': {'class': 'pool', 'from': 'conv0', 'mode': 'max', 'padding': 'same', 'pool_size': (1, 2), 'trainable': False}, 'conv1': {'L2': 0.00..., len = 20
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 564, in TFNetwork.construct_from_dict
line: self.construct_layer(net_dict, name, get_layer=get_layer)
locals:
self = <local> <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self.construct_layer = <local> <bound method TFNetwork.construct_layer of <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
net_dict = <local> {'conv0': {'L2': 0.0005, 'activation': None, 'class': 'conv', 'filter_size': (3, 3), 'from': 'source0', 'n_out': 32, 'padding': 'same', 'with_bias': True}, 'conv0p': {'class': 'pool', 'from': 'conv0', 'mode': 'max', 'padding': 'same', 'pool_size': (1, 2), 'trainable': False}, 'conv1': {'L2': 0.00..., len = 20
name = <local> 'decision', len = 8
get_layer = <local> None
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 883, in TFNetwork.construct_layer
line: layer_class.transform_config_dict(layer_desc, network=net, get_layer=get_layer)
locals:
layer_class = <local> <class 'returnn.tf.layers.rec.DecideLayer'>
layer_class.transform_config_dict = <local> <bound method BaseChoiceLayer.transform_config_dict of <class 'returnn.tf.layers.rec.DecideLayer'>>
layer_desc = <local> {'loss': 'edit_distance', 'target': 'bpe', '_network': <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>, '_name': 'decision'}
network = <not found>
net = <local> <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
get_layer = <local> <function TFNetwork.construct_layer.<locals>.get_layer at 0x153698bfaf70>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/rec.py", line 4375, in BaseChoiceLayer.transform_config_dict
line: super(BaseChoiceLayer, cls).transform_config_dict(d, network=network, get_layer=get_layer)
locals:
super = <builtin> <class 'super'>
BaseChoiceLayer = <global> <class 'returnn.tf.layers.rec.BaseChoiceLayer'>
cls = <local> <class 'returnn.tf.layers.rec.DecideLayer'>
transform_config_dict = <not found>
d = <local> {'loss': 'edit_distance', 'target': 'bpe', '_network': <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>, '_name': 'decision'}
network = <local> <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
get_layer = <local> <function TFNetwork.construct_layer.<locals>.get_layer at 0x153698bfaf70>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 466, in LayerBase.transform_config_dict
line: d["sources"] = [
get_layer(src_name)
for src_name in src_names
if not src_name == "none"]
locals:
d = <local> {'loss': 'edit_distance', 'target': 'bpe', '_network': <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>, '_name': 'decision'}
get_layer = <local> <function TFNetwork.construct_layer.<locals>.get_layer at 0x153698bfaf70>
src_name = <not found>
src_names = <local> ['output'], _[0]: {len = 6}
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 467, in <listcomp>
line: get_layer(src_name)
locals:
get_layer = <local> <function TFNetwork.construct_layer.<locals>.get_layer at 0x153698bfaf70>
src_name = <local> 'output', len = 6
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 770, in TFNetwork.construct_layer.<locals>.get_layer
line: return self.construct_layer(net_dict=net_dict, name=src_name, get_layer=get_layer, add_layer=add_layer)
locals:
self = <local> <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self.construct_layer = <local> <bound method TFNetwork.construct_layer of <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
net_dict = <local> {'conv0': {'L2': 0.0005, 'activation': None, 'class': 'conv', 'filter_size': (3, 3), 'from': 'source0', 'n_out': 32, 'padding': 'same', 'with_bias': True}, 'conv0p': {'class': 'pool', 'from': 'conv0', 'mode': 'max', 'padding': 'same', 'pool_size': (1, 2), 'trainable': False}, 'conv1': {'L2': 0.00..., len = 20
name = <not found>
src_name = <local> 'output', len = 6
get_layer = <local> <function TFNetwork.construct_layer.<locals>.get_layer at 0x153698bfaf70>
add_layer = <local> <bound method TFNetwork.add_layer of <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 890, in TFNetwork.construct_layer
line: return add_layer(name=name_with_prefix, layer_class=layer_class, **layer_desc)
locals:
add_layer = <local> <bound method TFNetwork.add_layer of <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
name = <local> 'output', len = 6
name_with_prefix = <local> 'output', len = 6
layer_class = <local> <class 'returnn.tf.layers.rec.RecLayer'>
layer_desc = <local> {'max_seq_len': <tf.Tensor 'max_seq_len_encoder:0' shape=() dtype=int32>, 'target': 'bpe', '_network': <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>, '_name': 'output', 'n_out': <class 'returnn.util.basic.NotSpecified'>, 'sources': [], '_target_layers': {'bpe': <..., len = 9
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 1045, in TFNetwork.add_layer
line: layer = self._create_layer(name=name, layer_class=layer_class, **layer_desc)
locals:
layer = <not found>
self = <local> <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self._create_layer = <local> <bound method TFNetwork._create_layer of <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
name = <local> 'output', len = 6
layer_class = <local> <class 'returnn.tf.layers.rec.RecLayer'>
layer_desc = <local> {'max_seq_len': <tf.Tensor 'max_seq_len_encoder:0' shape=() dtype=int32>, 'target': 'bpe', '_network': <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>, '_name': 'output', 'n_out': <class 'returnn.util.basic.NotSpecified'>, 'sources': [], '_target_layers': {'bpe': <..., len = 9
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 967, in TFNetwork._create_layer
line: layer = layer_class(**layer_desc)
locals:
layer = <not found>
layer_class = <local> <class 'returnn.tf.layers.rec.RecLayer'>
layer_desc = <local> {'max_seq_len': <tf.Tensor 'max_seq_len_encoder:0' shape=() dtype=int32>, 'target': 'bpe', '_network': <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>, '_name': 'output', 'n_out': <class 'returnn.util.basic.NotSpecified'>, 'sources': [], '_target_layers': {'bpe': <..., len = 12
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/rec.py", line 236, in RecLayer.__init__
line: y = self._get_output_subnet_unit(self.cell)
locals:
y = <not found>
self = <local> <RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])>
self._get_output_subnet_unit = <local> <bound method RecLayer._get_output_subnet_unit of <RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])>>
self.cell = <local> <_SubnetworkRecCell 'root/output(rec-subnet)'>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/rec.py", line 916, in RecLayer._get_output_subnet_unit
line: output = cell.get_output()
locals:
output = <not found>
cell = <local> <_SubnetworkRecCell 'root/output(rec-subnet)'>
cell.get_output = <local> <bound method _SubnetworkRecCell.get_output of <_SubnetworkRecCell 'root/output(rec-subnet)'>>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/rec.py", line 2488, in _SubnetworkRecCell.get_output
line: self._construct_output_layers_moved_out(
loop_accumulated=self.final_acc_tas_dict, seq_len=seq_len,
extra_output_layers=extra_output_layers, final_net_vars=final_net_vars)
locals:
self = <local> <_SubnetworkRecCell 'root/output(rec-subnet)'>
self._construct_output_layers_moved_out = <local> <bound method _SubnetworkRecCell._construct_output_layers_moved_out of <_SubnetworkRecCell 'root/output(rec-subnet)'>>
loop_accumulated = <not found>
self.final_acc_tas_dict = <local> {'output_FF_0': <tf.TensorArray 'output/rec/subnet_base/acc_ta_output_FF_0'>, 'output_att': <tf.TensorArray 'output/rec/subnet_base/acc_ta_output_att'>}
seq_len = <local> <tf.Tensor 'output/rec/subnet_base/check_seq_len_batch_size/check_input_dim/identity_with_dim_check:0' shape=(?,) dtype=int32>
extra_output_layers = <local> {'output'}, len = 1
final_net_vars = <local> ([<tf.Tensor 'output/rec/while/Exit_1:0' shape=(?, ?, 1) dtype=float32>, <tf.Tensor 'output/rec/while/Exit_2:0' shape=(?, 1024) dtype=float32>], [])
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/rec.py", line 3266, in _SubnetworkRecCell._construct_output_layers_moved_out
line: get_layer(layer_name)
locals:
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x1536980fe430>
layer_name = <local> 'iLMT_0_output_prob', len = 18
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/rec.py", line 3254, in _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer
line: return self.output_layers_net.construct_layer(self.net_dict, name=name, get_layer=get_layer)
locals:
self = <local> <_SubnetworkRecCell 'root/output(rec-subnet)'>
self.output_layers_net = <local> <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self.output_layers_net.construct_layer = <local> <bound method TFNetwork.construct_layer of <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
self.net_dict = <local> {'FF_0': {'L2': 0.0005, 'activation': 'tanh', 'class': 'linear', 'from': ['prev:target_embed', 'prev:prev_1_target_embed', 'prev:prev_2_target_embed', 'prev:att'], 'n_out': 1024, 'with_bias': True}, 'accum_att_weights': {'class': 'eval', 'eval': 'source(0) + source(1) * source(2) * 0.5', 'from': ..., len = 22
name = <local> 'iLMT_0_output_prob', len = 18
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x1536980fe430>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 883, in TFNetwork.construct_layer
line: layer_class.transform_config_dict(layer_desc, network=net, get_layer=get_layer)
locals:
layer_class = <local> <class 'returnn.tf.layers.basic.SoftmaxLayer'>
layer_class.transform_config_dict = <local> <bound method LayerBase.transform_config_dict of <class 'returnn.tf.layers.basic.SoftmaxLayer'>>
layer_desc = <local> {'L2': 0.0005, 'dropout': 0.3, 'loss': 'ce', 'loss_opts': {'label_smoothing': 0.1, 'scale': 1.0}, 'target': 'bpe', '_network': <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:b..., len = 7
network = <not found>
net = <local> <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x1536980fe430>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 466, in LayerBase.transform_config_dict
line: d["sources"] = [
get_layer(src_name)
for src_name in src_names
if not src_name == "none"]
locals:
d = <local> {'L2': 0.0005, 'dropout': 0.3, 'loss': 'ce', 'loss_opts': {'label_smoothing': 0.1, 'scale': 1.0}, 'target': 'bpe', '_network': <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:b..., len = 7
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x1536980fe430>
src_name = <not found>
src_names = <local> ['iLMT_0_readout'], _[0]: {len = 14}
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 467, in <listcomp>
line: get_layer(src_name)
locals:
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x1536980fe430>
src_name = <local> 'iLMT_0_readout', len = 14
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/rec.py", line 3254, in _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer
line: return self.output_layers_net.construct_layer(self.net_dict, name=name, get_layer=get_layer)
locals:
self = <local> <_SubnetworkRecCell 'root/output(rec-subnet)'>
self.output_layers_net = <local> <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self.output_layers_net.construct_layer = <local> <bound method TFNetwork.construct_layer of <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
self.net_dict = <local> {'FF_0': {'L2': 0.0005, 'activation': 'tanh', 'class': 'linear', 'from': ['prev:target_embed', 'prev:prev_1_target_embed', 'prev:prev_2_target_embed', 'prev:att'], 'n_out': 1024, 'with_bias': True}, 'accum_att_weights': {'class': 'eval', 'eval': 'source(0) + source(1) * source(2) * 0.5', 'from': ..., len = 22
name = <local> 'iLMT_0_readout', len = 14
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x1536980fe430>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 883, in TFNetwork.construct_layer
line: layer_class.transform_config_dict(layer_desc, network=net, get_layer=get_layer)
locals:
layer_class = <local> <class 'returnn.tf.layers.basic.ReduceOutLayer'>
layer_class.transform_config_dict = <local> <bound method LayerBase.transform_config_dict of <class 'returnn.tf.layers.basic.ReduceOutLayer'>>
layer_desc = <local> {'mode': 'max', 'num_pieces': 2, '_network': <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>, '_name': '...
network = <not found>
net = <local> <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x1536980fe430>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 466, in LayerBase.transform_config_dict
line: d["sources"] = [
get_layer(src_name)
for src_name in src_names
if not src_name == "none"]
locals:
d = <local> {'mode': 'max', 'num_pieces': 2, '_network': <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>, '_name': '...
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x1536980fe430>
src_name = <not found>
src_names = <local> ['iLMT_0_readout_in'], _[0]: {len = 17}
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 467, in <listcomp>
line: get_layer(src_name)
locals:
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x1536980fe430>
src_name = <local> 'iLMT_0_readout_in', len = 17
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/rec.py", line 3254, in _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer
line: return self.output_layers_net.construct_layer(self.net_dict, name=name, get_layer=get_layer)
locals:
self = <local> <_SubnetworkRecCell 'root/output(rec-subnet)'>
self.output_layers_net = <local> <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self.output_layers_net.construct_layer = <local> <bound method TFNetwork.construct_layer of <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
self.net_dict = <local> {'FF_0': {'L2': 0.0005, 'activation': 'tanh', 'class': 'linear', 'from': ['prev:target_embed', 'prev:prev_1_target_embed', 'prev:prev_2_target_embed', 'prev:att'], 'n_out': 1024, 'with_bias': True}, 'accum_att_weights': {'class': 'eval', 'eval': 'source(0) + source(1) * source(2) * 0.5', 'from': ..., len = 22
name = <local> 'iLMT_0_readout_in', len = 17
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x1536980fe430>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 883, in TFNetwork.construct_layer
line: layer_class.transform_config_dict(layer_desc, network=net, get_layer=get_layer)
locals:
layer_class = <local> <class 'returnn.tf.layers.basic.LinearLayer'>
layer_class.transform_config_dict = <local> <bound method LayerBase.transform_config_dict of <class 'returnn.tf.layers.basic.LinearLayer'>>
layer_desc = <local> {'activation': None, 'n_out': 1000, 'with_bias': True, '_network': <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dt...
network = <not found>
net = <local> <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x1536980fe430>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 466, in LayerBase.transform_config_dict
line: d["sources"] = [
get_layer(src_name)
for src_name in src_names
if not src_name == "none"]
locals:
d = <local> {'activation': None, 'n_out': 1000, 'with_bias': True, '_network': <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dt...
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x1536980fe430>
src_name = <not found>
src_names = <local> ['iLMT_0_FF_0', 'prev:target_embed', 'zero_att'], _[0]: {len = 11}
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 467, in <listcomp>
line: get_layer(src_name)
locals:
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x1536980fe430>
src_name = <local> 'iLMT_0_FF_0', len = 11
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/rec.py", line 3254, in _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer
line: return self.output_layers_net.construct_layer(self.net_dict, name=name, get_layer=get_layer)
locals:
self = <local> <_SubnetworkRecCell 'root/output(rec-subnet)'>
self.output_layers_net = <local> <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self.output_layers_net.construct_layer = <local> <bound method TFNetwork.construct_layer of <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
self.net_dict = <local> {'FF_0': {'L2': 0.0005, 'activation': 'tanh', 'class': 'linear', 'from': ['prev:target_embed', 'prev:prev_1_target_embed', 'prev:prev_2_target_embed', 'prev:att'], 'n_out': 1024, 'with_bias': True}, 'accum_att_weights': {'class': 'eval', 'eval': 'source(0) + source(1) * source(2) * 0.5', 'from': ..., len = 22
name = <local> 'iLMT_0_FF_0', len = 11
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x1536980fe430>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 890, in TFNetwork.construct_layer
line: return add_layer(name=name_with_prefix, layer_class=layer_class, **layer_desc)
locals:
add_layer = <local> <bound method TFNetwork.add_layer of <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
name = <local> 'iLMT_0_FF_0', len = 11
name_with_prefix = <local> 'iLMT_0_FF_0', len = 11
layer_class = <local> <class 'returnn.tf.layers.basic.LinearLayer'>
layer_desc = <local> {'L2': 0.0005, 'activation': 'tanh', 'n_out': 1024, 'reuse_params': <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer outp..., len = 8
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 1045, in TFNetwork.add_layer
line: layer = self._create_layer(name=name, layer_class=layer_class, **layer_desc)
locals:
layer = <not found>
self = <local> <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self._create_layer = <local> <bound method TFNetwork._create_layer of <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(dtype='int32', sparse=True, dim=534, batch_shape_meta=[T|'time:var:extern_data:bpe',B])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
name = <local> 'iLMT_0_FF_0', len = 11
layer_class = <local> <class 'returnn.tf.layers.basic.LinearLayer'>
layer_desc = <local> {'L2': 0.0005, 'activation': 'tanh', 'n_out': 1024, 'reuse_params': <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer outp..., len = 8
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 967, in TFNetwork._create_layer
line: layer = layer_class(**layer_desc)
locals:
layer = <not found>
layer_class = <local> <class 'returnn.tf.layers.basic.LinearLayer'>
layer_desc = <local> {'L2': 0.0005, 'activation': 'tanh', 'n_out': 1024, 'reuse_params': <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer outp..., len = 11
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/basic.py", line 1456, in LinearLayer.__init__
line: weights = self.add_param(tf_compat.v1.get_variable(
name="W", shape=weights_shape, dtype=tf.float32, initializer=fwd_weights_initializer))
locals:
weights = <not found>
self = <local> <LinearLayer output/'iLMT_0_FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>
self.add_param = <local> <bound method LayerBase.add_param of <LinearLayer output/'iLMT_0_FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>>
tf_compat = <global> <module 'returnn.tf.compat' from '/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/compat.py'>
tf_compat.v1 = <global> <module 'tensorflow._api.v2.compat.v1' from '/work/tools/asr/python/3.8.0_tf_2.3-v1-generic+cuda10.1/lib/python3.8/site-packages/tensorflow/_api/v2/compat/v1/__init__.py'>
tf_compat.v1.get_variable = <global> <function get_variable at 0x1536ba483550>
name = <not found>
shape = <not found>
weights_shape = <local> (2887, 1024)
dtype = <not found>
tf = <global> <module 'tensorflow' from '/work/tools/asr/python/3.8.0_tf_2.3-v1-generic+cuda10.1/lib/python3.8/site-packages/tensorflow/__init__.py'>
tf.float32 = <global> tf.float32
initializer = <not found>
fwd_weights_initializer = <local> <tensorflow.python.ops.init_ops.GlorotUniform object at 0x153698086070>
File "/work/tools/asr/python/3.8.0_tf_2.3-v1-generic+cuda10.1/lib/python3.8/site-packages/tensorflow/python/ops/variable_scope.py", line 1556, in get_variable
line: return get_variable_scope().get_variable(
_get_default_variable_store(),
name,
shape=shape,
dtype=dtype,
initializer=initializer,
regularizer=regularizer,
trainable=trainable,
collections=collections,
caching_device=caching_device,
partitioner=partitioner,
validate_shape=validate_shape,
use_resource=use_resource,
custom_getter=custom_getter,
constraint=constraint,
synchronization=synchronization,
aggregation=aggregation)
locals:
get_variable_scope = <global> <function get_variable_scope at 0x1536ba483310>
get_variable = <global> <function get_variable at 0x1536ba483550>
_get_default_variable_store = <global> <function _get_default_variable_store at 0x1536ba4833a0>
name = <local> 'W'
shape = <local> (2887, 1024)
dtype = <local> tf.float32
initializer = <local> <tensorflow.python.ops.init_ops.GlorotUniform object at 0x153698086070>
regularizer = <local> None
trainable = <local> None
collections = <local> None
caching_device = <local> None
partitioner = <local> None
validate_shape = <local> True
use_resource = <local> None
custom_getter = <local> None
constraint = <local> None
synchronization = <local> <VariableSynchronization.AUTO: 0>
aggregation = <local> <VariableAggregation.NONE: 0>
File "/work/tools/asr/python/3.8.0_tf_2.3-v1-generic+cuda10.1/lib/python3.8/site-packages/tensorflow/python/ops/variable_scope.py", line 1299, in VariableScope.get_variable
line: return var_store.get_variable(
full_name,
shape=shape,
dtype=dtype,
initializer=initializer,
regularizer=regularizer,
reuse=reuse,
trainable=trainable,
collections=collections,
caching_device=caching_device,
partitioner=partitioner,
validate_shape=validate_shape,
use_resource=use_resource,
custom_getter=custom_getter,
constraint=constraint,
synchronization=synchronization,
aggregation=aggregation)
locals:
var_store = <local> <tensorflow.python.ops.variable_scope._VariableStore object at 0x153698dcaa60>
var_store.get_variable = <local> <bound method _VariableStore.get_variable of <tensorflow.python.ops.variable_scope._VariableStore object at 0x153698dcaa60>>
full_name = <local> 'output/rec/iLMT_0_FF_0/W', len = 24
shape = <local> (2887, 1024)
dtype = <local> tf.float32
initializer = <local> <tensorflow.python.ops.init_ops.GlorotUniform object at 0x153698086070>
regularizer = <local> None
reuse = <local> <_ReuseMode.AUTO_REUSE: 1>
trainable = <local> None
collections = <local> None
caching_device = <local> None
partitioner = <local> None
validate_shape = <local> True
use_resource = <local> None
custom_getter = <local> <function ReuseParams.get_variable_scope.<locals>._variable_custom_getter at 0x15369807c5e0>
constraint = <local> None
synchronization = <local> <VariableSynchronization.AUTO: 0>
aggregation = <local> <VariableAggregation.NONE: 0>
File "/work/tools/asr/python/3.8.0_tf_2.3-v1-generic+cuda10.1/lib/python3.8/site-packages/tensorflow/python/ops/variable_scope.py", line 552, in _VariableStore.get_variable
line: return custom_getter(**custom_getter_kwargs)
locals:
custom_getter = <local> <function ReuseParams.get_variable_scope.<locals>._variable_custom_getter at 0x15369807c5e0>
custom_getter_kwargs = <local> {'getter': <function _VariableStore.get_variable.<locals>._true_getter at 0x15369807c670>, 'name': 'output/rec/iLMT_0_FF_0/W', 'shape': (2887, 1024), 'dtype': tf.float32, 'initializer': <tensorflow.python.ops.init_ops.GlorotUniform object at 0x153698086070>, 'regularizer': None, 'reuse': <_ReuseM..., len = 16
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 1801, in ReuseParams.get_variable_scope.<locals>._variable_custom_getter
line: return self.variable_custom_getter(base_layer=base_layer, **kwargs_)
locals:
self = <local> <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bp...
self.variable_custom_getter = <local> <bound method ReuseParams.variable_custom_getter of <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_ty...
base_layer = <local> <LinearLayer output/'iLMT_0_FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>
kwargs_ = <local> {'getter': <function _VariableStore.get_variable.<locals>._true_getter at 0x15369807c670>, 'name': 'output/rec/iLMT_0_FF_0/W', 'shape': (2887, 1024), 'dtype': tf.float32, 'initializer': <tensorflow.python.ops.init_ops.GlorotUniform object at 0x153698086070>, 'regularizer': None, 'reuse': <_ReuseM..., len = 16
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 1839, in ReuseParams.variable_custom_getter
line: return self.param_map[param_name].variable_custom_getter(
getter=getter, name=name, base_layer=base_layer, **kwargs)
locals:
self = <local> <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bp...
self.param_map = <local> {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>, map None>}
param_name = <local> 'W'
variable_custom_getter = <not found>
getter = <local> <function _VariableStore.get_variable.<locals>._true_getter at 0x15369807c670>
name = <local> 'output/rec/iLMT_0_FF_0/W', len = 24
base_layer = <local> <LinearLayer output/'iLMT_0_FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>
kwargs = <local> {'shape': (2887, 1024), 'dtype': tf.float32, 'initializer': <tensorflow.python.ops.init_ops.GlorotUniform object at 0x153698086070>, 'regularizer': None, 'reuse': <_ReuseMode.AUTO_REUSE: 1>, 'trainable': True, 'collections': None, 'caching_device': None, 'partitioner': None, 'validate_shape': Tru..., len = 14
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 1843, in ReuseParams.variable_custom_getter
line: assert param_name in self.reuse_layer.params
locals:
param_name = <local> 'W'
self = <local> <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>, map None>
self.reuse_layer = <local> <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:bpe',B,F|1024])>
self.reuse_layer.params = <local> {}
AssertionError
Your error/log does not directly correspond to the example net you posted initially. Can you post the exact error you get for your example?
Regarding the error itself: at first glance, this might be because the layer FF_0 is inside the loop, while the reuse-params logic tries to access it from outside the loop.
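For concreteness, here is the failing setup paraphrased from the (truncated) net_dict in the log below; the 'b' entry of the map is cut off in the log and is assumed here to mirror 'W':

```python
# Paraphrased from the truncated net_dict repr in the log below.
# FF_1 reuses FF_0's params via reuse_params, but FF_0 stays inside the
# rec loop while FF_1 (and output1) get moved out of it.
net_dict = {
    'output': {'class': 'rec', 'from': 'data', 'unit': {
        'input': {'class': 'copy', 'from': ['prev:output', 'data:source']},
        'FF_0': {'class': 'linear', 'activation': 'tanh',
                 'from': ['input'], 'n_out': 10},
        'FF_1': {'class': 'linear', 'activation': 'tanh',
                 'from': ['input'], 'n_out': 10,
                 'reuse_params': {'map': {
                     'W': {'reuse_layer': 'FF_0'},
                     # 'b' entry assumed symmetric to 'W' (truncated in the log):
                     'b': {'reuse_layer': 'FF_0'}}}},
        # ... plus the 'output' and 'output1' softmax layers with losses ...
    }},
}
```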
Actually, the whole code of ReuseParams is quite fragile and partly hacky. I'm not sure we really need to access the other layer (reuse_layer) at all here. We should be able to just use the name scope, i.e. infer the name scope and access the variable that way.
This would need some cleanup and reimplementation. Unfortunately, I don't have much time for that currently. PRs are welcome. But I'm a bit afraid this needs some in-depth knowledge about RETURNN, and someone should help here, or at least look over it. @patrick-wilken maybe?
Maybe you are also fine with some workaround for now. For example, you could just set a custom getter function, something like this (untested, but I hope you get the idea):
def get_var(name):
  with reuse_name_scope("", absolute=True):
    return tf.get_variable(name)
...
'reuse_params': {'map': {'W': {'custom': lambda **_kwargs: get_var("output/FF_0/W")},
                         'b': {'custom': lambda **_kwargs: get_var("output/FF_0/b")}}}},
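Conceptually, such a custom getter is just name redirection: the lookup for the new layer's parameter is rerouted to the variable already created under the other layer's absolute name scope. A TF-free toy sketch of that idea (all class and function names here are made up for illustration, not real RETURNN/TF API):

```python
# Toy model of a TF1-style variable store with a custom getter that
# redirects parameter lookups to another layer's name scope.
class VariableStore:
    def __init__(self):
        self._vars = {}  # absolute variable name -> value

    def get_variable(self, name, initializer=None, custom_getter=None):
        if custom_getter is not None:
            # the custom getter may rewrite the name before the real lookup
            return custom_getter(self._true_getter, name, initializer)
        return self._true_getter(name, initializer)

    def _true_getter(self, name, initializer):
        if name not in self._vars:
            assert initializer is not None, "variable %r does not exist yet" % name
            self._vars[name] = initializer()
        return self._vars[name]


def make_reuse_getter(param_map):
    """param_map: param base name (e.g. 'W') -> absolute name of the var to reuse."""
    def getter(true_getter, name, initializer):
        base = name.rsplit("/", 1)[-1]  # e.g. 'W' from 'output/FF_1/W'
        if base in param_map:
            # redirect to the already-created variable; ignore the initializer
            return true_getter(param_map[base], None)
        return true_getter(name, initializer)
    return getter


store = VariableStore()
# FF_0 creates its own params:
w0 = store.get_variable("output/FF_0/W", initializer=lambda: [[0.1, 0.2]])
# FF_1 reuses FF_0's params via the custom getter:
reuse = make_reuse_getter({"W": "output/FF_0/W"})
w1 = store.get_variable("output/FF_1/W", custom_getter=reuse)
assert w1 is w0  # same underlying variable, i.e. shared weights
```

The point of the redirection-by-name variant is that it never needs a reference to the reuse layer object itself, only its name scope, which sidesteps the in-loop vs. moved-out-of-loop problem.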
I will post the logs for the examples above later, since I'm getting some unrelated error for the second case that didn't appear before. For the first case, reuse_params works like in the demos in test_TFNetworkLayer.py. For the second one, I got the same error as in the log above, but I need to reproduce it.
Thank you for the idea with an explicit function. I have also seen in Slack that one can use the prefix extra.any_name:crf to share the weights, but I didn't find any documentation about its usage on the RETURNN docs website.
This is the log for the second case with two outputs:
Train data:
input: 9 x 1
output: {'classes': (2, 1), 'data': (9, 2)}
Task12AXDataset, sequences: 1000, frames: unknown
Dev data:
Task12AXDataset, sequences: 100, frames: unknown
Device not set explicitly, and we found a GPU, which we will use.
Setup TF session with options {'log_device_placement': False, 'device_count': {'GPU': 1}} ...
layer root/'data' output: Data(name='data', batch_shape_meta=[B,T|'time:var:extern_data:data',F|9])
layer root/'output' output: Data(name='output_output', batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])
Rec layer 'output' (search False, train 'globals/train_flag:0') sub net:
Input layers moved out of loop: (#: 0)
None
Output layers moved out of loop: (#: 2)
output1
FF_1
Layers in loop: (#: 3)
output
FF_0
input
Unused layers: (#: 0)
None
layer root/output(rec-subnet)/'data:source' output: Data(name='data', batch_shape_meta=[B,F|9])
layer root/output(rec-subnet)/'input' output: Data(name='input_output', batch_shape_meta=[B,F|11])
layer root/output(rec-subnet)/'FF_0' output: Data(name='FF_0_output', batch_shape_meta=[B,F|10])
Exception occurred during in-loop construction of layer 'classes'.
We had previous exceptions at template construction, which got resolved, but maybe sth is wrong.
Template network (check out types / shapes):
{'FF_0': <_TemplateLayer(LinearLayer)(:template:linear) output/'FF_0' out_type=Data(batch_shape_meta=[B?,F|10]) (construction stack 'output')>,
'FF_1': <_TemplateLayer(LinearLayer)(:template:linear) output/'FF_1' out_type=Data(batch_shape_meta=[B?,F|10]) (construction stack 'output1')>,
'data:classes': <_TemplateLayer(SourceLayer)(:template:source) output/'data:classes' out_type=Data(dtype='int32', sparse=True, dim=2, available_for_inference=False, batch_shape_meta=[B]) (construction stack 'output')>,
'data:source': <_TemplateLayer(SourceLayer)(:template:source) output/'data:source' out_type=Data(batch_shape_meta=[B,F|9]) (construction stack 'input')>,
'input': <_TemplateLayer(CopyLayer)(:template:copy) output/'input' out_type=Data(batch_shape_meta=[B,F|11]) (construction stack 'FF_0')>,
'output': <_TemplateLayer(SoftmaxLayer)(:template:softmax) output/'output' out_type=Data(batch_shape_meta=[B?,F|2]) (construction stack None)>,
'output1': <_TemplateLayer(SoftmaxLayer)(:template:softmax) output/'output1' out_type=Data(batch_shape_meta=[B?,F|2]) (construction stack None)>}
Collected (unique) exceptions during template construction:
(Note that many of these can be ignored, or are expected.)
EXCEPTION while constructing layer 'input'
NetworkConstructionDependencyLoopException: <TFNetwork 'root/output(rec-subnet)' parent_net=<TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>: Error: There is a dependency loop on layer 'output'.
Construction stack (most recent first):
input
FF_0
output
layer root/output(rec-subnet)/'data:classes' output: Data(name='classes', dtype='int32', sparse=True, dim=2, available_for_inference=False, batch_shape_meta=[B])
layer root/output(rec-subnet)/'output' output: Data(name='output_output', batch_shape_meta=[B,F|2])
layer root/output(rec-subnet-output)/'input' output: Data(name='input_output', batch_shape_meta=[T|'time:var:extern_data:data',B,F|11])
layer root/output(rec-subnet-output)/'FF_0' output: Data(name='FF_0_output', batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])
layer root/output(rec-subnet-output)/'FF_1' output: Data(name='FF_1_output', batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])
Exception creating layer root/output(rec-subnet-output)/'FF_1' of class LinearLayer with opts:
{'_name': 'FF_1',
'_network': <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>,
'activation': 'tanh',
'n_out': 10,
'name': 'FF_1',
'network': <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>,
'output': Data(name='FF_1_output', batch_shape_meta=[T|'time:var:extern_data:data',B,F|10]),
'reuse_params': <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>, map None>}>,
'sources': [<InternalLayer output/'input' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|11])>]}
Exception occurred during output-net construction of layer 'FF_1'.
Exception occurred during output-net construction of layer 'output1'.
Exception creating layer root/'output' of class RecLayer with opts:
{'_name': 'output',
'_network': <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>,
'_time_dim_tag': DimensionTag(kind='spatial', description='time:var:extern_data:data', id=140167542826432),
'n_out': <class 'returnn.util.basic.NotSpecified'>,
'name': 'output',
'network': <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>,
'output': Data(name='output_output', batch_shape_meta=[T|'time:var:extern_data:data',B,F|2]),
'sources': [<SourceLayer 'data' out_type=Data(batch_shape_meta=[B,T|'time:var:extern_data:data',F|9])>],
'unit': <_SubnetworkRecCell 'root/output(rec-subnet)'>}
Unhandled exception <class 'AssertionError'> in thread <_MainThread(MainThread, started 140169104250624)>, proc 28292.
Thread current, main, <_MainThread(MainThread, started 140169104250624)>:
(Excluded thread.)
That were all threads.
EXCEPTION
Traceback (most recent call last):
File "returnn/rnn.py", line 11, in <module>
line: main()
locals:
main = <local> <function main at 0x7f7b9bf03ae8>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/__main__.py", line 659, in main
line: execute_main_task()
locals:
execute_main_task = <global> <function execute_main_task at 0x7f7b9bf039d8>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/__main__.py", line 457, in execute_main_task
line: engine.init_train_from_config(config, train_data, dev_data, eval_data)
locals:
engine = <global> <returnn.tf.engine.Engine object at 0x7f7ba0c28978>
engine.init_train_from_config = <global> <bound method Engine.init_train_from_config of <returnn.tf.engine.Engine object at 0x7f7ba0c28978>>
config = <global> <returnn.config.Config object at 0x7f7ba9940358>
train_data = <global> <Task12AXDataset 'train' epoch=None>
dev_data = <global> <Task12AXDataset 'dev' epoch=None>
eval_data = <global> None
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/engine.py", line 1031, in Engine.init_train_from_config
line: self.init_network_from_config(config)
locals:
self = <local> <returnn.tf.engine.Engine object at 0x7f7ba0c28978>
self.init_network_from_config = <local> <bound method Engine.init_network_from_config of <returnn.tf.engine.Engine object at 0x7f7ba0c28978>>
config = <local> <returnn.config.Config object at 0x7f7ba9940358>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/engine.py", line 1096, in Engine.init_network_from_config
line: self._init_network(net_desc=net_dict, epoch=self.epoch)
locals:
self = <local> <returnn.tf.engine.Engine object at 0x7f7ba0c28978>
self._init_network = <local> <bound method Engine._init_network of <returnn.tf.engine.Engine object at 0x7f7ba0c28978>>
net_desc = <not found>
net_dict = <local> {'output': {'class': 'rec', 'from': 'data', 'unit': {'input': {'class': 'copy', 'from': ['prev:output', 'data:source']}, 'FF_0': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10}, 'FF_1': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10, 'reuse_para...
epoch = <local> None
self.epoch = <local> 1
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/engine.py", line 1281, in Engine._init_network
line: self.network, self.updater = self.create_network(
config=self.config,
extern_data=extern_data,
rnd_seed=net_random_seed,
train_flag=train_flag, eval_flag=self.use_eval_flag, search_flag=self.use_search_flag,
initial_learning_rate=getattr(self, "initial_learning_rate", None),
net_dict=net_desc)
locals:
self = <local> <returnn.tf.engine.Engine object at 0x7f7ba0c28978>
self.network = <local> None
self.updater = <local> None
self.create_network = <local> <bound method Engine.create_network of <class 'returnn.tf.engine.Engine'>>
config = <not found>
self.config = <local> <returnn.config.Config object at 0x7f7ba9940358>
extern_data = <local> <ExternData data={'classes': Data(name='classes', dtype='int32', sparse=True, dim=2, available_for_inference=False, batch_shape_meta=[B,T|'time:var:extern_data:classes']), 'data': Data(name='data', batch_shape_meta=[B,T|'time:var:extern_data:data',F|9])}>
rnd_seed = <not found>
net_random_seed = <local> 1
train_flag = <local> <tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>
eval_flag = <not found>
self.use_eval_flag = <local> True
search_flag = <not found>
self.use_search_flag = <local> False
initial_learning_rate = <not found>
getattr = <builtin> <built-in function getattr>
net_dict = <not found>
net_desc = <local> {'output': {'class': 'rec', 'from': 'data', 'unit': {'input': {'class': 'copy', 'from': ['prev:output', 'data:source']}, 'FF_0': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10}, 'FF_1': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10, 'reuse_para...
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/engine.py", line 1316, in Engine.create_network
line: network.construct_from_dict(net_dict)
locals:
network = <local> <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
network.construct_from_dict = <local> <bound method TFNetwork.construct_from_dict of <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
net_dict = <local> {'output': {'class': 'rec', 'from': 'data', 'unit': {'input': {'class': 'copy', 'from': ['prev:output', 'data:source']}, 'FF_0': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10}, 'FF_1': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10, 'reuse_para...
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 564, in TFNetwork.construct_from_dict
line: self.construct_layer(net_dict, name, get_layer=get_layer)
locals:
self = <local> <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self.construct_layer = <local> <bound method TFNetwork.construct_layer of <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
net_dict = <local> {'output': {'class': 'rec', 'from': 'data', 'unit': {'input': {'class': 'copy', 'from': ['prev:output', 'data:source']}, 'FF_0': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10}, 'FF_1': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10, 'reuse_para...
name = <local> 'output', len = 6
get_layer = <local> None
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 890, in TFNetwork.construct_layer
line: return add_layer(name=name_with_prefix, layer_class=layer_class, **layer_desc)
locals:
add_layer = <local> <bound method TFNetwork.add_layer of <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
name = <local> 'output', len = 6
name_with_prefix = <local> 'output', len = 6
layer_class = <local> <class 'returnn.tf.layers.rec.RecLayer'>
layer_desc = <local> {'_network': <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>, '_name': 'output', 'n_out': <class 'returnn.util.basic.NotSpecified'>, 'sources': [<SourceLayer 'data' out_type=Data(batch_shape_meta=[B,T|'time:var:extern_data:data',F|9])>], '_time_dim_tag': DimensionT..., len = 6
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 1045, in TFNetwork.add_layer
line: layer = self._create_layer(name=name, layer_class=layer_class, **layer_desc)
locals:
layer = <not found>
self = <local> <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self._create_layer = <local> <bound method TFNetwork._create_layer of <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
name = <local> 'output', len = 6
layer_class = <local> <class 'returnn.tf.layers.rec.RecLayer'>
layer_desc = <local> {'_network': <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>, '_name': 'output', 'n_out': <class 'returnn.util.basic.NotSpecified'>, 'sources': [<SourceLayer 'data' out_type=Data(batch_shape_meta=[B,T|'time:var:extern_data:data',F|9])>], '_time_dim_tag': DimensionT..., len = 6
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 967, in TFNetwork._create_layer
line: layer = layer_class(**layer_desc)
locals:
layer = <not found>
layer_class = <local> <class 'returnn.tf.layers.rec.RecLayer'>
layer_desc = <local> {'_network': <TFNetwork 'root' train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>, '_name': 'output', 'n_out': <class 'returnn.util.basic.NotSpecified'>, 'sources': [<SourceLayer 'data' out_type=Data(batch_shape_meta=[B,T|'time:var:extern_data:data',F|9])>], '_time_dim_tag': DimensionT..., len = 9
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/rec.py", line 236, in RecLayer.__init__
line: y = self._get_output_subnet_unit(self.cell)
locals:
y = <not found>
self = <local> <RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])>
self._get_output_subnet_unit = <local> <bound method RecLayer._get_output_subnet_unit of <RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])>>
self.cell = <local> <_SubnetworkRecCell 'root/output(rec-subnet)'>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/rec.py", line 2490, in _SubnetworkRecCell.get_output
line: self._construct_output_layers_moved_out(
loop_accumulated=self.final_acc_tas_dict, seq_len=seq_len,
extra_output_layers=extra_output_layers, final_net_vars=final_net_vars)
locals:
self = <local> <_SubnetworkRecCell 'root/output(rec-subnet)'>
self._construct_output_layers_moved_out = <local> <bound method _SubnetworkRecCell._construct_output_layers_moved_out of <_SubnetworkRecCell 'root/output(rec-subnet)'>>
loop_accumulated = <not found>
self.final_acc_tas_dict = <local> {'loss_output': <tf.TensorArray 'output/rec/subnet_base/acc_ta_loss_output'>, 'error_output': <tf.TensorArray 'output/rec/subnet_base/acc_ta_error_output'>, 'output_output': <tf.TensorArray 'output/rec/subnet_base/acc_ta_output_output'>, 'output_input': <tf.TensorArray 'output/rec/subnet_base/acc...
seq_len = <local> <tf.Tensor 'output/rec/subnet_base/check_seq_len_batch_size/check_input_dim/identity_with_dim_check:0' shape=(?,) dtype=int32>
extra_output_layers = <local> {'output'}, len = 1
final_net_vars = <local> ([<tf.Tensor 'output/rec/while/Exit_1:0' shape=(?, 2) dtype=float32>], [])
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/rec.py", line 3266, in _SubnetworkRecCell._construct_output_layers_moved_out
line: get_layer(layer_name)
locals:
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x7f7b20204f28>
layer_name = <local> 'output1', len = 7
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/rec.py", line 3254, in _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer
line: return self.output_layers_net.construct_layer(self.net_dict, name=name, get_layer=get_layer)
locals:
self = <local> <_SubnetworkRecCell 'root/output(rec-subnet)'>
self.output_layers_net = <local> <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self.output_layers_net.construct_layer = <local> <bound method TFNetwork.construct_layer of <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
self.net_dict = <local> {'input': {'class': 'copy', 'from': ['prev:output', 'data:source']}, 'FF_0': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10}, 'FF_1': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10, 'reuse_params': {'map': {'W': {'reuse_layer': 'FF_0'}, 'b': {'r...
name = <local> 'output1', len = 7
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x7f7b20204f28>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 883, in TFNetwork.construct_layer
line: layer_class.transform_config_dict(layer_desc, network=net, get_layer=get_layer)
locals:
layer_class = <local> <class 'returnn.tf.layers.basic.SoftmaxLayer'>
layer_class.transform_config_dict = <local> <bound method LayerBase.transform_config_dict of <class 'returnn.tf.layers.basic.SoftmaxLayer'>>
layer_desc = <local> {'loss': 'ce', '_network': <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>, '_name': 'output1'}
network = <not found>
net = <local> <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x7f7b20204f28>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 468, in LayerBase.transform_config_dict
line: for src_name in src_names
locals:
src_name = <not found>
src_names = <local> ['FF_1']
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 469, in <listcomp>
line: d["sources"] = [
get_layer(src_name)
for src_name in src_names
if not src_name == "none"]
locals:
d = <not found>
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x7f7b20204f28>
src_name = <local> 'FF_1'
src_names = <not found>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/rec.py", line 3254, in _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer
line: return self.output_layers_net.construct_layer(self.net_dict, name=name, get_layer=get_layer)
locals:
self = <local> <_SubnetworkRecCell 'root/output(rec-subnet)'>
self.output_layers_net = <local> <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self.output_layers_net.construct_layer = <local> <bound method TFNetwork.construct_layer of <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
self.net_dict = <local> {'input': {'class': 'copy', 'from': ['prev:output', 'data:source']}, 'FF_0': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10}, 'FF_1': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10, 'reuse_params': {'map': {'W': {'reuse_layer': 'FF_0'}, 'b': {'r...
name = <local> 'FF_1'
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x7f7b20204f28>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 890, in TFNetwork.construct_layer
line: return add_layer(name=name_with_prefix, layer_class=layer_class, **layer_desc)
locals:
add_layer = <local> <bound method TFNetwork.add_layer of <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
name = <local> 'FF_1'
name_with_prefix = <local> 'FF_1'
layer_class = <local> <class 'returnn.tf.layers.basic.LinearLayer'>
layer_desc = <local> {'activation': 'tanh', 'n_out': 10, 'reuse_params': <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_typ..., len = 6
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 1045, in TFNetwork.add_layer
line: layer = self._create_layer(name=name, layer_class=layer_class, **layer_desc)
locals:
layer = <not found>
self = <local> <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self._create_layer = <local> <bound method TFNetwork._create_layer of <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
name = <local> 'FF_1'
layer_class = <local> <class 'returnn.tf.layers.basic.LinearLayer'>
layer_desc = <local> {'activation': 'tanh', 'n_out': 10, 'reuse_params': <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_typ..., len = 6
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 967, in TFNetwork._create_layer
line: layer = layer_class(**layer_desc)
locals:
layer = <not found>
layer_class = <local> <class 'returnn.tf.layers.basic.LinearLayer'>
layer_desc = <local> {'activation': 'tanh', 'n_out': 10, 'reuse_params': <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_typ..., len = 9
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 469, in <listcomp>
line: d["sources"] = [
get_layer(src_name)
for src_name in src_names
if not src_name == "none"]
locals:
d = <not found>
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x7f7b20204f28>
src_name = <local> 'FF_1'
src_names = <not found>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/rec.py", line 3254, in _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer
line: return self.output_layers_net.construct_layer(self.net_dict, name=name, get_layer=get_layer)
locals:
self = <local> <_SubnetworkRecCell 'root/output(rec-subnet)'>
self.output_layers_net = <local> <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self.output_layers_net.construct_layer = <local> <bound method TFNetwork.construct_layer of <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
self.net_dict = <local> {'input': {'class': 'copy', 'from': ['prev:output', 'data:source']}, 'FF_0': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10}, 'FF_1': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10, 'reuse_params': {'map': {'W': {'reuse_layer': 'FF_0'}, 'b': {'r...
name = <local> 'FF_1'
get_layer = <local> <function _SubnetworkRecCell._construct_output_layers_moved_out.<locals>.get_layer at 0x7f7b20204f28>
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 890, in TFNetwork.construct_layer
line: return add_layer(name=name_with_prefix, layer_class=layer_class, **layer_desc)
locals:
add_layer = <local> <bound method TFNetwork.add_layer of <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
name = <local> 'FF_1'
name_with_prefix = <local> 'FF_1'
layer_class = <local> <class 'returnn.tf.layers.basic.LinearLayer'>
layer_desc = <local> {'activation': 'tanh', 'n_out': 10, 'reuse_params': <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_typ..., len = 6
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 1045, in TFNetwork.add_layer
line: layer = self._create_layer(name=name, layer_class=layer_class, **layer_desc)
locals:
layer = <not found>
self = <local> <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>
self._create_layer = <local> <bound method TFNetwork._create_layer of <TFNetwork 'root/output(rec-subnet-output)' parent_layer=<RecLayer 'output' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|2])> train=<tf.Tensor 'globals/train_flag:0' shape=() dtype=bool>>>
name = <local> 'FF_1'
layer_class = <local> <class 'returnn.tf.layers.basic.LinearLayer'>
layer_desc = <local> {'activation': 'tanh', 'n_out': 10, 'reuse_params': <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_typ..., len = 6
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/network.py", line 967, in TFNetwork._create_layer
line: layer = layer_class(**layer_desc)
locals:
layer = <not found>
layer_class = <local> <class 'returnn.tf.layers.basic.LinearLayer'>
layer_desc = <local> {'activation': 'tanh', 'n_out': 10, 'reuse_params': <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_typ..., len = 9
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/basic.py", line 1457, in LinearLayer.__init__
line: weights = self.add_param(tf_compat.v1.get_variable(
name="W", shape=weights_shape, dtype=tf.float32, initializer=fwd_weights_initializer))
locals:
weights = <not found>
self = <local> <LinearLayer output/'FF_1' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>
self.add_param = <local> <bound method LayerBase.add_param of <LinearLayer output/'FF_1' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>>
tf_compat = <global> <module 'returnn.tf.compat' from '/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/compat.py'>
tf_compat.v1 = <global> <module 'tensorflow' from '/u/beck/programs/python/3.6.1/lib/python3.6/site-packages/tensorflow/__init__.py'>
tf_compat.v1.get_variable = <global> <function get_variable at 0x7f7b4c24b730>
name = <not found>
shape = <not found>
weights_shape = <local> (11, 10)
dtype = <not found>
tf = <global> <module 'tensorflow' from '/u/beck/programs/python/3.6.1/lib/python3.6/site-packages/tensorflow/__init__.py'>
tf.float32 = <global> tf.float32
initializer = <not found>
fwd_weights_initializer = <local> <tensorflow.python.ops.init_ops.VarianceScaling object at 0x7f7b20221fd0>
File "/u/beck/programs/python/3.6.1/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1317, in get_variable
line: return get_variable_scope().get_variable(
_get_default_variable_store(), name, shape=shape, dtype=dtype,
initializer=initializer, regularizer=regularizer, trainable=trainable,
collections=collections, caching_device=caching_device,
partitioner=partitioner, validate_shape=validate_shape,
use_resource=use_resource, custom_getter=custom_getter,
constraint=constraint)
locals:
get_variable_scope = <global> <function get_variable_scope at 0x7f7b4c24b510>
get_variable = <global> <function get_variable at 0x7f7b4c24b730>
_get_default_variable_store = <global> <function _get_default_variable_store at 0x7f7b4c24b598>
name = <local> 'W'
shape = <local> (11, 10)
dtype = <local> tf.float32
initializer = <local> <tensorflow.python.ops.init_ops.VarianceScaling object at 0x7f7b20221fd0>
regularizer = <local> None
trainable = <local> True
collections = <local> None
caching_device = <local> None
partitioner = <local> None
validate_shape = <local> True
use_resource = <local> None
custom_getter = <local> None
constraint = <local> None
File "/u/beck/programs/python/3.6.1/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1079, in VariableScope.get_variable
line: return var_store.get_variable(
full_name, shape=shape, dtype=dtype, initializer=initializer,
regularizer=regularizer, reuse=reuse, trainable=trainable,
collections=collections, caching_device=caching_device,
partitioner=partitioner, validate_shape=validate_shape,
use_resource=use_resource, custom_getter=custom_getter,
constraint=constraint)
locals:
var_store = <local> <tensorflow.python.ops.variable_scope._VariableStore object at 0x7f7b4c9969e8>
var_store.get_variable = <local> <bound method _VariableStore.get_variable of <tensorflow.python.ops.variable_scope._VariableStore object at 0x7f7b4c9969e8>>
full_name = <local> 'output/rec/FF_1/W', len = 17
shape = <local> (11, 10)
dtype = <local> tf.float32
initializer = <local> <tensorflow.python.ops.init_ops.VarianceScaling object at 0x7f7b20221fd0>
regularizer = <local> None
reuse = <local> <_ReuseMode.AUTO_REUSE: 1>
trainable = <local> True
collections = <local> None
caching_device = <local> None
partitioner = <local> None
validate_shape = <local> True
use_resource = <local> None
custom_getter = <local> <function ReuseParams.get_variable_scope.<locals>._variable_custom_getter at 0x7f7b20217268>
constraint = <local> None
File "/u/beck/programs/python/3.6.1/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 417, in _VariableStore.get_variable
line: return custom_getter(**custom_getter_kwargs)
locals:
custom_getter = <local> <function ReuseParams.get_variable_scope.<locals>._variable_custom_getter at 0x7f7b20217268>
custom_getter_kwargs = <local> {'getter': <function _VariableStore.get_variable.<locals>._true_getter at 0x7f7b202172f0>, 'name': 'output/rec/FF_1/W', 'shape': (11, 10), 'dtype': tf.float32, 'initializer': <tensorflow.python.ops.init_ops.VarianceScaling object at 0x7f7b20221fd0>, 'regularizer': None, 'reuse': <_ReuseMode.AUTO_..., len = 13
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 1801, in ReuseParams.get_variable_scope.<locals>._variable_custom_getter
line: return self.variable_custom_getter(base_layer=base_layer, **kwargs_)
locals:
self = <local> <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:dat...
self.variable_custom_getter = <local> <bound method ReuseParams.variable_custom_getter of <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_typ...
base_layer = <local> <LinearLayer output/'FF_1' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>
kwargs_ = <local> {'getter': <function _VariableStore.get_variable.<locals>._true_getter at 0x7f7b202172f0>, 'name': 'output/rec/FF_1/W', 'shape': (11, 10), 'dtype': tf.float32, 'initializer': <tensorflow.python.ops.init_ops.VarianceScaling object at 0x7f7b20221fd0>, 'regularizer': None, 'reuse': <_ReuseMode.AUTO_..., len = 13
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 1840, in ReuseParams.variable_custom_getter
line: return self.param_map[param_name].variable_custom_getter(
getter=getter, name=name, base_layer=base_layer, **kwargs)
locals:
self = <local> <ReuseParams reuse_layer None, map {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:dat...
self.param_map = <local> {'W': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>, map None>, 'b': <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>, map None>}
param_name = <local> 'W'
variable_custom_getter = <not found>
getter = <local> <function _VariableStore.get_variable.<locals>._true_getter at 0x7f7b202172f0>
name = <local> 'output/rec/FF_1/W', len = 17
base_layer = <local> <LinearLayer output/'FF_1' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>
kwargs = <local> {'shape': (11, 10), 'dtype': tf.float32, 'initializer': <tensorflow.python.ops.init_ops.VarianceScaling object at 0x7f7b20221fd0>, 'regularizer': None, 'reuse': <_ReuseMode.AUTO_REUSE: 1>, 'trainable': True, 'collections': None, 'caching_device': None, 'partitioner': None, 'validate_shape': True,..., len = 11
File "/u/glushko/setups/switchboard/2021-06-21--ilmt-att-sis/returnn/returnn/tf/layers/base.py", line 1843, in ReuseParams.variable_custom_getter
line: assert param_name in self.reuse_layer.params
locals:
param_name = <local> 'W'
self = <local> <ReuseParams reuse_layer <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>, map None>
self.reuse_layer = <local> <InternalLayer output/'FF_0' out_type=Data(batch_shape_meta=[T|'time:var:extern_data:data',B,F|10])>
self.reuse_layer.params = <local> {}
AssertionError
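For reference, the failing setup can be reconstructed from the `self.net_dict` locals in the trace above (a sketch; the dict is truncated in the log, and the `'b'` entry is completed from the `self.param_map` locals, which show both `'W'` and `'b'` mapped to `FF_0`):

```python
# Reconstructed from the locals printed in the trace (not the full config):
# FF_1 tries to reuse both parameters of FF_0 inside the rec subnetwork.
net_dict = {
    'input': {'class': 'copy', 'from': ['prev:output', 'data:source']},
    'FF_0': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10},
    'FF_1': {'activation': 'tanh', 'class': 'linear', 'from': ['input'], 'n_out': 10,
             'reuse_params': {'map': {'W': {'reuse_layer': 'FF_0'},
                                      'b': {'reuse_layer': 'FF_0'}}}},
    # ... (remaining layers truncated in the log)
}
```

The assertion fires because the `FF_0` referenced by `ReuseParams` is an `InternalLayer` whose `params` dict is empty (`self.reuse_layer.params = {}`), i.e. the reuse logic does not see the parameters of the actual moved-out layer.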
@albertz you were right about: "From a first glance, maybe this is because the layer FF_0 is inside the loop and the reuse params logic when it accesses the layer tries to get it from outside." A possible solution for sharing in my case is to use the flag `optimize_move_layers_out=0`. But without the optimization, the calculation will be slower.
`optimize_move_layers_out=0` is much slower and consumes more memory, such that I would never recommend ever using it. Also not in your case. If you want a workaround, try the other workaround I suggested above (using a `custom` function).
Using a `custom` function that gets the variable via an absolute name scope works well for linear layers, but somehow it is not usable for `LSTMBlock`.
I use the following function, as you mentioned above:

```python
def get_var(name, shape):
    from returnn.tf.util.basic import reuse_name_scope
    from returnn.tf.compat import v1 as tf
    with reuse_name_scope('', absolute=True):
        # Fetch the variable once; a second tf.get_variable call with the
        # same name in this scope would raise a "variable already exists" error.
        var = tf.get_variable(name, shape)
        print('Reused variable:', var)
        return var
```
and I am trying to share the weights in the following way:

```python
's_shared': {'class': 'rnn_cell', 'from': ['prev:target_embed'], 'n_out': 1000, 'unit': 'LSTMBlock',
             'reuse_params': {'map': {
                 'lstm_cell/bias': {'custom': lambda **_kwargs: get_var('output/rec/s/rec/lstm_cell/bias', _kwargs['shape'])},
                 'lstm_cell/kernel': {'custom': lambda **_kwargs: get_var('output/rec/s/rec/lstm_cell/kernel', _kwargs['shape'])}}}},
```

where

```python
's': {'class': 'rnn_cell', 'from': ['prev:target_embed'], 'n_out': 1000, 'unit': 'LSTMBlock'},
```
For linear layers it works in this way:

```python
'readout_in': {'activation': None, 'class': 'linear', 'from': ['s', 'prev:target_embed', 'att'], 'n_out': 1000, 'with_bias': True},
'readout_in_shared': {'activation': None, 'class': 'linear', 'from': ['iLMT_s', 'prev:target_embed', 'zero_att'], 'n_out': 1000,
                      'reuse_params': {'map': {
                          'W': {'custom': lambda **_kwargs: get_var('output/rec/readout_in/W', _kwargs['shape'])},
                          'b': {'custom': lambda **_kwargs: get_var('output/rec/readout_in/b', _kwargs['shape'])}}},
                      'with_bias': True},
```
(Please properly format using Markdown. I fixed that for you.)
> but somehow it is not usable for `LSTMBlock`

What do you mean by that?
Btw, instead of using `LSTMBlock`, you should better use `NativeLstm2`. Also, instead of using `rnn_cell`, better use `rec`. I.e.:

```python
's': {'class': 'rec', 'from': 'prev:target_embed', 'n_out': 1000, 'unit': 'NativeLstm2'},
```

Maybe that already fixes your problems.
I should have mentioned that RETURNN couldn't find the parameters for sharing. But changing the class from `rnn_cell` to `rec` helped. And with this, the parameters to share are `'rnn/lstm_cell/bias'` and `'rnn/lstm_cell/kernel'`.
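Putting the pieces from this thread together, the shared layer would then look roughly like the sketch below. This is a hedged adaptation of the earlier `s_shared` config, not a verified RETURNN snippet: the parameter names `rnn/lstm_cell/bias`/`rnn/lstm_cell/kernel` are the ones reported above for `rec` + `LSTMBlock`, the absolute scope path `output/rec/s/rec/...` is an assumption that depends on where layer `s` lives in your network, and `get_var` here is a plain placeholder standing in for the TF-based helper from earlier in the thread:

```python
# Placeholder for the reuse_name_scope/tf.get_variable helper defined above;
# returns its arguments so the mapping can be inspected without TensorFlow.
def get_var(name, shape):
    return (name, shape)

# Hypothetical shared-LSTM layer after switching 'rnn_cell' -> 'rec':
s_shared = {
    'class': 'rec', 'from': ['prev:target_embed'], 'n_out': 1000, 'unit': 'LSTMBlock',
    'reuse_params': {'map': {
        'rnn/lstm_cell/bias': {
            'custom': lambda **kw: get_var('output/rec/s/rec/rnn/lstm_cell/bias', kw['shape'])},
        'rnn/lstm_cell/kernel': {
            'custom': lambda **kw: get_var('output/rec/s/rec/rnn/lstm_cell/kernel', kw['shape'])},
    }},
}
```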
> but somehow it is not usable for `LSTMBlock`

What is meant by that? Please be more specific. Not usable in what way? What happens? Do you get the error you described here, or something else?
> But changing the class from `rnn_cell` to `rec` helped.

What does that mean? Helped how? It works then? Or what? Please be more specific.
In any case, the original bug here is fixed now, via #695.
It would be nice if you could clarify the other things I asked about.
If there are other further problems, please open a new issue.
There is no problem when using the `reuse_params` layer flag with only one softmax layer with loss, like here, but I can't manage the initialization of layers in such a case: I need the second output `'output1'` to train a sub-decoder part. This is the config I'm using; the second decoder has the prefix `"iLMT_"`.
/work/asr3/zeineldeen/hiwis/glushko/setups-data/switchboard/2021-06-21--ilmt-att-sis/work/crnn/training/CRNNTrainingJob.ACreMrgOrckx/output/crnn.config