Closed ezerhouni closed 11 months ago
FYI, it seems that the pre-trained model is having the same issue:
```
terminate called after throwing an instance of 'std::runtime_error'
  what(): The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__.py", line 41, in forward
    encoder_states = torch.slice(states, None, -2)
    encoder = self.encoder
    _8 = (encoder).streaming_forward(x0, x_lens, encoder_states, src_key_padding_mask0, )
         ~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    encoder_out, encoder_out_lens, new_encoder_states, = _8
    encoder_out0 = torch.permute(encoder_out, [1, 0, 2])
  File "code/__torch__/zipformer.py", line 434, in streaming_forward
    _108 = torch.floordiv((left_context_frames)[0], ds)
    _109 = torch.slice(src_key_padding_mask, -1, None, None, ds)
    _110 = (_0).streaming_forward(x14, _107, _108, _109, )
           ~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    x15, new_layer_states, = _110
    layer_offset = torch.add(0, num_layers)
  File "code/__torch__/zipformer.py", line 563, in streaming_forward
    _1 = getattr(layers, "1")
    cached_key, cached_nonlin_attn, cached_val1, cached_val2, cached_conv1, cached_conv2, = torch.slice(states, 0, 6)
    _148 = (_0).streaming_forward(src, pos_emb, cached_key, cached_nonlin_attn, cached_val1, cached_val2, cached_conv1, cached_conv2, left_context_len, src_key_padding_mask, )
           ~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    output, new_cached_key, new_cached_nonlin_attn, new_cached_val1, new_cached_val2, new_cached_conv1, new_cached_conv2, = _148
    _149 = [new_cached_key, new_cached_nonlin_attn, new_cached_val1, new_cached_val2, new_cached_conv1, new_cached_conv2]
  File "code/__torch__/zipformer.py", line 794, in streaming_forward
    conv_module1 = self.conv_module1
    _198 = torch.slice(torch.slice(src_key_padding_mask), 1, left_context_len)
    _199 = (conv_module1).streaming_forward(src15, cached_conv1, _198, )
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    src_conv, cached_conv16, = _199
    src16 = torch.add(src15, src_conv)

Traceback of TorchScript, original code (most recent call last):
  File "./zipformer/export.py", line 289, in forward
        encoder_out_lens,
        new_encoder_states,
    ) = self.encoder.streaming_forward(
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        x=x,
        x_lens=x_lens,
  File "/ceph-zw/workspace/zipformer/icefall_zipformer/egs/librispeech/ASR/zipformer/zipformer.py", line 441, in streaming_forward
    x = convert_num_channels(x, self.encoder_dim[i])
    x, new_layer_states = module.streaming_forward(
                          ~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        x,
        states=states[layer_offset * 6 : (layer_offset + num_layers) * 6],
  File "/ceph-zw/workspace/zipformer/icefall_zipformer/egs/librispeech/ASR/zipformer/zipformer.py", line 1026, in streaming_forward
        new_cached_conv1,
        new_cached_conv2
    ) = mod.streaming_forward(
        ~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        output,
        pos_emb,
  File "/ceph-zw/workspace/zipformer/icefall_zipformer/egs/librispeech/ASR/zipformer/zipformer.py", line 856, in streaming_forward
    src = src + self_attn
    src_conv, cached_conv1 = self.conv_module1.streaming_forward(
                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        src,
        cache=cached_conv1,
RuntimeError: vector::_M_range_check: __n (which is 18446744073709551615) >= this->size() (which is 3)
```
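For context, the suspiciously huge index in the RuntimeError is just the bit pattern of -1: `std::vector` uses an unsigned 64-bit size type, so a negative index wraps around to 2**64 - 1 before the range check fires. A quick sanity check in plain Python (no icefall code involved):

```python
# std::vector::size_type is an unsigned 64-bit integer, so an index of -1
# wraps around to 2**64 - 1 when it reaches the C++ range check.
wrapped = (-1) & 0xFFFFFFFFFFFFFFFF  # reinterpret -1 as uint64
print(wrapped)            # 18446744073709551615
print(wrapped == 2**64 - 1)  # True
```

This is why the error points at a negative index (such as `dim=-1`) leaking into code that expects a non-negative value.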
Are you using the latest code?
I think it has been fixed in https://github.com/k2-fsa/icefall/pull/1131
Thank you! I will try it out and let you know. Also, I am seeing other places with chunk(.., dim=-1). Should we take care of them? If so, I can create a PR.
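If it helps, the usual workaround pattern is to normalize a possibly negative dim into an explicit non-negative index before it reaches TorchScript/C++ code. A minimal sketch (the helper name `normalize_dim` is made up for illustration, not necessarily how the linked PR does it):

```python
def normalize_dim(dim: int, ndim: int) -> int:
    """Map a possibly negative dimension (e.g. dim=-1) to the
    non-negative index that unsigned C++ range checks expect."""
    if dim < 0:
        dim += ndim
    if not 0 <= dim < ndim:
        raise IndexError(f"dim {dim} out of range for ndim {ndim}")
    return dim

# e.g. replace x.chunk(2, dim=-1) with
#              x.chunk(2, dim=normalize_dim(-1, x.dim()))
print(normalize_dim(-1, 3))  # 2
```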
Yes, please do it.
Hello!
We are trying to run a zipformer2 model trained on our data with icefall. However, we are getting the following error:
Any idea where it might come from? We are using PyTorch 2.0.0 (planning to upgrade to 2.0.1 or 2.1 to see if the bug is still there).
Thank you very much!