google-research / google-research

Google Research
https://research.google
Apache License 2.0

[kws_streaming] How to apply Delay layer in bc_resnet? #1862

Open Koowater opened 11 months ago

Koowater commented 11 months ago

I'm trying to apply the Delay layer in bc_resnet for streaming with 'same' padding, because I got an error when I used 'causal' padding with bc_resnet...

I'm wondering how to apply the Delay layer to bc_resnet. The Delay layer needs a delay_val, which is calculated from kernel_size and dilation. But in bc_resnet, kernel_size and dilation are tuples, not integers.

How can I apply the Delay layer in bc_resnet?

(Sorry, I have seen #1425, but I need a more definite answer...)
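
For tuple-valued kernel_size and dilation, one possible approach is to compute the delay from the time-axis entries only. The sketch below is a minimal illustration, not kws_streaming code: the helper name `time_delay`, the assumption that the time axis is the first tuple entry, and the formula `dilation * (kernel_size - 1) // 2` (the look-ahead of a 'same'-padded convolution) are all assumptions that should be checked against delay_test.py.

```python
def time_delay(kernel_size, dilation=1, time_axis=0):
    """Hypothetical helper: look-ahead (in frames) of a 'same'-padded conv along time.

    kernel_size and dilation may be ints or tuples; for tuples, only the
    entry at time_axis is used (assumed to be the time dimension).
    """
    def pick(value):
        if isinstance(value, (tuple, list)):
            return int(value[time_axis])
        return int(value)

    k = pick(kernel_size)
    d = pick(dilation)
    # The dilated kernel spans d * (k - 1) + 1 frames; with 'same' padding
    # roughly half of the span beyond the center frame lies in the future.
    return d * (k - 1) // 2


print(time_delay((3, 1), (2, 1)))  # 2
print(time_delay(5))               # 2
```

The resulting integer could then serve as the delay_val mentioned above for the branch that has to be aligned with the convolution output.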

rybakov commented 11 months ago

I fixed bc_resnet with 'causal' padding and added a test with 'bc_resnet_causal'.

You could apply the Delay layer to bc_resnet the same way it is done in delay_test.py, which has several examples combining conv with delay layers.

It may be easier to redesign it using the subclass API, as shown in the example.

Koowater commented 11 months ago

Thank you for answering my issue, @rybakov.

I'm trying to apply causal padding and the Delay layer to the residual and identity connections, but I get an error when I convert my model to tflite_streaming_model.

I'll try to find out where I'm going wrong, but I don't know how to fix it at the moment. Could you please review my errors?

WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 2
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 2
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 4
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 4
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 4
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 8
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 8
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 8
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 8
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 8
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 16
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 16
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 16
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 16
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 16
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(1, 16000)]              0         

 speech_features (SpeechFeat  (1, 98, 40)              0         
 ures)                                                           

 tf_op_layer_ExpandDims (Ten  [(1, 98, 40, 1)]         0         
 sorFlowOpLayer)                                                 

 stream (Stream)             (1, 98, 20, 16)           416       

 transition_block (Transitio  (1, 98, 20, 8)           464       
 nBlock)                                                         

 normal_block (NormalBlock)  (1, 98, 20, 8)            304       

 normal_block_1 (NormalBlock  (1, 98, 20, 8)           304       
 )                                                               

 transition_block_1 (Transit  (1, 98, 10, 12)          648       
 ionBlock)                                                       

 normal_block_2 (NormalBlock  (1, 98, 10, 12)          504       
 )                                                               

 normal_block_3 (NormalBlock  (1, 98, 10, 12)          504       
 )                                                               

 transition_block_2 (Transit  (1, 98, 5, 16)           992       
 ionBlock)                                                       

 normal_block_4 (NormalBlock  (1, 98, 5, 16)           736       
 )                                                               

 normal_block_5 (NormalBlock  (1, 98, 5, 16)           736       
 )                                                               

 normal_block_6 (NormalBlock  (1, 98, 5, 16)           736       
 )                                                               

 normal_block_7 (NormalBlock  (1, 98, 5, 16)           736       
 )                                                               

 transition_block_3 (Transit  (1, 98, 5, 20)           1400      
 ionBlock)                                                       

 normal_block_8 (NormalBlock  (1, 98, 5, 20)           1000      
 )                                                               

 normal_block_9 (NormalBlock  (1, 98, 5, 20)           1000      
 )                                                               

 normal_block_10 (NormalBloc  (1, 98, 5, 20)           1000      
 k)                                                              

 normal_block_11 (NormalBloc  (1, 98, 5, 20)           1000      
 k)                                                              

 stream_33 (Stream)          (1, 98, 5, 20)            520       

 tf_op_layer_Mean (TensorFlo  [(1, 98, 1, 20)]         0         
 wOpLayer)                                                       

 conv2d_21 (Conv2D)          (1, 98, 1, 32)            640       

 stream_34 (Stream)          (1, 1, 1, 32)             0         

 conv2d_22 (Conv2D)          (1, 1, 1, 12)             384       

 tf_op_layer_Squeeze (Tensor  [(1, 12)]                0         
 FlowOpLayer)                                                    

=================================================================
Total params: 14,024
Trainable params: 11,032
Non-trainable params: 2,992
_________________________________________________________________

WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 5
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 2
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 2
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 4
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 4
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 8
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 8
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 8
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 8
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 16
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 16
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 16
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 16
WARNING:absl:ring_buffer_size_in_time_dim overwritten by the passed-in value: 5
Traceback (most recent call last):
  File "decode.py", line 196, in <module>
    tflite_streaming_model = utils.model_to_tflite(sess, model_non_stream_batch, flags, Modes.STREAM_EXTERNAL_STATE_INFERENCE)
  File "/tf/kws/koowater/builder/kws_streaming/models/utils.py", line 386, in model_to_tflite
    model_stream = to_streaming_inference(model_non_stream, flags, mode)
  File "/tf/kws/koowater/builder/kws_streaming/models/utils.py", line 318, in to_streaming_inference
    model_inference = convert_to_inference_model(model_non_stream,
  File "/tf/kws/koowater/builder/kws_streaming/models/utils.py", line 249, in convert_to_inference_model
    new_model = _clone_model(model, input_tensors)
  File "/tf/kws/koowater/builder/kws_streaming/models/utils.py", line 109, in _clone_model
    functional.reconstruct_from_config(
  File "/usr/local/lib/python3.8/dist-packages/keras/engine/functional.py", line 1495, in reconstruct_from_config
    if process_node(layer, node_data):
  File "/usr/local/lib/python3.8/dist-packages/keras/engine/functional.py", line 1435, in process_node
    output_tensors = layer(input_tensors, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/keras/engine/base_layer_v1.py", line 838, in __call__
    outputs = call_fn(cast_inputs, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/impl/api.py", line 692, in wrapper
    raise e.ag_error_metadata.to_exception(e)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/impl/api.py", line 689, in wrapper
    return converted_call(f, args, kwargs, options=options)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/impl/api.py", line 439, in converted_call
    result = converted_f(*effective_args, **kwargs)
  File "/tmp/__autograph_generated_fileizg6by4c.py", line 105, in tf__call
    ag__.if_stmt((ag__.ld(self).mode == ag__.ld(modes).Modes.STREAM_INTERNAL_STATE_INFERENCE), if_body_4, else_body_4, get_state_4, set_state_4, ('do_return', 'retval_', 'self.output_state'), 3)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/operators/control_flow.py", line 1266, in if_stmt
    _py_if_stmt(cond, body, orelse)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/operators/control_flow.py", line 1319, in _py_if_stmt
    return body() if cond else orelse()
  File "/tmp/__autograph_generated_fileizg6by4c.py", line 103, in else_body_4
    ag__.if_stmt((ag__.ld(self).mode == ag__.ld(modes).Modes.STREAM_EXTERNAL_STATE_INFERENCE), if_body_3, else_body_3, get_state_3, set_state_3, ('do_return', 'retval_', 'self.output_state'), 3)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/operators/control_flow.py", line 1266, in if_stmt
    _py_if_stmt(cond, body, orelse)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/operators/control_flow.py", line 1319, in _py_if_stmt
    return body() if cond else orelse()
  File "/tmp/__autograph_generated_fileizg6by4c.py", line 71, in if_body_3
    ag__.if_stmt(ag__.ld(self).ring_buffer_size_in_time_dim, if_body_1, else_body_1, get_state_1, set_state_1, ('output', 'self.output_state'), 2)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/operators/control_flow.py", line 1266, in if_stmt
    _py_if_stmt(cond, body, orelse)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/operators/control_flow.py", line 1319, in _py_if_stmt
    return body() if cond else orelse()
  File "/tmp/__autograph_generated_fileizg6by4c.py", line 65, in if_body_1
    (output, ag__.ld(self).output_state) = ag__.converted_call(ag__.ld(self)._streaming_external_state, (ag__.ld(inputs), ag__.ld(self).input_state), None, fscope)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/impl/api.py", line 441, in converted_call
    result = converted_f(*effective_args)
  File "/tmp/__autograph_generated_file_lmun91g.py", line 166, in tf___streaming_external_state
    ag__.if_stmt(ag__.converted_call(ag__.ld(isinstance), (ag__.converted_call(ag__.ld(self).get_core_layer, (), None, fscope), ag__.ld(tf).keras.layers.Conv2DTranspose), None, fscope), if_body_6, else_body_6, get_state_6, set_state_6, ('do_return', 'retval_'), 2)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/operators/control_flow.py", line 1266, in if_stmt
    _py_if_stmt(cond, body, orelse)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/operators/control_flow.py", line 1319, in _py_if_stmt
    return body() if cond else orelse()
  File "/tmp/__autograph_generated_file_lmun91g.py", line 157, in else_body_6
    ag__.if_stmt(ag__.ld(self).use_one_step, if_body_5, else_body_5, get_state_5, set_state_5, ('do_return', 'retval_'), 2)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/operators/control_flow.py", line 1266, in if_stmt
    _py_if_stmt(cond, body, orelse)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/operators/control_flow.py", line 1319, in _py_if_stmt
    return body() if cond else orelse()
  File "/tmp/__autograph_generated_file_lmun91g.py", line 134, in if_body_5
    memory = ag__.converted_call(ag__.ld(tf).keras.backend.concatenate, ([ag__.ld(memory), ag__.ld(inputs)], 1), None, fscope)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/impl/api.py", line 331, in converted_call
    return _call_unconverted(f, args, kwargs, options, False)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/autograph/impl/api.py", line 459, in _call_unconverted
    return f(*args)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/usr/local/lib/python3.8/dist-packages/keras/backend.py", line 3581, in concatenate
    return tf.concat([to_dense(x) for x in tensors], axis)
ValueError: in user code:

    File "/tf/kws/koowater/builder/kws_streaming/layers/stream.py", line 411, in call  *
        output, self.output_state = self._streaming_external_state(
    File "/tf/kws/koowater/builder/kws_streaming/layers/stream.py", line 563, in _streaming_external_state  *
        memory = tf.keras.backend.concatenate([memory, inputs], 1)
    File "/usr/local/lib/python3.8/dist-packages/keras/backend.py", line 3581, in concatenate
        return tf.concat([to_dense(x) for x in tensors], axis)

    ValueError: Dimension 0 in both shapes must be equal, but are 40 and 43. Shapes are [40,1] and [43,1]. for '{{node streaming/stream/concat}} = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32](streaming/stream/strided_slice, streaming/stream/Pad, streaming/stream/concat/axis)' with input shapes: [1,4,40,1], [1,1,43,1], [] and with computed input tensors: input[2] = <1>.

Koowater commented 11 months ago

I used Modes.STREAM_EXTERNAL_STATE_INFERENCE and found that the error is raised in the first Conv2D layer after expand_dim. The frequency_pad in the Stream layer causes a frequency dimension mismatch. In training, a mean layer is applied over the frequency dimension, so the mismatch does not matter, but in STREAM_EXTERNAL_STATE_INFERENCE it raises an error. This may be my fault; please review it.

[+] I changed the inference mode to STREAM_INTERNAL_STATE_INFERENCE and the conversion works. But I can't be sure my model works well, because the streaming model's output and the non-streaming model's output are different...
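
The ValueError in the traceback comes from concatenating `memory` with `inputs` along axis 1 (time): every other dimension must match, but the frequency dimension is 40 in one tensor and 43 in the other. The sketch below reproduces that check in plain Python. The helper names are hypothetical, and the kernel size 5 and stride 2 used to arrive at 43 are guesses inferred from the model summary (416 parameters, 40 to 20 frequency bins), not confirmed values.

```python
import math


def same_pad_total(size, kernel, stride):
    """Total padding that TensorFlow-style 'same' padding adds along one axis."""
    out = math.ceil(size / stride)
    return max((out - 1) * stride + kernel - size, 0)


def concat_shape(shapes, axis):
    """Shape of a concatenation; raises ValueError if a non-axis dim differs."""
    first = list(shapes[0])
    total = 0
    for shape in shapes:
        if len(shape) != len(first):
            raise ValueError("all inputs must have the same rank")
        for i, (a, b) in enumerate(zip(first, shape)):
            if i != axis and a != b:
                raise ValueError(f"Dimension {i} must be equal, but are {a} and {b}")
        total += shape[axis]
    first[axis] = total
    return tuple(first)


memory = (1, 4, 40, 1)  # state buffer, frequency dimension not padded
# Assumed kernel=5, stride=2 along frequency gives 3 extra bins: 40 + 3 = 43.
inputs = (1, 1, 40 + same_pad_total(40, kernel=5, stride=2), 1)

try:
    concat_shape([memory, inputs], axis=1)
except ValueError as err:
    print(err)  # Dimension 2 must be equal, but are 40 and 43

# With matching frequency dimensions the concatenation is valid.
print(concat_shape([memory, (1, 1, 40, 1)], axis=1))  # (1, 5, 40, 1)
```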

rybakov commented 11 months ago

Please confirm that you pulled the latest version of kws_streaming.

Koowater commented 11 months ago

@rybakov Sorry, and thank you so much! I pulled the latest version of kws_streaming and bc_resnet was converted to streaming inference successfully.

But I have one last question: the model's predictions differ between streaming inference and non-streaming inference. I'd like to know what input data_shape bc_resnet expects in streaming inference. I used data_shape = (160,).

Here's my output.

*** Here is streaming inference ***
[silence]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[silence]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[silence]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[silence]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[silence]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
{1: '_unknown_', 2: 'yes', 4: 'up', 10: 'stop', 3: 'no', 11: 'go', 5: 'down', 9: 'off', 8: 'on', 6: 'left', 7: 'right', 0: '_silence_'}
../datasets/data2/yes/1b88bf70_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]
../datasets/data2/yes/05b2db80_nohash_1.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 0, 0, 0, 0]
../datasets/data2/yes/b66f4f93_nohash_5.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]
../datasets/data2/yes/750e3e75_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 11, 11, 11, 11, 11, 11, 11, 11, 11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]
../datasets/data2/yes/e49428d9_nohash_3.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]
../datasets/data2/yes/af7a8296_nohash_3.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9]
../datasets/data2/yes/778a4a01_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 6, 6, 6, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]
../datasets/data2/yes/b00dff7e_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]
../datasets/data2/yes/e77d88fc_nohash_1.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[6, 6, 6, 6, 6, 6, 6, 6, 6, 9, 9, 9, 9, 9, 9, 9, 9, 9, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
../datasets/data2/yes/0cb74144_nohash_2.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 5, 5, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]
../datasets/data2/yes/3d794813_nohash_4.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]
../datasets/data2/yes/11321027_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]
../datasets/data2/yes/06f6c194_nohash_4.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 0, 0, 0, 11, 11, 11, 11, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
../datasets/data2/yes/b97c9f77_nohash_1.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]
../datasets/data2/yes/7213ed54_nohash_4.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]
../datasets/data2/yes/c50f55b8_nohash_5.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 11, 11, 11, 11, 11, 11, 2, 2, 2, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]
../datasets/data2/yes/09bcdc9d_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 6, 6, 6, 6, 6, 6, 6, 6, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
../datasets/data2/yes/cae62f38_nohash_1.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
../datasets/data2/yes/1942abd7_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 11, 11, 11, 11, 11, 11, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
../datasets/data2/yes/321aba74_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 11, 11, 11, 11, 11, 11, 0, 0, 0, 0, 0, 0, 0, 11, 11, 11, 11, 2, 2, 2, 2, 2, 2, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]
../datasets/data2/no/b66f4f93_nohash_5.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]
../datasets/data2/no/66cbe2b3_nohash_2.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 9, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 4, 4, 4, 4, 4, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]
../datasets/data2/no/750e3e75_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 4, 4, 4, 4, 4, 4, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 11, 11, 4, 4, 4, 4, 11, 11, 11, 11, 11, 11, 11, 11, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]
../datasets/data2/no/3852fca2_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 11, 11, 11, 11, 11, 11, 11, 11, 6, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]
../datasets/data2/no/e49428d9_nohash_3.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 4, 4, 4, 4, 4, 4, 4, 4, 11, 11, 11, 11, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
../datasets/data2/no/778a4a01_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]
../datasets/data2/no/b00dff7e_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 6, 6, 6, 6, 6, 6, 4, 4, 4, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
../datasets/data2/no/61e50f62_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
../datasets/data2/no/e77d88fc_nohash_1.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 9, 9, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]
../datasets/data2/no/0cb74144_nohash_2.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 11, 11, 11, 11, 11, 11, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
../datasets/data2/no/17c94b23_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
../datasets/data2/no/3d794813_nohash_4.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9]
../datasets/data2/no/e55a2b20_nohash_1.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9]
../datasets/data2/no/06f6c194_nohash_4.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9]
../datasets/data2/no/b97c9f77_nohash_1.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]
../datasets/data2/no/7213ed54_nohash_4.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]
../datasets/data2/no/c50f55b8_nohash_5.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]
../datasets/data2/no/09bcdc9d_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 4, 4, 4, 4, 4, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 9, 9, 9, 9, 9, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
../datasets/data2/no/1942abd7_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
../datasets/data2/no/321aba74_nohash_0.wav
[input_data.shape] (1, 16000) [out_tflite.shape] (100, 12)
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 11, 11, 11, 11, 11, 11, 11, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]
*** Here is non streaming inference ***
yes
../datasets/data2/yes/05b2db80_nohash_1.wav
yes
../datasets/data2/yes/b66f4f93_nohash_5.wav
yes
../datasets/data2/yes/750e3e75_nohash_0.wav
yes
../datasets/data2/yes/e49428d9_nohash_3.wav
yes
../datasets/data2/yes/af7a8296_nohash_3.wav
yes
../datasets/data2/yes/778a4a01_nohash_0.wav
yes
../datasets/data2/yes/b00dff7e_nohash_0.wav
yes
../datasets/data2/yes/e77d88fc_nohash_1.wav
yes
../datasets/data2/yes/0cb74144_nohash_2.wav
yes
../datasets/data2/yes/3d794813_nohash_4.wav
yes
../datasets/data2/yes/11321027_nohash_0.wav
yes
../datasets/data2/yes/06f6c194_nohash_4.wav
yes
../datasets/data2/yes/b97c9f77_nohash_1.wav
yes
../datasets/data2/yes/7213ed54_nohash_4.wav
yes
../datasets/data2/yes/c50f55b8_nohash_5.wav
yes
../datasets/data2/yes/09bcdc9d_nohash_0.wav
yes
../datasets/data2/yes/cae62f38_nohash_1.wav
yes
../datasets/data2/yes/1942abd7_nohash_0.wav
yes
../datasets/data2/yes/321aba74_nohash_0.wav
yes
../datasets/data2/no/b66f4f93_nohash_5.wav
no
../datasets/data2/no/66cbe2b3_nohash_2.wav
no
../datasets/data2/no/750e3e75_nohash_0.wav
no
../datasets/data2/no/3852fca2_nohash_0.wav
no
../datasets/data2/no/e49428d9_nohash_3.wav
no
../datasets/data2/no/778a4a01_nohash_0.wav
no
../datasets/data2/no/b00dff7e_nohash_0.wav
no
../datasets/data2/no/61e50f62_nohash_0.wav
no
../datasets/data2/no/e77d88fc_nohash_1.wav
no
../datasets/data2/no/0cb74144_nohash_2.wav
no
../datasets/data2/no/17c94b23_nohash_0.wav
no
../datasets/data2/no/3d794813_nohash_4.wav
no
../datasets/data2/no/e55a2b20_nohash_1.wav
no
../datasets/data2/no/06f6c194_nohash_4.wav
no
../datasets/data2/no/b97c9f77_nohash_1.wav
no
../datasets/data2/no/7213ed54_nohash_4.wav
no
../datasets/data2/no/c50f55b8_nohash_5.wav
no
../datasets/data2/no/09bcdc9d_nohash_0.wav
no
../datasets/data2/no/1942abd7_nohash_0.wav
no
../datasets/data2/no/321aba74_nohash_0.wav
no

krilina988 commented 4 months ago

Hello, how did you modify bc_resnet to enable streaming inference? I followed delay_test.py, but my modification always fails, and I can't tell what went wrong. @Koowater @rybakov
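
On the tuple question raised at the top of the thread, one possible way to get the single integer that the Delay layer expects is to use only the time-axis entries of `kernel_size` and `dilation`. The sketch below is an illustration under stated assumptions, not code from kws_streaming: `time_delay` is a hypothetical helper, the time axis is assumed to be index 0 of each tuple, and the convolution is assumed to use 'same' padding with an even `(kernel - 1) * dilation` span, so the look-ahead to compensate on the residual or identity branch is half of that span.

```python
def time_delay(kernel_size, dilation, time_axis=0):
    """Hypothetical helper: look-ahead (in frames) of a 'same'-padded conv along time."""
    # Accept plain ints as well as (time, frequency) tuples.
    k = kernel_size[time_axis] if isinstance(kernel_size, (tuple, list)) else kernel_size
    d = dilation[time_axis] if isinstance(dilation, (tuple, list)) else dilation

    # Number of extra frames the kernel covers beyond the current one.
    span = (k - 1) * d
    if span % 2:
        # 'same' padding is asymmetric here, so no single symmetric delay exists.
        raise ValueError(f"(kernel - 1) * dilation = {span} is odd; delay is ambiguous")

    # Half of the span lies in the future; that is what the other branch must wait for.
    return span // 2


# Example shapes only; these are not taken from the bc_resnet configuration.
for kernel_size, dilation in [((3, 1), (1, 1)), ((3, 1), (2, 1)),
                              ((3, 1), (4, 1)), ((1, 3), (1, 1))]:
    print(kernel_size, dilation, "->", time_delay(kernel_size, dilation))
```

The returned integer would be the candidate value for the Delay layer on the branch that bypasses the convolution. A kernel that spans only the frequency axis, such as `(1, 3)`, yields 0 and needs no delay. Whether this alone makes the streaming conversion succeed depends on the rest of the model.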