Closed phecda-xu closed 4 years ago
ok,
------------------ Original Message ------------------
From: "Z-yq/TensorflowASR" <notifications@github.com>
Date: Monday, October 5, 2020, 2:55 PM
To: "Z-yq/TensorflowASR" <TensorflowASR@noreply.github.com>
Cc: "Subscribed" <subscribed@noreply.github.com>
Subject: [Z-yq/TensorflowASR] Random "generator" data-loading error and how to handle it (#5)
Hello,
I have a small question.
I was training on CPU under Linux 16.04, using data from a few speakers in aishell_1 (2,100 audio clips, just to verify the code), training ConformerTransducer with all other parameters at their defaults.
```
2020-10-05 10:28:11,241 - root - INFO - trainer resume failed
[Train] [Epoch 1/2] |          | 7/2096 [00:36<2:07:57, 3.68s/batch, transducer_loss=373.089]
WARNING:tensorflow:5 out of the last 6 calls to <function MultiHeadAttention.call at 0x7f911405de60> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings is likely due to passing python objects instead of tensors. Also, tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. Please refer to https://www.tensorflow.org/tutorials/customization/performance#python_or_tensor_args and https://www.tensorflow.org/api_docs/python/tf/function for more details.
[... the same retracing warning repeats for other MultiHeadAttention instances ...]
[Train] [Epoch 1/2] |████▊     | 500/2096 [09:30<26:07, 1.02batch/s, Successfully Saved Checkpoint]
...
[Train] [Epoch 1/2] |█████▏    | 547/2096 [10:39<23:14, 1.11batch/s, transducer_loss=85.972]
...
ValueError: generator yielded an element of shape (0,) where an element of shape (None, None, 80, 1) was expected.
```
The error appeared at step 547, but it does not always occur at a fixed step; it shows up at random steps.
After looking into the internal data-loading process, I found that the data-processing script applies the following filter:
```python
if len(data) < 400:
    continue
elif len(data) > self.speech_featurizer.sample_rate * 7:
    continue
```
In other words, audio (at a 16 kHz sample rate) shorter than 25 ms or longer than 7 s is discarded. When every clip in a batch is longer than 7 s, the whole batch is discarded and the generator yields an empty element, which causes the error above.
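To double-check what those thresholds mean (a quick sanity computation, not code from the repo): at 16 kHz, 400 samples is exactly 25 ms, and `sample_rate * 7` is 7 s of audio:

```python
sample_rate = 16_000           # aishell_1 audio is sampled at 16 kHz

min_samples = 400              # lower bound in the filter
max_samples = sample_rate * 7  # upper bound in the filter

print(min_samples / sample_rate * 1000)  # 25.0 -> clips shorter than 25 ms are dropped
print(max_samples / sample_rate)         # 7.0  -> clips longer than 7 s are dropped
```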
The workaround is simple: just increase the 7 to a larger number.
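Besides raising the threshold, a more robust fix is to make the generator skip a batch entirely when the filter drops every clip, so it never yields an empty array. This is only a sketch under assumed names (`batch_generator`, `batches`); it is not the repo's actual generator code:

```python
import numpy as np

def batch_generator(batches, sample_rate=16_000, max_seconds=7):
    """Yield only non-empty batches after length filtering (illustrative sketch)."""
    for batch in batches:
        kept = [clip for clip in batch
                if 400 <= len(clip) <= sample_rate * max_seconds]
        if not kept:
            # Every clip in this batch was filtered out:
            # skip it instead of yielding a shape-(0,) element.
            continue
        yield kept

# Tiny demo: the second batch contains only >7 s clips and is skipped.
short = np.zeros(16_000)      # 1 s clip -> kept
long = np.zeros(16_000 * 8)   # 8 s clip -> dropped
out = list(batch_generator([[short, long], [long, long]]))
print(len(out), len(out[0]))  # 1 1
```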
Now the question: I can understand dropping clips shorter than 25 ms, but why also drop clips longer than 7 s? Is it because clips over 7 s degrade recognition quality?
When you processed the AISHELL2 dataset, did you discard all audio longer than 7 s?
Also, what is the `tensorflow - WARNING` output about? I don't understand it.
Dropping clips longer than 7 seconds is to prevent GPU OOM.
Understood, thanks!