Low speech-to-text recognition accuracy

MentosL commented 1 year ago

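Environment: Windows 10 Pro

Problem: After setting up the environment, I ran the demo with the samples shipped in the example directory and with my own recordings:

Example:

With my own recordings the result is the same: only the first sound is recognized, and nothing after it.

Goal: I'd like to ask the experts here. The transcription accuracy is clearly a problem right now; can it be improved in a later release?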

minskiter commented 1 year ago

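The model did not load successfully. I see the same behavior on macOS, with a warning that the model cannot be opened: DATA_LOSS. I have never used TensorFlow, so I don't know the cause. Could it be that it defaults to the GPU while only the CPU is available here?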

2023-06-17 17:57:08.900101: W tensorflow/core/util/tensor_slice_reader.cc:97] Could not open /Users/minskiter/miniconda3/envs/voice/lib/python3.8/site-packages/parrots/data/speech_model/speech_recognition.model: DATA_LOSS: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
2023-06-17 17:57:08.919118: W tensorflow/core/util/tensor_slice_reader.cc:97] Could not open /Users/minskiter/miniconda3/envs/voice/lib/python3.8/site-packages/parrots/data/speech_model/speech_recognition.model.base: DATA_LOSS: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
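The "not an sstable (bad magic number)" wording suggests TensorFlow tried to read the file as a checkpoint and did not recognize its header. A minimal sketch for checking what the file actually is, assuming (this is an assumption, not something the log confirms) that the weights might have been saved in HDF5 format. The helper name `looks_like_hdf5` is made up for illustration; the 8-byte signature is the standard HDF5 one.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # standard HDF5 signature, normally at offset 0


def looks_like_hdf5(path) -> bool:
    """Return True if the file starts with the HDF5 signature."""
    with open(path, "rb") as fh:
        return fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

Calling `looks_like_hdf5(...)` on the `speech_recognition.model` path from the warning would show whether the file is HDF5 under a non-HDF5 extension. A `False` result would point elsewhere, for example to a truncated download.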

AlucardNosferatu commented 1 year ago

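Renaming the .model file to .h5, and the .base one to .base.h5, lets the model load normally (at least without errors), but the result is the same: only the first two pinyin syllables are recognized.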
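A rough sketch of the rename described above, done as a copy so the original files stay in place. The helper names `h5_name` and `copy_as_h5` are hypothetical, and the name mapping is one literal reading of the comment. As far as can be told, Keras `load_weights` in TensorFlow 2 chooses between HDF5 and checkpoint format by file extension, which would explain why the new names load; treat that as an assumption. The thread also does not say how the loading code was pointed at the new names, so that step is left out.

```python
import shutil
from pathlib import Path


def h5_name(path: Path) -> Path:
    """Map a weight file name to an .h5 name: '*.model' -> '*.h5', '*.base' -> '*.base.h5'."""
    if path.name.endswith(".model"):
        return path.with_suffix(".h5")
    if path.name.endswith(".base"):
        return path.with_name(path.name + ".h5")
    raise ValueError(f"unexpected weight file name: {path.name}")


def copy_as_h5(path) -> Path:
    """Copy the file next to itself under the .h5 name, keeping the original."""
    src = Path(path)
    dst = h5_name(src)
    shutil.copy2(src, dst)
    return dst
```

With the two files from the warning, this would produce `speech_recognition.h5` and `speech_recognition.model.base.h5` in the same directory.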

AlucardNosferatu commented 1 year ago

The author's online demo gives the same result, so it is at least using the same model as the pip package ( https://www.mulanai.com/product/asr/#trial ).

AlucardNosferatu commented 1 year ago

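Looking at predict, the data copied into x_in fills only a small part of it (349/1600), and the rest is all zeros. Also, predict uses base_model rather than _model. I'm not sure whether either of these is related.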
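To make the 349/1600 observation concrete, here is a small sketch of zero-padding a variable-length input into a fixed-size buffer and measuring how much of it carries real data. It is not the library's `predict` code; `pad_frames` and the feature width of 4 are invented for illustration, and only the 349 and 1600 figures come from the comment. Zero-padding to a fixed length is common for fixed-shape model inputs, so on its own it may not explain the truncated output.

```python
from typing import List, Sequence, Tuple


def pad_frames(frames: Sequence[Sequence[float]], max_len: int) -> Tuple[List[List[float]], float]:
    """Zero-pad feature frames to max_len rows and return (padded, fill_ratio)."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if len(frames) > max_len:
        raise ValueError("input is longer than the fixed buffer")
    width = len(frames[0]) if frames else 0
    padded = [list(row) for row in frames]
    padded += [[0.0] * width for _ in range(max_len - len(frames))]
    return padded, len(frames) / max_len


# 349 real frames in a 1600-row buffer; the feature width (4) is made up.
frames = [[1.0] * 4 for _ in range(349)]
x_in, ratio = pad_frames(frames, 1600)
print(len(x_in), f"{ratio:.1%}")  # prints: 1600 21.8%
```

Printing the same ratio for a longer recording would show whether the real data ever fills more of the buffer, or whether the input is being cut short before it reaches the model.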

shibing624 / parrots

Low speech-to-text recognition accuracy #21