Using the speech recognition feature

ShowUNow commented 1 year ago

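Pre-submission checklist

The network can reach the OpenAI API (#351)
Python is installed (version between 3.7 and 3.10) and the dependencies are installed
No similar problem was found in existing issues
No similar problem is listed in the FAQs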

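Problem description

I couldn't find the "voice_reply_voice": true setting in config-template.json, but I assume it should be added to chatgpt-on-wechat/config.json?...

The key for the speech synthesis platform is from Alibaba Cloud.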
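The flag asked about above is a top-level key in config.json. A minimal sketch of adding it with Python's json module, assuming config.json is a flat JSON object. The "model" key in the sample input is only a placeholder, and "speech_recognition" is the companion flag that appears later in this thread:

```python
import json

# Keys named in this thread; the values are illustrative assumptions.
VOICE_SETTINGS = {
    "speech_recognition": True,  # transcribe incoming voice messages
    "voice_reply_voice": True,   # answer voice messages with voice
}


def merge_voice_settings(config_text: str) -> str:
    """Return config JSON text with the voice flags added or overwritten."""
    config = json.loads(config_text)
    config.update(VOICE_SETTINGS)
    return json.dumps(config, ensure_ascii=False, indent=2)


# Placeholder input standing in for the contents of config.json.
print(merge_voice_settings('{"model": "gpt-3.5-turbo"}'))
```

Existing keys are left untouched, so the same helper can be run against a real config.json once its text has been read from disk.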

Terminal log (if there is an error)

Environment

Operating system (Mac/Windows/Linux): Windows 10
Python version (run python3 -V): python 3.9
pip version (required for dependency problems, run pip3 -V): pip 23.0.1

qcoltma commented 1 year ago

I configured baidu as the setup instructions require, so why does it still call google?

[INFO][2023-04-05 08:55:12][bridge.py:28] - create bot google for text_to_voice
[ERROR][2023-04-05 08:55:12][chat_channel.py:233] - Worker return exception: No module named 'speech_recognition'
Traceback (most recent call last):
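The log shows the worker failing on a missing import rather than on the engine choice itself. A small sketch for checking which modules the current interpreter cannot find. The module list is an assumption taken from the errors reported in this thread:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names taken from error messages in this thread.
print(missing_modules(["speech_recognition", "pysilk"]))
```

An empty list only means the packages can be located. A package that is found but fails while importing, as in the pysilk traceback later in this thread, would not be caught by this check.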

Xuj69 commented 1 year ago

Open bridge.py in the bridge folder and change "text_to_voice": conf().get("text_to_voice", "Google") to "text_to_voice": conf().get("text_to_voice", "baidu").
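The second argument to get() is only a fallback. Assuming conf() behaves like a dict, the default is used only when config.json has no text_to_voice entry, so setting that key in config.json may have the same effect without editing bridge.py. The helper name pick_engine is hypothetical:

```python
def pick_engine(config: dict, key: str, default: str) -> str:
    """Mimic conf().get(key, default): the default applies only if the key is absent."""
    return config.get(key, default)


print(pick_engine({}, "text_to_voice", "google"))                          # google
print(pick_engine({"text_to_voice": "baidu"}, "text_to_voice", "google"))  # baidu
```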

qcoltma commented 1 year ago

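Since calls to openai are too slow, can speech-to-text be handled by baidu while replies stay as text? (I don't want voice-in, voice-out enabled at the same time.) How do I set this up?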

lanvent commented 1 year ago

See the optional configuration items.

qcoltma commented 1 year ago

See the optional configuration items.

"voice_to_text": "openai", # speech recognition engine, supports openai,google,azure
It doesn't seem to support baidu :(

lanvent commented 1 year ago

I forgot to update that; Baidu is supported.

qcoltma commented 1 year ago

I forgot to update that; Baidu is supported.

After configuring baidu, I get the following error:

[ERROR][2023-04-11 20:26:34][chat_channel.py:237] - Worker return exception: file does not start with RIFF id
Traceback (most recent call last):
  File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/qcolt/chatgpt-on-wechat/channel/chat_channel.py", line 128, in _handle
    reply = self._generate_reply(context)
  File "/home/qcolt/chatgpt-on-wechat/channel/chat_channel.py", line 138, in _generate_reply
    e_context = PluginManager().emit_event(EventContext(Event.ON_HANDLE_CONTEXT, {
  File "/home/qcolt/chatgpt-on-wechat/plugins/plugin_manager.py", line 159, in emit_event
    instance.handlers[e_context.event](e_context, *args, **kwargs)
  File "/home/qcolt/chatgpt-on-wechat/plugins/skills/skills.py", line 56, in on_handle_context
    reply = e_context['channel'].build_voice_to_text(file_name)
  File "/home/qcolt/chatgpt-on-wechat/channel/channel.py", line 38, in build_voice_to_text
    return Bridge().fetch_voice_to_text(voice_file)
  File "/home/qcolt/chatgpt-on-wechat/bridge/bridge.py", line 46, in fetch_voice_to_text
    return self.get_bot("voice_to_text").voiceToText(voiceFile)
  File "/home/qcolt/chatgpt-on-wechat/voice/baidu/baidu_voice.py", line 65, in voiceToText
    pcm = get_pcm_from_wav(voice_file)
  File "/home/qcolt/chatgpt-on-wechat/voice/audio_convert.py", line 29, in get_pcm_from_wav
    wav = wave.open(wav_path, "rb")
  File "/usr/lib/python3.8/wave.py", line 510, in open
    return Wave_read(f)
  File "/usr/lib/python3.8/wave.py", line 164, in __init__
    self.initfp(f)
  File "/usr/lib/python3.8/wave.py", line 131, in initfp
    raise Error('file does not start with RIFF id')
wave.Error: file does not start with RIFF id

lanvent commented 1 year ago

You probably don't have ffmpeg installed.
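The traceback comes from Python's wave module, which rejects any file whose first four bytes are not RIFF. A quick way to check whether a file handed to it is really a WAV container, for instance when a voice message was saved under a .wav name but never converted. The sample bytes below are made up for illustration:

```python
import os
import tempfile


def looks_like_wav(path: str) -> bool:
    """Return True if the file begins with a RIFF/WAVE header."""
    with open(path, "rb") as f:
        header = f.read(12)
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"


with tempfile.TemporaryDirectory() as tmp:
    fake = os.path.join(tmp, "voice.wav")
    with open(fake, "wb") as f:
        f.write(b"#!SILK_V3 not really a wav")  # wrong magic bytes despite the name
    print(looks_like_wav(fake))  # False
```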

qcoltma commented 1 year ago

You probably don't have ffmpeg installed.

That was it. It works now, thanks.

qcoltma commented 1 year ago

You probably don't have ffmpeg installed.

Strange: on another machine I still get this error after installing ffmpeg. Could there be another cause?
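One thing worth ruling out, as an assumption rather than a diagnosis: ffmpeg may be installed but not on the PATH seen by the Python process that runs the bot. A sketch of that check with the standard library:

```python
import shutil


def find_tool(name: str = "ffmpeg"):
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


path = find_tool()
print(path if path else "ffmpeg not found on PATH for this Python process")
```

Running this from the same environment that starts the bot (same user, same service or container) gives the most meaningful answer.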

Winston-H commented 1 year ago

I've set everything to use the Baidu API, so why do I still need to install ffmpeg?

Super-Wong commented 1 year ago

[Google] textToVoice text=“你好嗎？”的英文翻译是 "How are you?"。 voice file name=tmp/reply-1682526485-1578620872.mp3
[ERROR][2023-04-26 19:28:06][wechatmp_channel.py:98] - [wechatmp] upload voice failed: Error code: 48001, message: api unauthorized rid: 64495116-496f1ca5-6817ed93

Could someone tell me what this problem is?

lanvent commented 1 year ago

Is this the situation described in #948?

Super-Wong commented 1 year ago

Is this the situation described in #948?

That's probably it. I'll try getting the official account verified.

caizhenjia commented 1 year ago

How do I regenerate the QR code?

andyzlys commented 1 year ago

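I deployed this project on Google Cloud Run to connect to a WeCom self-built app, and text messages are sent and received fine. But after I set "speech_recognition": true and send a voice message, which according to the docs should be recognized by the default openai engine and answered with text, the reply I get is: [ERROR] Invalid file format. Supported formats: ['m4a', 'mp3', 'webm', 'mp4', 'mpga', 'wav', 'mpeg']

How can I solve this? Thanks a lot.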
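The error message lists the formats the openai engine accepts. A sketch that checks a file name against that list before uploading. The assumption here, based on the reply below about the AMR codec, is that WeCom delivers voice as .amr:

```python
from pathlib import Path

# Formats copied from the error message above.
SUPPORTED = {"m4a", "mp3", "webm", "mp4", "mpga", "wav", "mpeg"}


def is_supported_audio(file_name: str) -> bool:
    """Return True if the file extension is in the supported list."""
    return Path(file_name).suffix.lower().lstrip(".") in SUPPORTED


print(is_supported_audio("voice.amr"))  # False
print(is_supported_audio("voice.wav"))  # True
```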

lanvent commented 1 year ago

WeCom app accounts need ffmpeg, with the AMR codec installed.
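A sketch of what such a conversion could look like from Python, assuming ffmpeg is on PATH and was built with AMR decoding. The 16000 Hz mono target and the function names are assumptions for illustration, not the project's own code:

```python
import subprocess


def build_ffmpeg_cmd(src: str, dst: str, rate: int = 16000) -> list:
    """Build an ffmpeg command that converts src to a mono file at the given rate."""
    # -y overwrites dst, -i names the input, -ar sets the sample rate, -ac 1 forces mono
    return ["ffmpeg", "-y", "-i", src, "-ar", str(rate), "-ac", "1", dst]


def convert_to_wav(src: str, dst: str) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True)


print(" ".join(build_ffmpeg_cmd("voice.amr", "voice.wav")))
```

Only the command is printed here; convert_to_wav is what would actually invoke ffmpeg on a real file.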

lsCoding666 commented 1 year ago

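Speech synthesis fails. I printed the text being synthesized and it looks fine. The error occurs when Baidu is asked to synthesize it, with error code 513. Baidu's technical documentation doesn't list this error code.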

[INFO][2023-05-06 21:48:43][wechaty_channel.py:129] - [WX] receiveMsg=ChatMessage: id=3431040645055321713, create_time=1683380918, ctype=VOICE, content=tmp/message-3431040645055321713-audio.slk, from_user_id=wxid_k10fvreqzcbf22, from_user_nickname=梁爽, to_user_id=18725660724@chatroom, to_user_nickname=账号群, other_user_id=18725660724@chatroom, other_user_nickname=账号群, is_group=True, is_at=False, actual_user_id=wxid_k10fvreqzcbf22, actual_user_nickname=梁爽, context=Context(type=VOICE, content=tmp/message-3431040645055321713-audio.slk, kwargs={'isgroup': True, 'msg': <channel.wechat.wechaty_message.WechatyMessage object at 0x00000218C8C58CD0>, 'origin_ctype': <ContextType.VOICE: 2>, 'openai_api_key': None, 'session_id': 'wxid_k10fvreqzcbf22', 'receiver': '18725660724@chatroom', 'desire_rtype': <ReplyType.VOICE: 2>})
[INFO][2023-05-06 21:48:46][openai_voice.py:25] - [Openai] voiceToText text=機器人介紹下三體的劇情 voice file name=tmp/message-3431040645055321713-audio.wav
[INFO][2023-05-06 21:48:46][chat_gpt_bot.py:49] - [CHATGPT] query=介紹下三體的劇情
[INFO][2023-05-06 21:49:33][baidu_voice.py:84] - 被合成的文本=《三体》是刘慈欣的科幻小说，是中国近年来受欢迎的科幻小说之一，曾荣获雨果奖。

《三体》的故事主要围绕一个名为“三体”的外星文明展开，它们居住于一个蕴含三个恒星的行星系统中，常年受到从另一恒星系中发射的恒星光束以及混沌环境的影响，为了生存而不断进化。三体文明历经了多次灾难性的文明崩溃，一次发现地球之后，他们发动了一场推翻人类文明的战争。

小说第一部主要描述了以世界末日之夜（又称“红岸基事件”）为开始的故事。红岸基事件是一场怪异的学术研讨会，它的主题是“三体问 题”的解法。会议的参与者们各有不同的想法，有的认为解法在纯理论上就不可能存在，有的则认为应该从实践中寻找。

在这个过程中，人类与三体间的交流不断加深，并且三体文明对地球的侵略也日益临近。小说第二部主要描述了三体文明的来袭以及人类如何应对。在全世界政府和科技界的努力下，人类共同应对了三体文明的入侵，利用一种特殊的技术，人类成功进入了三体文明的行星系，并在那里发现了一个名为“黑暗森林”的宇宙规律。

小说第三部主要是对人类文明和三体文明之间的较量的终结。在第三部，三体文明与地球文明之间的较量进一步升级，两个文明之间的历史和复杂的关系得以揭示，最终达成了一个复杂的结局。
[ERROR][2023-05-06 21:49:33][baidu_voice.py:93] - [Baidu] textToVoice error={'cookie': '549501194_2000', 'err_detail': 'Invalid text length', 'err_msg': 'tex param err', 'err_no': 513, 'err_subcode': 234, 'tts_logid': 1888171743}
[INFO][2023-05-06 21:49:34][wechaty_channel.py:75] - [WX] sendMsg=Reply(type=ERROR, content=[ERROR]
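The err_detail field says Invalid text length, which suggests the reply was longer than the synthesis API accepts in one request. One possible workaround is to split the text and synthesize each piece separately. This is a sketch only; max_len is a hypothetical parameter, and the real limit should be taken from Baidu's documentation:

```python
def split_text(text: str, max_len: int) -> list:
    """Split text into consecutive pieces of at most max_len characters."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [text[i:i + max_len] for i in range(0, len(text), max_len)]


print(split_text("abcdefghij", 4))  # ['abcd', 'efgh', 'ij']
```

Splitting on a fixed character count can cut a sentence in half; breaking at punctuation would sound more natural but is left out to keep the sketch short.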

lsCoding666 commented 1 year ago

https://github.com/zhayujie/chatgpt-on-wechat/issues/1037

andyzlys commented 1 year ago

After installing it as you suggested, it works.

kuaile1993 commented 1 year ago

112.53.2.93:35534 - - [09/May/2023 06:21:57] "HTTP/1.1 POST /wxcomapp" - 200 OK
[ERROR][2023-05-09 06:21:57][chat_channel.py:267] - Worker return exception: 'BaiduVoice' object has no attribute 'client'
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/app/channel/chat_channel.py", line 145, in _handle
    reply = self._generate_reply(context)
  File "/app/channel/chat_channel.py", line 177, in _generate_reply
    reply = super().build_voice_to_text(wav_path)
  File "/app/channel/channel.py", line 40, in build_voice_to_text
    return Bridge().fetch_voice_to_text(voice_file)
  File "/app/bridge/bridge.py", line 48, in fetch_voice_to_text
    return self.get_bot("voice_to_text").voiceToText(voiceFile)
  File "/app/voice/baidu/baidu_voice.py", line 65, in voiceToText
    res = self.client.asr(pcm, "pcm", 16000, {"dev_pid": self.dev_id})
AttributeError: 'BaiduVoice' object has no attribute 'client'

What causes this error? Am I the first one to run into it?
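In Python, this AttributeError means self.client was never assigned on that object. One way that can happen, shown with a hypothetical class that is not the project's BaiduVoice, is a constructor that catches an error before reaching the assignment:

```python
class VoiceSketch:
    """Hypothetical stand-in used only to illustrate the failure mode."""

    def __init__(self, settings: dict):
        try:
            # A missing key raises KeyError before self.client is assigned.
            self.client = ("client-for", settings["app_id"])
        except KeyError as exc:
            # The error is swallowed, so the object exists without a client.
            print(f"init failed: {exc!r}")


voice = VoiceSketch({})
print(hasattr(voice, "client"))  # False
```

If that is the cause here, the first error in the startup log, rather than this traceback, would point at the real problem.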

X-233 commented 1 year ago

[WARNING][2023-05-10 17:05:34][chat_channel.py:174] - [WX]any to wav error, use raw path. name 'any_to_wav' is not defined
[INFO][2023-05-10 17:05:34][bridge.py:30] - create bot openai for voice_to_text

How do I fix this?

1728012088 commented 1 year ago

[WARNING][2023-05-17 19:40:55][chat_channel.py:175] - [WX]any to wav error, use raw path. name 'any_to_wav' is not defined
[INFO][2023-05-17 19:40:55][bridge.py:30] - create bot baidu for voice_to_text
[ERROR][2023-05-17 19:40:55][chat_channel.py:268] - Worker return exception: No module named 'pysilk.backends.cython._silk'
Traceback (most recent call last):
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38-32\lib\concurrent\futures\thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "C:\Users\Administrator\chatgpt-on-wechat\channel\chat_channel.py", line 146, in _handle
    reply = self._generate_reply(context)
  File "C:\Users\Administrator\chatgpt-on-wechat\channel\chat_channel.py", line 178, in _generate_reply
    reply = super().build_voice_to_text(wav_path)
  File "C:\Users\Administrator\chatgpt-on-wechat\channel\channel.py", line 40, in build_voice_to_text
    return Bridge().fetch_voice_to_text(voice_file)
  File "C:\Users\Administrator\chatgpt-on-wechat\bridge\bridge.py", line 48, in fetch_voice_to_text
    return self.get_bot("voice_to_text").voiceToText(voiceFile)
  File "C:\Users\Administrator\chatgpt-on-wechat\bridge\bridge.py", line 34, in get_bot
    self.bots[typename] = create_voice(self.btype[typename])
  File "C:\Users\Administrator\chatgpt-on-wechat\voice\factory.py", line 13, in create_voice
    from voice.baidu.baidu_voice import BaiduVoice
  File "C:\Users\Administrator\chatgpt-on-wechat\voice\baidu\baidu_voice.py", line 14, in <module>
    from voice.audio_convert import get_pcm_from_wav
  File "C:\Users\Administrator\chatgpt-on-wechat\voice\audio_convert.py", line 4, in <module>
    import pysilk
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pysilk\__init__.py", line 18, in <module>
    from pysilk.backends.cython import *
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pysilk\backends\cython\__init__.py", line 4, in <module>
    from pysilk.backends.cython._silk import *
ModuleNotFoundError: No module named 'pysilk.backends.cython._silk'

How do I solve this? Could someone who knows walk me through it?

jones-so commented 1 year ago

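I deployed this project on Google Cloud Run to connect to a WeCom self-built app, and text messages are sent and received fine. But after I set "speech_recognition": true and send a voice message, which according to the docs should be recognized by the default openai engine and answered with text, the reply I get is: [ERROR] Invalid file format. Supported formats: ['m4a', 'mp3', 'webm', 'mp4', 'mpga', 'wav', 'mpeg']

How can I solve this? Thanks a lot.

How did you solve this? I've run into it too.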

andyzlys commented 1 year ago

@jones-so WeCom app accounts need ffmpeg with the AMR codec installed, as someone already replied above. I tested it and it works.

jones-so commented 1 year ago

Your email has been received. Please wait for my reply. Have a nice day!

AssassinJY commented 1 year ago

Doesn't google go through the proxy? It timed out.

KunBoy5240 commented 1 year ago

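Replace "baidu" with "google" in the fetch_text_to_voice function.

I don't see any "baidu" in my code. It looks like this: def fetch_text_to_voice(self, text) -> Reply: return self.get_bot("text_to_voice").textToVoice(text). I sent a voice message, and the backend log shows my speech was recognized as text, but no voice reply ever arrives; it keeps prompting me to reply with any text to fetch the result.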

KunBoy5240 commented 1 year ago

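Which channel are you using that lets you see Google's voice reply? It seems my wechatmp channel can't use speech recognition? After showing HTTP status code 200 several times, it simply fails with [ERROR] Failed to connect. Probable cause: Unknown. I don't know what's going on. @zhayujie could you help clear this up?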

AnCool-OvO commented 8 months ago

Why is the reply I get in WeChat still an mp3 file?

AnCool-OvO commented 8 months ago

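It works with baidu-aip, but not the way I imagined: the conversation returns an MP3 audio file.

Following the original poster's steps works fine. To add, here is how to get baidu_app_id and baidu_api_key:
1. Register and log in to Baidu AI Cloud; a Baidu Cloud account works: https://login.bce.baidu.com/
2. Search for speech technology in the console.
3. Create an application and follow the guide to claim the free quota; speech synthesis is required.
4. Copy the parameters into the config file, and you can try voice conversations right away.

Did you solve this? I'm hitting the same problem: the reply is an mp3 file rather than a WeChat voice message.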
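Once the values from the steps quoted above are in config.json, a quick sanity check can confirm that none were left empty. The key names baidu_app_id and baidu_api_key come from this thread; the helper and the sample values are hypothetical:

```python
# Key names as mentioned in this thread.
REQUIRED_KEYS = ("baidu_app_id", "baidu_api_key")


def missing_baidu_keys(config: dict) -> list:
    """Return the required Baidu keys that are absent or blank in the config."""
    return [key for key in REQUIRED_KEYS if not str(config.get(key, "")).strip()]


# Placeholder config with one value left blank.
sample = {"baidu_app_id": "123456", "baidu_api_key": ""}
print(missing_baidu_keys(sample))  # ['baidu_api_key']
```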

1600858489 commented 5 months ago

Start auto replying.
[INFO][2023-03-12 22:46:58][openai_voice.py:22] - [Openai] voiceToText text=趕快睡覺吧,別搞了 voice file name=tmp/230312-224657.mp3
[INFO][2023-03-12 22:46:58][chat_gpt_bot.py:27] - [OPEN_AI] query=趕快睡覺吧,別搞了
[ERROR][2023-03-12 22:47:03][baidu_voice.py:35] - [Baidu] textToVoice error={'err_detail': '16: Open api characters limit reached', 'err_msg': '16: Open api characters limit reached', 'err_no': 502, 'err_subcode': 16, 'tts_logid': 3749716633}
[INFO][2023-03-12 22:47:03][wechat_channel.py:149] - [WX] sendFile=None, receiver=@590c249050a578793750164dda76be47

What's going on with this voice error?

I have the same problem. Did you solve it? I'm using Baidu's short speech recognition API.

spacex-3 commented 6 days ago

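All the dependencies are installed, but I still get an error when a voice message arrives. How can I fix this? It should be using the tts-1 and whisper-1 models by default, right? I haven't changed any other settings.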

[DEBUG][2024-09-18 22:03:15][chat_channel.py:190] - [chat_channel] ready to handle context: type=VOICE, content=tmp/240918-220315.mp3
[INFO][2024-09-18 22:03:16][bridge.py:68] - create bot for voice_to_text
[ERROR][2024-09-18 22:03:16][chat_channel.py:303] - Worker return exception:
Traceback (most recent call last):
  File "/usr/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/root/chatgpt-on-wechat/channel/chat_channel.py", line 170, in _handle
    reply = self._generate_reply(context)
  File "/root/chatgpt-on-wechat/channel/chat_channel.py", line 205, in _generate_reply
    reply = super().build_voice_to_text(wav_path)
  File "/root/chatgpt-on-wechat/channel/channel.py", line 41, in build_voice_to_text
    return Bridge().fetch_voice_to_text(voice_file)
  File "/root/chatgpt-on-wechat/bridge/bridge.py", line 86, in fetch_voice_to_text
    return self.get_bot("voice_to_text").voiceToText(voiceFile)
  File "/root/chatgpt-on-wechat/bridge/bridge.py", line 72, in get_bot
    self.bots[typename] = create_voice(self.btype[typename])
  File "/root/chatgpt-on-wechat/voice/factory.py", line 53, in create_voice
    raise RuntimeError
RuntimeError
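The INFO line reads create bot for voice_to_text with no engine name in between, which suggests the voice_to_text value was empty or not one the factory recognizes. A sketch of validating the setting up front, assuming the engine names mentioned in this thread; the project's actual list may differ:

```python
# Engine names mentioned in this thread; an assumption, not the project's full list.
SUPPORTED_ENGINES = {"openai", "google", "azure", "baidu"}


def check_engine(config: dict, key: str = "voice_to_text") -> str:
    """Return the configured engine name, or raise a descriptive error."""
    value = str(config.get(key, "")).strip()
    if value not in SUPPORTED_ENGINES:
        raise ValueError(f"{key}={value!r} is not one of {sorted(SUPPORTED_ENGINES)}")
    return value


try:
    check_engine({"voice_to_text": ""})
except ValueError as exc:
    print(exc)
```

A check like this fails with a message that names the offending key, which is easier to act on than a bare RuntimeError.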

zhayujie / chatgpt-on-wechat