3s fast voice cloning: bug with the prompt reference audio format

ZHUHF123 commented 1 month ago

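I'd like to know what the requirements are for the prompt audio format. A prompt audio file with a 32 kHz sample rate fails with: RuntimeError: Cannot load audio from file: ffprobe not found. Please install ffmpeg in your system to use non-WAV audio file formats and make sure ffprobe is in your PATH. The message suggests the input is not a valid audio file, yet files sampled at 16 kHz and 24 kHz generate normally. Is there any restriction on the sample rate other than being no lower than 16 kHz?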

aluminumbox commented 1 month ago

this is due to audio file format, it is not wav format, please first convert it to wav format. what audio format are you using?

ZHUHF123 commented 1 month ago

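The file that fails is a .wav file. The difference from the other .wav files that work is that it has a 32 kHz sample rate, so I'd like to ask what range of sample rates the prompt audio supports.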

aluminumbox commented 1 month ago

your file may end with .wav, but the log shows that it is not wav format. greater than 16khz is ok

ZHUHF123 commented 1 month ago

It is a .wav file that plays fine. How can I verify whether this audio is really in WAV format?

ZHUHF123 commented 1 month ago

The full error output is as follows:

RuntimeWarning: Couldn't find ffprobe or avprobe - defaulting to ffprobe, but may not work
  warn("Couldn't find ffprobe or avprobe - defaulting to ffprobe, but may not work", RuntimeWarning)
Traceback (most recent call last):
  File "/home/lixiufeng/.conda/envs/cosyvoice/lib/python3.8/site-packages/gradio/processing_utils.py", line 544, in audio_from_file
    audio = AudioSegment.from_file(filename)
  File "/home/lixiufeng/.conda/envs/cosyvoice/lib/python3.8/site-packages/pydub/audio_segment.py", line 728, in from_file
    info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
  File "/home/lixiufeng/.conda/envs/cosyvoice/lib/python3.8/site-packages/pydub/utils.py", line 274, in mediainfo_json
    res = Popen(command, stdin=stdin_parameter, stdout=PIPE, stderr=PIPE)
  File "/home/lixiufeng/.conda/envs/cosyvoice/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/home/lixiufeng/.conda/envs/cosyvoice/lib/python3.8/subprocess.py", line 1720, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'ffprobe'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/lixiufeng/.conda/envs/cosyvoice/lib/python3.8/site-packages/gradio/queueing.py", line 521, in process_events
    response = await route_utils.call_process_api(
  File "/home/lixiufeng/.conda/envs/cosyvoice/lib/python3.8/site-packages/gradio/route_utils.py", line 276, in call_process_api
    output = await app.get_blocks().process_api(
  File "/home/lixiufeng/.conda/envs/cosyvoice/lib/python3.8/site-packages/gradio/blocks.py", line 1941, in process_api
    inputs = await self.preprocess_data(
  File "/home/lixiufeng/.conda/envs/cosyvoice/lib/python3.8/site-packages/gradio/blocks.py", line 1655, in preprocess_data
    processed_input.append(block.preprocess(inputs_cached))
  File "/home/lixiufeng/.conda/envs/cosyvoice/lib/python3.8/site-packages/gradio/components/audio.py", line 218, in preprocess
  File "/home/lixiufeng/.conda/envs/cosyvoice/lib/python3.8/site-packages/gradio/processing_utils.py", line 554, in audio_from_file
    raise RuntimeError(msg) from e
RuntimeError: Cannot load audio from file: ffprobe not found. Please install ffmpeg in your system to use non-WAV audio file formats and make sure ffprobe is in your PATH.

aluminumbox commented 1 month ago

try torchaudio.load(wav) to see whether there is error
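
Besides torchaudio.load, the file header can be inspected directly. The following is a minimal sketch, not part of CosyVoice or Gradio: the helper name inspect_wav_header is hypothetical, and it only uses the Python standard library. It checks for the RIFF/WAVE signature and reads the fmt chunk, so it reports the format tag (1 is PCM, 3 is IEEE float, 0xFFFE is the extensible format), channel count, sample rate, and bit depth. A file that is a valid WAV container but not plain PCM may be one possible reason a loader falls back to ffprobe; that is an assumption, not something this thread confirms.

```python
import struct


def inspect_wav_header(path):
    """Return basic fmt-chunk fields of a RIFF/WAVE file.

    Returns None if the file does not start with a RIFF/WAVE header
    or has no readable "fmt " chunk. Hypothetical helper for illustration.
    """
    with open(path, "rb") as f:
        header = f.read(12)
        if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
            return None
        # Walk the chunks until the "fmt " chunk is found.
        while True:
            chunk_header = f.read(8)
            if len(chunk_header) < 8:
                return None
            chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)
            if chunk_id == b"fmt ":
                fmt = f.read(chunk_size)
                if len(fmt) < 16:
                    return None
                # Format tag, channels, sample rate, byte rate, block align, bits per sample.
                tag, channels, rate, _, _, bits = struct.unpack("<HHIIHH", fmt[:16])
                return {
                    "format_tag": tag,
                    "channels": channels,
                    "sample_rate": rate,
                    "bits_per_sample": bits,
                }
            # Skip this chunk; chunks are padded to an even number of bytes.
            f.seek(chunk_size + (chunk_size & 1), 1)
```

Calling inspect_wav_header("prompt.wav") (the path is a placeholder) on the failing file and on a working file and comparing the two results should show whether they differ in anything other than the sample rate.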

ZHUHF123 commented 1 month ago

torchaudio.load works for all of the files and prints normally; the result is always (tensor, sample rate).

qxde01 commented 1 month ago

apt install ffmpeg
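
The traceback above ends in FileNotFoundError for 'ffprobe', so after installing ffmpeg it may help to confirm that both executables are visible to the Python environment that runs the web UI. A minimal sketch using only the standard library; the helper name find_media_tools is hypothetical:

```python
import shutil


def find_media_tools(names=("ffmpeg", "ffprobe")):
    """Map each tool name to its resolved path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_media_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

Run it inside the same conda environment (cosyvoice in the traceback) so that the PATH being checked matches the one the application sees.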

FunAudioLLM / CosyVoice

3s fast voice cloning: bug with the prompt reference audio format #82