SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2
MIT License
11.9k stars, 1k forks

pcm format #579

Open wwfcnu opened 10 months ago

wwfcnu commented 10 months ago

Does model.transcribe support raw PCM files, e.g. model.transcribe("example.pcm")?

phineas-pta commented 10 months ago

Any audio waveform passed as a NumPy array should work.
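
A minimal sketch of that route, assuming the .pcm file is headerless 16-bit signed little-endian mono audio already sampled at 16 kHz (the thread does not state the actual format). The helper names `pcm16_bytes_to_float32` and `load_pcm_file` are hypothetical and not part of faster-whisper; the idea is to read the raw samples yourself and pass the resulting float32 array to model.transcribe instead of a file path.

```python
import numpy as np


def pcm16_bytes_to_float32(raw: bytes) -> np.ndarray:
    """Convert headerless 16-bit little-endian PCM bytes to float32 in [-1.0, 1.0).

    Assumption: samples are signed 16-bit, little-endian, mono.
    """
    # Drop a trailing odd byte so the buffer holds whole 2-byte samples.
    raw = raw[: len(raw) - (len(raw) % 2)]
    samples = np.frombuffer(raw, dtype="<i2")
    # Scale int16 range [-32768, 32767] to floats in [-1.0, 1.0).
    return samples.astype(np.float32) / 32768.0


def load_pcm_file(path: str) -> np.ndarray:
    """Read a raw .pcm file from disk and return it as a float32 waveform."""
    with open(path, "rb") as f:
        return pcm16_bytes_to_float32(f.read())


# Usage (hypothetical file name and model size; requires faster-whisper):
# from faster_whisper import WhisperModel
# model = WhisperModel("small")
# segments, info = model.transcribe(load_pcm_file("example.pcm"))
```

If the file uses a different sample rate, bit depth, or channel count, this conversion would need to change accordingly (for example, resampling to 16 kHz first).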

wwfcnu commented 10 months ago

"any audio waveform as numpy array should work"

But when I pass a file in PCM format as input, the load_audio function fails during the ffmpeg processing step.

wwfcnu commented 10 months ago

subprocess.CalledProcessError: Command '['ffmpeg', '-nostdin', '-threads', '0', '-i', '/mnt/data/asr_datasets/online_audio/20231018/000000b9-7ba0-40fd-994b-a0d0b53a782d.pcm', '-f', 's16le', '-ac', '1', '-acodec', 'pcm_s16le', '-ar', '16000', '-']' returned non-zero exit status 1.

phineas-pta commented 10 months ago

I said a NumPy array, not a PCM file.

wwfcnu commented 10 months ago

How does ffmpeg handle the PCM format?

aonoa commented 8 months ago

"How ffmpeg handles pcm format"

https://github.com/collabora/WhisperLive/blob/main/whisper_live/client.py#L410
https://github.com/collabora/WhisperLive/blob/main/whisper_live/client.py#L267
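
On the ffmpeg side: a headerless .pcm file carries no metadata, so ffmpeg generally cannot detect its layout and needs the sample format, sample rate, and channel count declared as input options before `-i`. In the failing command from the traceback above, the only `-f s16le` appears after `-i`, so it applies to the output rather than the input. A minimal sketch, assuming the input is 16-bit signed little-endian mono at 16 kHz (the thread does not state the actual format); `build_raw_pcm_cmd` and `decode_raw_pcm` are hypothetical helper names, not faster-whisper functions.

```python
import subprocess


def build_raw_pcm_cmd(path, sample_fmt="s16le", sample_rate=16000,
                      channels=1, out_rate=16000):
    """Build an ffmpeg command that reads headerless PCM and writes 16-bit mono PCM to stdout.

    Assumption: the defaults describe the input file; adjust them to the real format.
    """
    return [
        "ffmpeg", "-nostdin", "-threads", "0",
        # Input options: must come BEFORE -i so they describe the raw input.
        "-f", sample_fmt, "-ar", str(sample_rate), "-ac", str(channels),
        "-i", path,
        # Output options: same as in the command from the traceback.
        "-f", "s16le", "-ac", "1", "-acodec", "pcm_s16le", "-ar", str(out_rate),
        "-",
    ]


def decode_raw_pcm(path, **kwargs) -> bytes:
    """Run ffmpeg and return the decoded 16-bit PCM bytes (requires ffmpeg on PATH)."""
    cmd = build_raw_pcm_cmd(path, **kwargs)
    return subprocess.run(cmd, capture_output=True, check=True).stdout


# Usage (hypothetical file name):
# pcm_bytes = decode_raw_pcm("example.pcm")
```

The returned bytes can then be turned into a float32 NumPy array (int16 samples divided by 32768.0) and passed to model.transcribe.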