huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

audio pipeline utility ffmpeg_microphone_live only works with PulseAudio in Linux (KDE Manjaro) #32660

Closed Dakai closed 3 weeks ago

Dakai commented 2 months ago

System Info

Who can help?

@Narsil

Information

Tasks

Reproduction

Follow the steps at https://huggingface.co/learn/audio-course/chapter7/voice-assistant

Expected behavior

The course's code does not record anything when run on my local Linux machine. I tried recording with the latest ffmpeg using the command ffmpeg -f alsa -i default -t 5 test.wav and got an empty WAV file, whereas ffmpeg -f pulse -i default -t 5 test.wav worked perfectly.

So I edited the file venv/lib/python3.12/site-packages/transformers/pipelines/audio_utils.py at line 30, changing format_ = "alsa" to format_ = "pulse". After that, the code worked as expected and successfully recognised the wake word.

Can you please look into this issue with the Linux audio system and make the utility compatible with both ALSA and PulseAudio?
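
To illustrate the kind of fallback requested above, here is a minimal sketch. It is not the library's implementation. It assumes that a successful pactl info call is a reasonable sign that a PulseAudio-compatible server is running. The helper names pulse_available, pick_linux_input_format and build_record_command are hypothetical and do not exist in transformers.

```python
import shutil
import subprocess


def pulse_available(timeout: float = 2.0) -> bool:
    """Return True if `pactl info` succeeds (assumed signal that PulseAudio is usable)."""
    if shutil.which("pactl") is None:
        # pactl is not installed, so assume there is no PulseAudio server.
        return False
    try:
        result = subprocess.run(
            ["pactl", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def pick_linux_input_format(has_pulse: bool) -> str:
    """Prefer the ffmpeg "pulse" input format when available, otherwise use "alsa"."""
    return "pulse" if has_pulse else "alsa"


def build_record_command(format_: str, seconds: int, out_path: str) -> list[str]:
    """Build the same ffmpeg test recording command used in this report."""
    return ["ffmpeg", "-f", format_, "-i", "default", "-t", str(seconds), out_path]


if __name__ == "__main__":
    chosen = pick_linux_input_format(pulse_available())
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_record_command(chosen, 5, "test.wav")))
```

The probe is kept separate from the choice so that the selection logic can be checked without an audio server. Whether PulseAudio should take priority over ALSA by default is a design decision for the maintainers.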

amyeroberts commented 2 months ago

@ylacombe

ylacombe commented 1 month ago

Hey @Dakai, thanks for opening this issue! Looks like it might be an issue with your KDE. Have you been able to use ALSA in your terminal?

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.