Work without PulseAudio

ndarilek commented 6 years ago

I'm attempting to build a Mycroft addon for Home Assistant's Hass.io container-based distribution. Unlike my experience doing this for previous RPi models, Hass.io's audio works for me out-of-the-box. Whereas before I needed PulseAudio to keep my HDMI audio open to not cut off the beginning of spoken utterances, Hass.io provides flawless spoken feedback without Pulse. It also makes installing PA very difficult since everything is done within containers, and I'd really like to avoid adding an audio server if it isn't needed.

Unfortunately, while Mycroft TTS works flawlessly with no additional configuration, I can't get the wake word to work at all. I've confirmed that arecord captures audio from the mic by default, with no additional configuration. For some reason, though, Mycroft seems to require PA for input but not for output.

Is there a configuration setting I might use to tell Mycroft to use Alsa's default capture device rather than trying to go through PA? Pulse will unfortunately be a non-starter for this configuration, and again, everything else (including Mycroft's TTS) is working fine without it. I understand that audio configuration is a tricky thing to get right, but Hass.io lets you configure the default audio inputs/outputs for a given addon from its dashboard. Seems like, if I can hand Mycroft a working set of defaults, that I should at least be able to configure it to use those and skip PA entirely.

Thanks.

forslund commented 6 years ago

As far as I know there's no hard requirement for PulseAudio. The input stack for Mycroft is built on the shoulders of giants: Mycroft -> speech_recognition -> PyAudio -> PortAudio

And PortAudio should be able to work directly on top of Alsa. That said, I've not actually tried it myself.

import pyaudio

p = pyaudio.PyAudio()
# Walk the devices of host API 0 (the first backend PortAudio reports).
for i in range(p.get_host_api_info_by_index(0).get('deviceCount')):
    info = p.get_device_info_by_host_api_device_index(0, i)
    # Only devices with at least one input channel can capture audio.
    if info.get('maxInputChannels') > 0:
        print("Input Device id ", i, " - ", info.get('name'))

Which should list the input devices found.

If that lists a device, you could try to force the use of that device by setting the device_index parameter in the listener part of the config (which can, for example, be created as /etc/mycroft/mycroft.conf):

{
  "listener": {
    "device_index": X
  }
}

Where X is the id reported by the script above.
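
As a rough illustration of the two steps above (pick an input-capable device, then put its index in the listener section), here is a minimal sketch. It works on plain dictionaries shaped like the ones PyAudio's get_device_info_by_index returns, so it runs without audio hardware. The sample device list and the helper names input_devices and listener_override are made up for illustration and are not part of Mycroft.

```python
import json


def input_devices(devices):
    """Return (index, name) pairs for devices that report input channels."""
    return [
        (d["index"], d["name"])
        for d in devices
        if d.get("maxInputChannels", 0) > 0
    ]


def listener_override(device_index):
    """Build the JSON text for a mycroft.conf that pins the capture device."""
    return json.dumps({"listener": {"device_index": device_index}}, indent=2)


# Hypothetical device list; real entries would come from PyAudio.
sample_devices = [
    {"index": 0, "name": "onboard output", "maxInputChannels": 0},
    {"index": 2, "name": "usb microphone", "maxInputChannels": 1},
]

candidates = input_devices(sample_devices)
if candidates:
    # Print the override for the first input-capable device found.
    print(listener_override(candidates[0][0]))
```

If no device reports input channels, nothing is printed, which is itself a useful sign that PortAudio cannot see the microphone.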

This is all theory and I might be wrong.

ndarilek commented 6 years ago

Having a hard time getting this working with PulseAudio on my custom platform, and am considering just going direct. What would I lose if I did this?

It looks like Mycroft may duck non-voice audio if everything is using Pulse. Is that accurate or am I misremembering/misreading?

Thanks.

forslund commented 6 years ago

No pulseaudio features are used by default so you won't lose anything.

(The PulseAudio ducking needs to be enabled in the config.)

ndarilek commented 6 years ago

Oh, heh, spent the last few weeks getting a PA server sorta working on Hassio because I thought I'd get ducking by default. :) I'll scrap that then, it's giving me lots of latency and only seems to work once or twice if I'm lucky.

So I run the snippet in question, and the device with index 2 is my mic:

Input Device id 2 - USB Camera-B4.09.24.1: Audio (hw:1,0)

/etc/asound.conf as generated by hassio reads:

pcm.!default {
  type asym
  capture.pcm "mic"
  playback.pcm "speaker"
}
pcm.mic {
  type plug
  slave {
    pcm "hw:1,0"
  }
}
pcm.speaker {
  type plug
  slave {
    pcm "hw:0,0"
  }
}

Which appears to set my defaults correctly--input on hw:1,0 and output on hw:0,0.

I set:

{
  "listener": {
    "device_index": 2
  }
}

in my config. This gives the following error:

Traceback (most recent call last):
  File "/opt/venvs/mycroft-core/bin/mycroft-speech-client", line 11, in <module>
    load_entry_point('mycroft-core==18.2.12', 'console_scripts', 'mycroft-speech-client')()
  File "/opt/venvs/mycroft-core/lib/python3.4/site-packages/mycroft/client/speech/main.py", line 149, in main
    loop = RecognizerLoop()
  File "/opt/venvs/mycroft-core/lib/python3.4/site-packages/mycroft/client/speech/listener.py", line 216, in __init__
    self._load_config()
  File "/opt/venvs/mycroft-core/lib/python3.4/site-packages/mycroft/client/speech/listener.py", line 231, in _load_config
    mute=self.mute_calls > 0)
  File "/opt/venvs/mycroft-core/lib/python3.4/site-packages/mycroft/client/speech/mic.py", line 109, in __init__
    chunk_size=chunk_size)
  File "/opt/venvs/mycroft-core/lib/python3.4/site-packages/speech_recognition/__init__.py", line 84, in __init__
    assert 0 <= device_index < count, "Device index out of range ({} devices available; device index should be between 0 and {} inclusive)".format(count, count - 1)
AssertionError: Device index out of range (0 devices available; device index should be between 0 and -1 inclusive)

So seems like we're almost there. Story of my life. :)
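
For reference, the assertion is raised by speech_recognition's Microphone constructor, which rejects any index outside the range PyAudio reports. Below is a minimal sketch of checking that range before using the configured value, assuming it is acceptable to fall back to None (which speech_recognition treats as "use the default microphone"). resolve_device_index is a hypothetical helper, not something Mycroft provides.

```python
def resolve_device_index(requested, device_count):
    """Return `requested` if it is a valid device index, otherwise None."""
    if requested is not None and 0 <= requested < device_count:
        return requested
    # Out of range (or unset): let the caller use the default device.
    return None


# The situation in the traceback: index 2 requested, 0 devices visible.
print(resolve_device_index(2, 0))  # None
# The same index when four devices are visible.
print(resolve_device_index(2, 4))  # 2
```

With zero devices visible, as in the traceback, the fallback only avoids the assertion. Capture still cannot work until PortAudio actually sees the microphone.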

Thanks for helping me debug my nonstandard setup. Hoping to package it as a Hassio addon so folks have another easy way to run Mycroft.

forslund commented 6 years ago

Sounds like something's strange indeed. Did you try the PortAudio snippet I provided above?

ndarilek commented 6 years ago

Yeah, that's how I got the 2 to which I set the device index.

I do still have a PULSE_SERVER environment variable set. I'll try removing that in case the library is trying to connect to my no-longer-running Pulse server, or is using that to determine how many devices are available. That just occurred to me, been trying so many things to get this working that I can't keep everything's state straight.

forslund commented 6 years ago

Right sorry, missed that line :)

Might also be an issue with speech recognition. I'll try to get a docker image going as well and see if I can help.

ndarilek commented 6 years ago

Removing PULSE_SERVER didn't change anything. If I remove the listener config, the error goes away as well, but input still doesn't work.

I've just confirmed that arecord test.wav ... aplay test.wav works (i.e., no need to specify recording or playback devices). So the issue seems to be that Mycroft isn't detecting the default recording device for some reason.

My setup is a Docker container running the latest Debian armhf packages on an RPi 3 with a USB camera mic and the above /etc/asound.conf setting up the defaults. Here is how I'm setting it up to run in Hassio.

Thanks!

forslund commented 6 years ago

Hmm, I don't get that error. I get an invalid sample rate error instead....

Edit: setting the sample rate to 44100 gets me a signal into Mycroft at least, but it can't detect the wake word.
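
One way to see which sample rates a device will accept is PyAudio's is_format_supported, which raises ValueError for combinations the device rejects. The sketch below assumes mono 16-bit capture and an arbitrary list of candidate rates. FakeAudio is a made-up stand-in that only accepts 44100 Hz so the example runs without hardware; with real hardware you would pass a pyaudio.PyAudio() instance instead.

```python
def supported_rates(pa, device_index, sample_format,
                    candidates=(16000, 44100, 48000)):
    """Return the candidate rates that `pa` accepts for mono capture."""
    accepted = []
    for rate in candidates:
        try:
            pa.is_format_supported(
                rate,
                input_device=device_index,
                input_channels=1,
                input_format=sample_format,
            )
        except ValueError:
            continue  # unsupported rate/format combination
        accepted.append(rate)
    return accepted


class FakeAudio:
    """Hypothetical stand-in for pyaudio.PyAudio; accepts only 44100 Hz."""

    def is_format_supported(self, rate, input_device=None,
                            input_channels=None, input_format=None):
        if rate != 44100:
            raise ValueError("Invalid sample rate")
        return True


PA_INT16 = 8  # numeric value of pyaudio.paInt16
print(supported_rates(FakeAudio(), 2, PA_INT16))  # [44100]
```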

ndarilek commented 6 years ago

OK, I appear to have cracked it. Was a combination of forgetting to give the Mycroft addon audio permissions (previously Mycroft used Pulse and the Pulse addon had audio permissions) and removing some out-of-date configuration in my local config.

Thanks for the help! I'll see if I can get this packaged and available for the community in the next few days.

ndarilek commented 6 years ago

OK, never mind. I switched away from the Debian packages because they're built with Python 3.4 and I needed Python 3.5. I also thought that building from source would get me closer to a standard, supported configuration. Now things are back to being broken.

I'm running slightly different code than above to list devices:

import pyaudio
p = pyaudio.PyAudio()
for i in range(p.get_device_count()):
    dev = p.get_device_info_by_index(i)
    print((i,dev['name'],dev['maxInputChannels'],dev))
(0, 'bcm2835 ALSA: - (hw:0,0)', 0, {'defaultHighOutputLatency': 0.034829931972789115, 'structVersion': 2, 'defaultLowOutputLatency': 0.005804988662131519, 'maxInputChannels': 0, 'index': 0, 'maxOutputChannels': 2, 'defaultSampleRate': 44100.0, 'defaultLowInputLatency': -1.0, 'defaultHighInputLatency': -1.0, 'name': 'bcm2835 ALSA: - (hw:0,0)', 'hostApi': 0})
(1, 'sysdefault', 0, {'defaultHighOutputLatency': 0.034829931972789115, 'structVersion': 2, 'defaultLowOutputLatency': 0.005804988662131519, 'maxInputChannels': 0, 'index': 1, 'maxOutputChannels': 128, 'defaultSampleRate': 44100.0, 'defaultLowInputLatency': -1.0, 'defaultHighInputLatency': -1.0, 'name': 'sysdefault', 'hostApi': 0})
(2, 'speaker', 0, {'defaultHighOutputLatency': 0.034829931972789115, 'structVersion': 2, 'defaultLowOutputLatency': 0.005804988662131519, 'maxInputChannels': 0, 'index': 2, 'maxOutputChannels': 128, 'defaultSampleRate': 44100.0, 'defaultLowInputLatency': -1.0, 'defaultHighInputLatency': -1.0, 'name': 'speaker', 'hostApi': 0})
(3, 'default', 0, {'defaultHighOutputLatency': 0.034829931972789115, 'structVersion': 2, 'defaultLowOutputLatency': 0.005804988662131519, 'maxInputChannels': 0, 'index': 3, 'maxOutputChannels': 128, 'defaultSampleRate': 44100.0, 'defaultLowInputLatency': -1.0, 'defaultHighInputLatency': -1.0, 'name': 'default', 'hostApi': 0})

My USB audio isn't anywhere to be found there. But here:

root@02defeca-mycroft:~# arecord -l
**** List of CAPTURE Hardware Devices ****
card 1: CameraB409241 [USB Camera-B4.09.24.1], device 0: USB Audio [USB Audio]
  Subdevices: 0/1
  Subdevice #0: subdevice #0

So the card is definitely in Alsa.
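
The mismatch can be checked mechanically by parsing the arecord -l output into hw:card,device names and looking for them in the PyAudio device names. This is only a sketch under the assumption that PortAudio includes the hw:X,Y string in the names of hardware devices, as it does for the bcm2835 entry above. The helpers capture_devices and missing_from_portaudio are made up for illustration.

```python
import re

# Matches lines such as:
# card 1: CameraB409241 [USB Camera-B4.09.24.1], device 0: USB Audio [USB Audio]
CARD_LINE = re.compile(r"^card (\d+): .*?\[(.+?)\], device (\d+):")


def capture_devices(arecord_output):
    """Return (hw_name, card_name) pairs parsed from `arecord -l` output."""
    found = []
    for line in arecord_output.splitlines():
        match = CARD_LINE.match(line)
        if match:
            card, name, device = match.groups()
            found.append(("hw:{},{}".format(card, device), name))
    return found


def missing_from_portaudio(arecord_output, portaudio_names):
    """List ALSA capture devices that no PortAudio device name mentions."""
    return [
        hw for hw, _ in capture_devices(arecord_output)
        if not any(hw in name for name in portaudio_names)
    ]


arecord_output = """**** List of CAPTURE Hardware Devices ****
card 1: CameraB409241 [USB Camera-B4.09.24.1], device 0: USB Audio [USB Audio]
  Subdevices: 0/1
  Subdevice #0: subdevice #0
"""
portaudio_names = ["bcm2835 ALSA: - (hw:0,0)", "sysdefault", "speaker", "default"]

print(missing_from_portaudio(arecord_output, portaudio_names))  # ['hw:1,0']
```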

Additionally, I was able to build PulseAudio 12.2 for this system. If I set PULSE_SERVER=/share/pulse-socket, I can do:

# pactl info
Server String: /share/pulse-socket
Library Protocol Version: 32
Server Protocol Version: 32
Is Local: yes
Client Index: 2
Tile Size: 65496
User Name: root
Host Name: 02defeca-pulseaudio
Server Name: pulseaudio
Server Version: 12.2
Default Sample Specification: s16le 2ch 44100Hz
Default Channel Map: front-left,front-right
Default Sink: alsa_output.0.stereo-fallback
Default Source: alsa_input.1.analog-surround-40
Cookie: e41d:9e93

The default sink and source are correct. But nothing changes when I run the above Python code, so I'm wondering if something has to happen on the pyaudio side to switch to pulse, or if I need another module included in my pulse configuration to make it play nice with pyaudio. I see something about Jack, and other things about a pulseaudio alsa compatibility layer.

parecord/paplay/arecord/aplay all work fine.

Running short on ideas here. I've googled things like "portaudio not finding input device" and finding lots of unresolved threads and broken .asoundrc configs. :/

ndarilek commented 6 years ago

Figured out how to make Pulse show up as an input device. I needed the libasound2-plugins package and the following ~/.asoundrc:

pcm.pulse {
    type pulse
}

ctl.pulse {
    type pulse
}

But I'm officially giving up on trying to make this work without Pulse. Looks like PyAudio has many issues finding default input devices, and I was able to containerize a PulseAudio install. With Pulse, the correct asoundrc configuration, and libasound2-plugins, I appear to have audio somewhat working, though I'm hoping to tune it further. Without it, pyaudio just doesn't detect my input device anymore, and I have no clue why this worked for a time but now fails.

MycroftAI / mycroft-core

Work without PulseAudio #1676