Romkabouter / ESP32-Rhasspy-Satellite

This repo implements an ESP32 standalone MQTT audio streamer. It is designed to work as a satellite for Rhasspy (https://rhasspy.readthedocs.io/en/latest/). It supports multiple devices.
GNU General Public License v3.0
358 stars, 64 forks

ESP32-POE-ISO Ethernet Adoption #58

Closed chris-kuhr closed 3 years ago

chris-kuhr commented 3 years ago

Hi,

taking the topic here...

I want to use an ESP32-POE-ISO by Olimex. Is there a way to use this project with the Arduino IDE? I feel a little stupid asking this question, because I have the feeling I am missing something obvious...

Well, once I have it running I will try to add the missing features for the ESP32-POE-ISO board.

Romkabouter commented 3 years ago

I haven't used the Arduino IDE in a while. It should be possible. Maybe rename Satellite.cpp to a .ino file, or create a new .ino file and copy the source into it.

chris-kuhr commented 3 years ago

It is not possible or at least not sensible to use the Arduino IDE. The project structure and dependencies would need to be completely reworked. I downloaded VSCode and PlatformIO and will try my luck here. I first need to figure out how to work with it.

chris-kuhr commented 3 years ago

Ok, so I am up to speed now.

I have created a device class ESP32_POE_ISO with two separate I2S buses - in and out.

While adapting the inmp441 class I found a small bug in the readAudio function:

dac_value = ((((uint16_t) (i2s_read_buff[i + 1] & 0xf) << 8) | ((i2s_read_buff[i + 0]))));

There are only 4 bits masked with 0xf. I believe it should be 0xff instead.
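
A minimal sketch of the difference between the two masks, assuming the buffer holds little-endian 16-bit samples (low byte at index i, high byte at i + 1). The function `combineMasked` and the example bytes are hypothetical and only illustrate the arithmetic; this is not the project's readAudio code:

```cpp
#include <cstdint>

// Combine two bytes into one 16-bit value, masking the high byte first.
// highMask = 0x0f keeps 12 bits in total, highMask = 0xff keeps all 16.
uint16_t combineMasked(uint8_t low, uint8_t high, uint8_t highMask) {
    return static_cast<uint16_t>(((high & highMask) << 8) | low);
}
```

With low = 0x34 and high = 0xAB, the 0x0f mask yields 0x0B34 while the 0xff mask yields 0xAB34. A 12-bit mask would be appropriate if the source only delivers 12 significant bits, but it discards the top four bits of a full 16-bit sample.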

I still have to integrate the Ethernet part.

Romkabouter commented 3 years ago

> There are only 4 bits masked with 0xf. I believe it should be 0xff instead.

I do not know much about the bit masking, I admit; I did not create that device. Apparently it works fine the way it is :) Great that you made progress :)

chris-kuhr commented 3 years ago

masking the bits like this -may- result in distortion, which in turn may degrade the recognition accuracy...

chris-kuhr commented 3 years ago

I have integrated the Ethernet part, which is much simpler than WiFi :-) It compiles fine so far. I still have to implement the sample conversion for the readAudio and writeAudio functions. The devices I have operate with 32 and 24 bits per sample.
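
A rough sketch of such a sample conversion, assuming the device delivers signed samples that are MSB-aligned in 32-bit words (for example 24-bit audio in a 32-bit slot) and that 16-bit PCM is wanted on the other side. The function name `toPcm16` is hypothetical, and the actual alignment depends on the device and the I2S configuration:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert MSB-aligned 32-bit samples (e.g. 24-bit audio in a 32-bit slot)
// to 16-bit PCM by keeping the 16 most significant bits.
std::vector<int16_t> toPcm16(const std::vector<int32_t>& in) {
    std::vector<int16_t> out;
    out.reserve(in.size());
    for (int32_t s : in) {
        // Arithmetic right shift preserves the sign (guaranteed since C++20,
        // and the behaviour of common compilers before that).
        out.push_back(static_cast<int16_t>(s >> 16));
    }
    return out;
}
```

The reverse direction (16-bit PCM to a 32-bit slot for playback) would shift left by 16 under the same alignment assumption.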

chris-kuhr commented 3 years ago

Ok. flashed it!

The device receives a DHCP lease and I can access the config page of the satellite. Now I have to set up Rhasspy and the satellite correctly. How do I do that? I already followed the server-with-satellites manual. It is not doing anything so far and I have no indication of what is wrong.

Do you have some directions for me?

Romkabouter commented 3 years ago

The first thing to check is whether an audio stream is being sent. If you subscribe to hermes/audioServer//audioFrame on your broker, there should be a huge number of messages coming in. These are the small audio frames consumed by Rhasspy.

https://rhasspy.readthedocs.io/en/latest/reference/#audioserver_audioframe

chris-kuhr commented 3 years ago

Ok, so now I am here:

ETH Connected
ETH MAC: xx:xx:xx:xx:xx:xx, IPv4: 10.5.0.93, FULL_DUPLEX, 100Mbps
Enter WifiConnected
Connected to LAN with IP: 10.5.0.93, 
Enter MQTTDisconnected
Connecting MQTT: 10.5.0.87, 12183
Enter MQTTConnected
Connected as asr1
Enter Idle

I am connected to the internal MQTT broker, but I don't see any message on the audioFrame topic. Neither with the mosquitto_sub client nor in the packet capture of the rhasspy server.

So I suppose the transmitter task is not running?

Romkabouter commented 3 years ago

Sorry, I see the siteId is missing from the topic. In your case, the topic is hermes/audioServer/asr1/audioFrame, and you can also subscribe to hermes/audioServer/# just in case. You could also put a debug message in the task, but I am sure it is running :)
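
As an illustration of how the siteId fits into the topic and why the `#` subscription catches it, here is a small sketch. Both helper functions are hypothetical, and the matcher deliberately handles only filters ending in `/#`, not the full MQTT wildcard rules:

```cpp
#include <string>

// Build the Hermes audio frame topic for one satellite.
std::string audioFrameTopic(const std::string& siteId) {
    return "hermes/audioServer/" + siteId + "/audioFrame";
}

// Return true if `topic` matches an MQTT filter that ends in the
// multi-level wildcard "/#" (e.g. "hermes/audioServer/#").
// Only this one filter shape is handled; "+" wildcards are not.
bool matchesHashFilter(const std::string& filter, const std::string& topic) {
    if (filter.size() < 2 || filter.compare(filter.size() - 2, 2, "/#") != 0) {
        return filter == topic;  // no trailing wildcard: exact match only
    }
    const std::string prefix = filter.substr(0, filter.size() - 2);
    // "a/#" matches "a" itself and anything below "a/".
    return topic == prefix ||
           topic.compare(0, prefix.size() + 1, prefix + "/") == 0;
}
```

For the siteId asr1 this yields hermes/audioServer/asr1/audioFrame, which the filter hermes/audioServer/# matches.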

chris-kuhr commented 3 years ago

I used the following command:

mosquitto_sub -h 10.5.0.87 -p 12183 -t hermes/audioServer/asr1/audioFrame

I was just too lazy to write it out :-)

Romkabouter commented 3 years ago

What are your settings in the satellite? hotword should be remote.

You can search for if (config.hotword_detection == HW_REMOTE)

and put some printlines there to debug

chris-kuhr commented 3 years ago

Thank you for the advice!

Now I have it working. It got stuck in the i2s_read function. Well, at least I see audio data on the MQTT topic. But Rhasspy doesn't seem to receive it internally.

What do I have to configure to send a text to speech command from Rhasspy?

chris-kuhr commented 3 years ago

I just realized that the I2S MEMS mic I am using is no longer available. So, I have bought INMP441 mics, and since I use the same I2S amp, I will stop debugging this new device and stick to the existing device instead. However, I will integrate the Ethernet part and create a PR, once I have it running. It already builds successfully. Now, I have to wait for the INMP441 to arrive...

Romkabouter commented 3 years ago

Great to have ethernet functionality. Shall we close this issue?

chris-kuhr commented 3 years ago

I suggest we close this issue with the upcoming PR, so that we can link it here...

chris-kuhr commented 3 years ago

Ok. The INMP441 has arrived. I tried it and it works the same as with the previous MEMS mic. But I still don't know how to proceed with configuring Rhasspy <-> Satellite.

chris-kuhr commented 3 years ago

Created a PR

Romkabouter commented 3 years ago

> Ok. The INMP441 has arrived. I tried it and it works the same as with the previous MEMS mic. But I still don't know how to proceed with configuring Rhasspy <-> Satellite.

Can you post your Rhasspy settings? I will have a look at the PR, thanks for that!

chris-kuhr commented 3 years ago

[image]

[image]

and my settings.ini:

[General]
hostname=asr1
deployhost=10.5.0.3
siteId=asr1
;supported: M5ATOMECHO=0, MATRIXVOICE=1, AUDIOKIT=2, INMP441=3, INMP441MAX98357A=4
device_type=4
; supported ESP_OTHER=0, ESP32_POE_ISO=1
esp_type=1

[Wifi]
ssid=SSID
password=password

[OTA]
;supported: upload, ota, matrix
;-upload: device should be attached to computer via usb
;-ota: will use espota
;-matrix: matrix voice should be attached to a raspberry pi with matrix software.
;         deployhost should be set to ip of the pi
method=upload
password=OTApassword
port=3232

[MQTT]
hostname=10.5.0.87
port=12183
username=
password=

Romkabouter commented 3 years ago

Ok, Rhasspy settings:

In general: add asr1 in all "Satellite siteIds" boxes

chris-kuhr commented 3 years ago

Made those settings. But I still receive a timeout error in Firefox; the developer console shows the following details:

Source map error: Error: request failed with status 500
Resource URL: http://10.5.0.87:12101/js/popper.min.js

Romkabouter commented 3 years ago

A timeout in Rhasspy? That is not related to the streamer, so it might be better to post your problems with Rhasspy here: https://community.rhasspy.org/

chris-kuhr commented 3 years ago

I have now verified a working setup for the Rhasspy server, with my laptop running Rhasspy in satellite mode. It works with the following server settings; all siteIds are set:

[image: rhasppy_working]

Here are the satellite settings of my laptop:

[image: rhasppy_satellite]

However, if I try to use the ESP32 satellite, nothing happens. The timeout I mentioned above happens when I click the "Wake Up" button on the server. If I do not click it, nothing happens either.

Here is the serial output from the ESP32 from VSCode:

[image: vscode_serial]

The ESP32 is sending data via MQTT:

[image: Screenshot at 2021-05-24 09-36-40]

How can I verify that the server processes the input from the satellite?

Romkabouter commented 3 years ago

If the first screenshot is from your server, you must enable the wakeword and fill in the satellite siteIds:

[image]

The ESP32 does not do wakeword detection; it just sends audio to the server, so the server needs to handle the wakeword. On your laptop, the wakeword is handled by the laptop itself, which is why that works.

Also, set audio playing on the server to Hermes MQTT. The server then publishes the audio feedback to MQTT, which the ESP32 satellite will receive.

Have you already implemented some way of intent handling (like Home Assistant)?

chris-kuhr commented 3 years ago

Hi,

it is still not working. I used a wrong I2S config, which I've fixed now. I also made the config changes that you suggested. However, I am not getting any reaction from the base station. I am just trying to get audio in and audio out at this point. I have read your thread on the community forum. This post: https://community.rhasspy.org/t/rebranded-the-matrix-voice-to-esp32-rhasspy-satellite/2169/21 mentions log entries when the satellite connects to the MQTT broker. I don't see those. In fact, I don't see anything in there. I am running Rhasspy 2.5.10. Do you have any experience with this version?

Romkabouter commented 3 years ago

Yes, should work.

You get audioFrames on the broker, right? Try this file: https://github.com/Romkabouter/ESP32-Rhasspy-Satellite/blob/voco/record.py Set your connections and start it. It records a couple of seconds from the hermes/audioServer topic and saves it to a WAV file.

Check if you get good audio.
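
The script itself is not reproduced here. As an illustration of the last step only (saving raw PCM as a WAV file), the following sketch builds the canonical 44-byte header of an uncompressed PCM WAV file. The function names are hypothetical, and the parameters used in the note below (16 kHz, mono, 16 bits) are assumptions based on the 16 kHz rate mentioned elsewhere in this thread:

```cpp
#include <cstdint>
#include <vector>

// Append a little-endian integer of `bytes` bytes to `out`.
static void putLE(std::vector<uint8_t>& out, uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.push_back(static_cast<uint8_t>((value >> (8 * i)) & 0xff));
    }
}

// Append a four-character chunk tag such as "RIFF".
static void putTag(std::vector<uint8_t>& out, const char (&tag)[5]) {
    out.insert(out.end(), tag, tag + 4);
}

// Build the canonical 44-byte header of an uncompressed PCM WAV file.
std::vector<uint8_t> makeWavHeader(uint32_t sampleRate, uint16_t channels,
                                   uint16_t bitsPerSample, uint32_t dataBytes) {
    const uint16_t blockAlign =
        static_cast<uint16_t>(channels * bitsPerSample / 8);
    const uint32_t byteRate = sampleRate * blockAlign;
    std::vector<uint8_t> h;
    putTag(h, "RIFF");
    putLE(h, 36 + dataBytes, 4);  // size of everything after this field
    putTag(h, "WAVE");
    putTag(h, "fmt ");
    putLE(h, 16, 4);              // size of the fmt chunk for PCM
    putLE(h, 1, 2);               // audio format 1 = uncompressed PCM
    putLE(h, channels, 2);
    putLE(h, sampleRate, 4);
    putLE(h, byteRate, 4);
    putLE(h, blockAlign, 2);
    putLE(h, bitsPerSample, 2);
    putTag(h, "data");
    putLE(h, dataBytes, 4);       // number of PCM bytes that follow
    return h;
}
```

One second of 16 kHz mono 16-bit audio is 32000 bytes of PCM, so `makeWavHeader(16000, 1, 16, 32000)` followed by those bytes would form a playable file. If the header parameters do not match the data, the file still plays but at the wrong speed or as noise, which is one way such a recording can help narrow down a sample-format bug.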

Can you repost your server settings? Also, please include what is set in the Satellite siteIds fields.

chris-kuhr commented 3 years ago

Thank you! This recording tool was my missing link. Now I have found the bugs in my i2s read implementation. i2s write is still off though. I can now receive the audio in Rhasspy. It recognizes the keyword and also the command.

Thank you for your help!

Romkabouter commented 3 years ago

Let me know if I can close this issue or if you need more assistance :)

chris-kuhr commented 3 years ago

Well, I still have the issue with the bad playback sound. I used your Python script to record 'hermes/audioServer/asr1/playBytes/#'. The beep waveforms I record sound very distorted on my laptop (beyond recognition), but they sound the same as from the speaker connected to the MAX98357 I2S amp, apart from some high frequency ringing noise.

First question, do you have any idea what might cause the high frequency problems?

I have used my laptop with a docker installation and the ESP32 as satellites in parallel. The laptop receives clean sound, while the ESP32 sound is distorted. The laptop plays the sound much faster.

So, this might be some problem related to the sample rate. I tried recording on my laptop with different ones (16kHz, 22.5kHz, 32kHz, 44.1kHz, 48kHz and even 96kHz), but they all have the same distortion, only differently pitched.
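
For reference, a pure sample-rate mismatch only rescales time and pitch: the same samples played at a different rate change in duration and in frequency by the ratio of the two rates. A small sketch of that arithmetic, with hypothetical function names:

```cpp
// Duration in seconds of `sampleCount` mono samples played at `playbackHz`.
double playbackSeconds(unsigned long sampleCount, double playbackHz) {
    return sampleCount / playbackHz;
}

// Factor by which every frequency in the signal is scaled when audio
// recorded at `recordedHz` is played back at `playbackHz`.
double pitchFactor(double recordedHz, double playbackHz) {
    return playbackHz / recordedHz;
}
```

For example, 16000 samples last 1 s at 16 kHz but 0.5 s at 32 kHz, with every frequency doubled. Distortion that stays the same apart from pitch across all playback rates is therefore consistent with a problem in the data itself rather than in the chosen rate.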

Is there any configuration (mismatch?) between core and satellite, that might affect this?

Apart from that, the ESP32-PoE-Iso is working now.

Romkabouter commented 3 years ago

If the sound is distorted when you record it from hermes/audioServer/asr1/playBytes/#, then there is an issue with the sound itself. The playBytes topic is for playing audio, not recording, so I am a bit confused about what you are trying and what you are expecting.

Which high frequency problems are you referring to? Playback on esp32 is distorted when

Basically what happens is: I use a (fairly large) ring buffer, but the async MQTT client pushes data into the buffer faster than the audio can play it. I thought I had that covered by adding a delay while the buffer was full, but this causes the distortion, because the async client does its thing regardless.

You can read about that here: https://github.com/Romkabouter/ESP32-Rhasspy-Satellite/issues/53#issuecomment-855282411

I have not yet found a solution
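
To make the failure mode described above concrete, here is a minimal sketch of a fixed-size ring buffer whose producer can outpace its consumer. This is not the project's buffer and not a fix; the class and its behaviour (accepting only the bytes that fit and reporting the count) are assumptions for illustration:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed-capacity byte ring buffer. write() never blocks: it stores as many
// bytes as fit and returns that count, so the caller can see what was dropped.
class RingBuffer {
public:
    explicit RingBuffer(std::size_t capacity) : buf_(capacity) {}

    std::size_t write(const uint8_t* data, std::size_t len) {
        std::size_t written = 0;
        while (written < len && size_ < buf_.size()) {
            buf_[(head_ + size_) % buf_.size()] = data[written++];
            ++size_;
        }
        return written;
    }

    std::size_t read(uint8_t* out, std::size_t len) {
        std::size_t n = 0;
        while (n < len && size_ > 0) {
            out[n++] = buf_[head_];
            head_ = (head_ + 1) % buf_.size();
            --size_;
        }
        return n;
    }

    std::size_t size() const { return size_; }

private:
    std::vector<uint8_t> buf_;
    std::size_t head_ = 0;  // index of the oldest byte
    std::size_t size_ = 0;  // number of bytes currently stored
};
```

If a producer delivers 6 bytes into a 4-byte buffer, write() returns 4 and the remaining 2 bytes must be retried later or are lost. With a callback-driven producer that cannot be paused, the second case would show up as gaps in the played audio.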

chris-kuhr commented 3 years ago

When I record the topic, I expect that I listen to the feedback sounds, that the rhasspy server sends to the satellites. Is this correct? If so, I can conclude that the i2s part on my ESP is working, since laptop recording and esp satellites play the same, but distorted sound. This also implies, that the audio coming from the server is misbehaving.

I run the ESP at 16kHz input and output. I just experimented with sample rates with the record script on my laptop. The noise I hear is high frequency ringing, nothing harmonic, not stationary. Maybe the amp draws too much power and saturates very quickly? I have to test that.

chris-kuhr commented 3 years ago

No saturation problem. I hooked up an additional battery. Still the same. Maybe an aliasing effect...

Romkabouter commented 3 years ago

> When I record the topic, I expect that I listen to the feedback sounds, that the rhasspy server sends to the satellites. Is this correct?

Yes, it is.

> If so, I can conclude that the i2s part on my ESP is working, since laptop recording and esp satellites play the same, but distorted sound.

Yes, but be sure to test with a samplerate 22050 (or less) sound with a length of a small sentence (like 1 or 2 seconds). Then there should be no distortion on the audio on the esp32.

> The noise I hear is high frequency ringing, nothing harmonic, not stationary. Maybe the amp draws too much power and saturates very quickly?

That is strange, do you hear that only on audio playback or the whole time?

chris-kuhr commented 3 years ago

> If so, I can conclude that the i2s part on my ESP is working, since laptop recording and esp satellites play the same, but distorted sound.

> Yes, but be sure to test with a samplerate 22050 (or less) sound with a length of a small sentence (like 1 or 2 seconds). Then there should be no distortion on the audio on the esp32.

I tested only with the standard beep feedback sounds at 16kHz. They are shorter than 1 second.

> The noise I hear is high frequency ringing, nothing harmonic, not stationary. Maybe the amp draws too much power and saturates very quickly?

> That is strange, do you hear that only on audio playback or the whole time?

Only on playback. However, sometimes it seems to get caught in a feedback loop and stays present. It sounds like very high (8kHz or so) sine waves. Squeaky...

I will try an RC low pass filter between amp and speaker. I think this could be aliasing noise.
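
For sizing such a filter, the cutoff of a first-order RC low-pass is f_c = 1 / (2 * pi * R * C). A small sketch of that calculation follows; the function name is hypothetical and the component values in the note are purely illustrative, not a recommendation for this amplifier:

```cpp
// Cutoff (-3 dB) frequency in Hz of a first-order RC low-pass filter.
double rcCutoffHz(double resistanceOhm, double capacitanceFarad) {
    const double pi = 3.14159265358979323846;
    return 1.0 / (2.0 * pi * resistanceOhm * capacitanceFarad);
}
```

For example, R = 1 kOhm and C = 100 nF give a cutoff of roughly 1.59 kHz. At a 16 kHz sample rate the Nyquist frequency is 8 kHz, which is in the region of the tones described above.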

Romkabouter commented 3 years ago

Please check https://github.com/Romkabouter/ESP32-Rhasspy-Satellite/releases/tag/v7.6

Audio playback is most probably fixed, reopen if you feel like it :)