freeDSP / freeDSP-aurora

freeDSP ADAU1452 with 8 analog inputs, 8 analog outputs, S/P-DIF I/O, ADAT I/O, USB Audio Class 2, WiFi, Bluetooth
Creative Commons Attribution Share Alike 4.0 International

Sound card controller / IC crashes if switched on for over 36 hours. #123

Closed exislow closed 1 year ago

exislow commented 2 years ago

I am running the latest v2.2.2 firmware on the v1 aurora hardware. My Raspberry Pi 3b is attached via USB to the aurora DSP board and uses it as a soundcard to stream music to it.

Every time the aurora has been running for 36 - 48 hours or more, my Raspberry Pi is no longer able to stream audio to the aurora as a soundcard. I assume the IC that provides the sound card functionality crashes or something. Rebooting the Raspberry Pi does not help either. Only if I switch the aurora off and on again is my Raspberry Pi able to stream music.

Is this an already known problem? Does anybody have a better solution than rebooting the aurora every time this happens?

My Raspberry Pi is currently running this distro: https://www.picoreplayer.org/ It is like any other music player distro: it simply runs a squeezelite instance to handle music input and uses ALSA to output / stream the audio.

exislow commented 2 years ago

@dspverden any comment is appreciated.

dspverden commented 2 years ago

Oh sorry, I thought I had replied already. Maybe on another forum? Anyway, there are too many degrees of freedom. I would do this: take the aurora board, connect the UAC2 input to a PC and play some music. Then let it run for 48 hours. If that succeeds, you know that the problem is the USB stack of the RasPi; if it fails, you know that the XMOS framework has a problem.

exislow commented 2 years ago

I have not reported this issue anywhere else. Maybe you are confusing me with someone else. But maybe this also shows that the issue is real.

Thank you for your advice. I did something similar, since moving my aurora is difficult: it is kind of buried in my hifi tower.

  1. I let the Raspberry Pi stream music to the aurora (connected via USB using the soundcard functionality) until no sound was forwarded from the aurora to the amps.
  2. Then I restarted the streaming daemon (squeezelite) of my Raspberry Pi installation. Still no sound.
  3. I disconnected the USB link to the aurora from the Raspberry Pi, connected it again and restarted squeezelite. Still no sound.
  4. I rebooted the aurora board and restarted the squeezelite daemon. Et voila, sound is playing.

Do you agree that this seems to be an XMOS framework issue?

dspverden commented 2 years ago

Well, you never know. It could be that the RasPi sends a USB packet that makes the XMOS hang. Therefore, try it with another platform. If it fails there as well, we can be sure that it is something like a counter overflow. Two questions: 1) Did you try to reboot the RasPi instead of Aurora? 2) When USB audio is not playing anymore, does Aurora still play via other inputs (e.g. analog)?

exislow commented 2 years ago

Let me first answer your questions:

  1. I did. No effect, still no sound playing.
  2. It does. I can select other presets, where I have some analog sources as input and it forwards the audio stream to my amps.

You are right about the USB. I will try it with my MacBook overnight.

dspverden commented 1 year ago

Any news on this? What was the result with the MacBook over night?

exislow commented 1 year ago

Not yet. I had some issues with the hibernate mode, business trips etc. But I do plan to have reliable results by the end of the week.

exislow commented 1 year ago

@dspverden you were probably right... I let my MacBook Pro stream music for 36+ hours using the aurora as a soundcard via USB. No crash happened. The MacBook Pro was still able to stream music, which was forwarded by the aurora to my amps.

This is probably related to the RPi USB stack or something. How can we analyze this problem further? It would also be beneficial to implement something like a "crash protection" in the aurora code base. Do you think something like this is possible to overcome this issue for all the RaspberryPi users?

archi commented 1 year ago

You could try sniffing the USB traffic with tcpdump/wireshark to figure out where things go awry: https://technolinchpin.wordpress.com/2015/10/23/usb-bus-sniffers-for-linux-system/ Though 36-48h of music streaming might be a lot of raw USB traffic, and I'd be surprised if the Pi decides "now I'll send a malformed packet at random".

USB 2.0 itself doesn't seem to have sequence numbers, but maybe UAC 2.0 does (could not find info on the packet layout). Maybe these overflow after x hours. Maybe the Linux UAC stack then does something the XMOS doesn't expect (start at the wrong value, duplicates a sequence number,...) and locks up. A 32 bit signed(!) counter incrementing from 0 to INT32_MAX at 16kHz overflows after 37.28 hours (2^31/3600/16000), allows for misinterpretation (maybe one side thinks it's unsigned; or the other way around), different behavior (continue at 0 or INT32_MIN) and only happens after a relatively long time with a firmware that cares for 100% correct sequence numbers. Could be a coincidence, but it's well within your 36-48 hours.

A quick Google search suggests UAC 2.0 bursts audio data to the XMOS in isochronous mode at some clock (XMOS gives an example of 8000 Hz for 96kHz/24bit audio). The 8kHz example fits up to 10 channels at 192/32; in which case the overflow would happen after ~74.6 hours (signed 32 bit) or ~150h (unsigned 32bit) when doing output only. However, when doing input/output twice the amount of data has to be transferred (-> 16kHz!), and/or the UAC stack on the host might pick a higher clock for less latency (wasting some USB bandwidth).

I'm using the Aurora in my WFH setup, so I can't run long-term experiments, but maybe on the weekend I can set up my Linux PC to play some nonstop music and see if this also happens on a recent kernel on x86_64.
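The overflow estimates above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes a hypothetical 32-bit counter that is incremented once per packet, which is a guess from this thread and not something confirmed for UAC 2.0 or the XMOS firmware. The function name `hours_until_overflow` is made up for illustration.

```python
def hours_until_overflow(bits: int, rate_hz: float, signed: bool = True) -> float:
    """Hours until a counter incremented rate_hz times per second wraps around.

    Assumption: the counter starts at 0 and counts up by one per packet.
    """
    # A signed counter wraps after 2^(bits-1) increments, an unsigned one after 2^bits.
    max_count = 2 ** (bits - 1 if signed else bits)
    return max_count / rate_hz / 3600


# Compare the 8 kHz and 16 kHz packet rates discussed above.
for rate_hz in (8_000, 16_000):
    for signed in (True, False):
        kind = "signed" if signed else "unsigned"
        hours = hours_until_overflow(32, rate_hz, signed)
        print(f"{rate_hz} Hz, {kind} 32-bit: {hours:.2f} h")
```

Only the signed 16 kHz case falls inside the reported 36-48 hour window; the other combinations would take roughly 75 hours or longer.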

dspverden commented 1 year ago

> This is probably related to the RPi USB stack or something. How can we analyze this problem further? It would also be beneficial to implement something like a "crash protection" in the aurora code base. Do you think something like this is possible to overcome this issue for all the RaspberryPi users?

Well, I would prefer to fix the root of the problem instead of working on the symptoms. ;-) I think doing that on the Aurora might be difficult. You could build in something like a watchdog inside the XMOS that reboots/resets the XMOS in case of an error caused by the RasPi.

exislow commented 1 year ago

So there must be something in the Raspberry Pi USB stack, or in one of the drivers used by the Pi Core Player distro (https://www.picoreplayer.org/), which makes the XMOS crash within a short time frame. I basically never turn off my Aurora. I just mute it when I do not use it. Thus, my Raspberry Pi continues to stream whatever internet radio station is currently playing.

Streaming with my MacBook never caused such crashes.

@dspverden: I totally agree with you! Problems should be fixed at the root cause. At the moment I cannot really tell whether this is just a Pi Core Player distro problem, a Raspbian problem, or maybe even something inside the Raspberry Pi USB driver. Thus, while the root cause is not known, I would vote a) for a watchdog solution in XMOS (bonus: it improves fail safety / fault tolerance within the Aurora) and b) for doing more investigation as @archi suggested.

@archi: I definitely need to get more background knowledge regarding USB to be able to further analyze any wireshark logs.

dspverden commented 1 year ago

Well I think the next step would be to go back to a standard RasPi distribution to find out if the problem is caused by Pi Core player or the USB stack itself.

Btw. Do you need 8 channels or just stereo? If stereo you could go via I2S of RasPi as a workaround.

exislow commented 1 year ago

I will install a stock Raspbian image on my Raspberry Pi and let it stream audio for a couple of days to verify this.

I just need stereo input and 6 channel output. I need to hardwire the Raspberry Pi's I2S pins to the Aurora's I2S (X102) on the PCB, correct? What Raspbian dt-overlay would you recommend to use then?

dspverden commented 1 year ago

I am not so familiar with the RasPi. I wanted to build a media server with RasPi+Aurora, but first the RasPi wasn't available, and if I look at the prices now, I think RasPi is not the route to go. Therefore, I have no idea of the internals of the RasPi. Sorry. But yes, connect the I2S as you said. You may need to configure the I2S receiver in SigmaStudio for it and recompile the project. Perhaps we can do that from the outside by writing to some register. In a few days I will hopefully get an Arylic streamer from somebody. Once this device is here, I can help with that.

exislow commented 1 year ago

I really appreciate your effort! Thank you very much in advance. Unfortunately I do not have a spare Raspberry Pi left, otherwise I could send it to you. But anyway, maybe the Arylic streamer can help as well.

exislow commented 1 year ago

I have installed HiFiBerryOS (hifiberry.com), which is basically a stripped-down Raspbian OS optimized for music playback. Same issues as before: after 12-24 hours the USB sound within the Aurora crashes. The Aurora needs to be restarted (mains off & on) to be able to stream music using the USB sound card function. Since it now happens on more than one distro, I do not think that this is a distro-related bug, but rather something Raspberry Pi related (Raspbian OS, USB driver, ALSA etc.). What do you think?

dspverden commented 1 year ago

Looks like it. I don't know how RasPi distros are built, but on Intel/AMD machines Linux distributions optimized for audio (sometimes) come with a specially compiled RT kernel. Might this be an issue on RasPi as well? Do you have a Linux PC at home? You could use that to see whether the issue is Linux related. My Linux PC is currently broken and I have had no time yet to replace the broken hardware. :(

exislow commented 1 year ago

The latest HiFiBerryOS uses this kernel:

# uname -a
Linux hifiberry 5.15.56-v7 #1 SMP Thu Aug 25 10:09:45 UTC 2022 armv7l GNU/Linux

It does not look like an RT / preempt kernel. I have no spare Linux PC at home. I guess installing a Linux virtual machine on my macOS VMware client wouldn't deliver comparable results, would it? Otherwise I will try to organize a spare Intel PC and install Debian on it, since Raspbian is based on Debian.
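The same check can be scripted. The sketch below assumes that preemptible and realtime kernels advertise themselves in the build string reported by `uname -v` (typically as `PREEMPT`, `PREEMPT_RT` or, with older RT patch sets, `PREEMPT RT`); kernels that do not follow this convention would be misclassified. The function name `kernel_preemption` is made up for illustration.

```python
def kernel_preemption(build_string: str) -> str:
    """Classify a kernel build string as 'rt', 'preempt' or 'standard'.

    Assumption: the preemption model shows up as a marker in the string.
    """
    text = build_string.upper()
    # Check the realtime markers first, because they also contain "PREEMPT".
    if "PREEMPT_RT" in text or "PREEMPT RT" in text:
        return "rt"
    if "PREEMPT" in text:
        return "preempt"
    return "standard"


# Build string taken from the uname output above.
print(kernel_preemption("#1 SMP Thu Aug 25 10:09:45 UTC 2022"))
```

On a running system the build string is also available from Python as `platform.version()`.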

dspverden commented 1 year ago

I have good experiences with Linux on macOS via VirtualBox, but only on Intel machines. Since I switched to Apple Silicon I haven't installed a Linux yet.

exislow commented 1 year ago

So I got myself a Lenovo ThinkPad T470s and booted a live image of the current Debian 11 with XFCE desktop. I connected the Aurora to the laptop and streamed music to it for 48 hours, and it still works!

So this suggests the crash is related to the Raspberry Pi, ARM or Raspbian? I don't know. How should we proceed?

dspverden commented 1 year ago

We can conclude from this that it is not related to Aurora itself. I think debugging the RasPi UAC2 stack is beyond the scope of the Aurora community. Perhaps you can ask in the RasPi community regarding this? Regarding a workaround: this might be difficult, because I have no idea how to detect inside the XMOS code that the USB crashed. At least it is nothing that can be implemented quick and dirty. Perhaps this: return the audio to the RasPi and let the RasPi check, while it is playing music, that there is a signal on the return path. The return path could be implemented via USB, too. Once the RasPi no longer detects an input signal, it restarts the Aurora?
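The return-path idea could be sketched on the RasPi side roughly as follows. This is only an illustration under stated assumptions: the class `ReturnPathWatchdog`, its parameters and the `on_stall` callback are hypothetical, the samples are assumed to arrive as floats in the range -1.0 to 1.0, and how the capture data is read and how the Aurora is actually restarted (for example via a switched power outlet) is left open.

```python
import math
from typing import Callable, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root mean square level of one window of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class ReturnPathWatchdog:
    """Calls on_stall after several consecutive silent windows during playback."""

    def __init__(self, on_stall: Callable[[], None],
                 threshold: float = 1e-4, max_silent_windows: int = 5) -> None:
        self.on_stall = on_stall              # hypothetical restart hook
        self.threshold = threshold            # RMS level treated as "no signal"
        self.max_silent_windows = max_silent_windows
        self.silent_windows = 0

    def feed(self, samples: Sequence[float], playing: bool) -> bool:
        """Process one capture window; return True if a restart was triggered."""
        # Silence is only suspicious while the player is actually sending audio.
        if not playing or rms(samples) >= self.threshold:
            self.silent_windows = 0
            return False
        self.silent_windows += 1
        if self.silent_windows >= self.max_silent_windows:
            self.silent_windows = 0
            self.on_stall()
            return True
        return False
```

Requiring several silent windows in a row is meant to avoid restarts during quiet passages or track gaps; the window length and count would have to be tuned to the material being played.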

exislow commented 1 year ago

Thank you very much for your support. I have created a post in the Raspberry Pi forum: https://forums.raspberrypi.com/viewtopic.php?t=341609

Do you have some sources for me regarding XMOS programming? I think, this would be the preferable way for me besides finding a solution for the root cause within the Raspberry Pi UAC2 stack.

You are free to close this issue.

dspverden commented 1 year ago

Actually, XMOS provides a lot of documentation on their website. You will find a detailed description of the UAC2 framework there, too.