respeaker / seeed-voicecard

2 Mic Hat, 4 Mic Array, 6-Mic Circular Array Kit, and 4-Mic Linear Array Kit for Raspberry Pi
GNU General Public License v3.0

kernel panic - Spinlock issue / "BUG: scheduling while atomic" #251

Open HinTak opened 3 years ago

HinTak commented 3 years ago

First reported in https://github.com/respeaker/seeed-voicecard/issues/246#issuecomment-687611280 , and not tackled in #249 .

Pillar1989 commented 3 years ago

@HinTak This does sacrifice some performance, but there is no better way. Do you have any suggestions?

HinTak commented 3 years ago

@Pillar1989 the injector octo driver has an option to run the clock continuously (except in power-saving mode under the kernel's power management), rather than starting and stopping it with each stream. This might work around the spinlock issue, and has the added advantage of fixing channel sync (where the 8 channels shift by two). The disadvantage is possibly higher power consumption.

Anyway, that code just looks wrong - read chapter 5 of the Linux Device Drivers book. It is freely available online.

cc @j1nx

HinTak commented 3 years ago

@Pillar1989 it is also NOT merely a performance issue. Every time the kernel's exception mechanism is triggered, there is no guarantee that the internal state of the driver is consistent.

turmary commented 3 years ago

@HinTak the injector octo sound card has 4 GPIOs that control the sample rate and also the clock start (gpios != 0b0000) / stop (gpios = 0b0000). I think it will also lose the channel order sometimes, even with the non_stop_clocks option set to 1.

We do not have the same hardware design, so we have to start/stop the clock in the XXX_trigger function via I2C access to keep the channel order in sync. I know that I2C access is too long a path there, and that it causes the "BUG: scheduling while atomic" problem.

Do you have any idea about dealing with these limitations?

HinTak commented 3 years ago

@turmary - I think the respeaker driver may be starting/stopping the clock too often. The "BUG: scheduling while atomic" is partly due to that, so changing it to work more like the Octo card does can help. There is also the question of why the respeaker driver needs to hammer on the I2S so often, and then has usleep() in a few places to slow itself down! (Instead, the driver could cache responses from the hardware and shield the hardware from too-frequent access from alsa, without using usleep().)
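The caching idea in the comment above can be illustrated in user space. This is only a sketch of the pattern, not the respeaker driver's code: `CachedRegisterReader`, `read_fn`, and `max_age_s` are hypothetical names, and a real kernel driver would more likely rely on an existing facility such as regmap's register cache than hand-roll this.

```python
import time


class CachedRegisterReader:
    """Hypothetical wrapper that remembers recent register reads.

    read_fn stands in for a slow bus access (e.g. an I2C read).
    Repeated reads of the same register within max_age_s seconds are
    served from memory, so the hardware is not hit again and no sleep
    is needed to throttle callers.
    """

    def __init__(self, read_fn, max_age_s=0.05, clock=time.monotonic):
        self._read_fn = read_fn
        self._max_age_s = max_age_s
        self._clock = clock
        self._cache = {}  # register address -> (timestamp, value)

    def read(self, reg):
        now = self._clock()
        hit = self._cache.get(reg)
        if hit is not None and now - hit[0] <= self._max_age_s:
            return hit[1]  # fresh enough: skip the bus access
        value = self._read_fn(reg)
        self._cache[reg] = (now, value)
        return value

    def invalidate(self, reg=None):
        """Drop one cached register, or everything (e.g. after a write)."""
        if reg is None:
            self._cache.clear()
        else:
            self._cache.pop(reg, None)
```

The design point is that the slow access happens at most once per register per time window, and a write path would call `invalidate()` so stale values are never returned.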

Yes, that octo card option is mainly for channel +2/-2 shifts, which the respeaker also suffers from, I think, in one of the closed-without-resolving bugs.

HinTak commented 3 years ago

Just remembered this: https://github.com/raspberrypi/linux/issues/3580 - likely related, as both concern scheduling.

HinTak commented 3 years ago

This seems to make the mic-array quite unusable - I can get my Pi (headless Ubuntu 20.04.1) to crash by ssh'ing into it. I guess the sshd daemon is sufficiently high-priority that it definitely causes context switches, so whenever I ssh into the Pi, it immediately dumps a whole lot of critical kernel logs to my console window and crashes.

Pillar1989 commented 3 years ago

@HinTak We hope to solve this problem internally, but fixing it introduces other problems. In addition, we have learned from the supplier that the AC108 is also facing EOL. We will focus more on the V2 version, and we will consider using TI's multi-channel audio ADC. I hope to have a perfect, or at least satisfying, solution in V2.

Pillar1989 commented 3 years ago

In addition, I also hope you can make suggestions on the chip selection for the new design. The V2 version will be completely open-source in both hardware and software. @HinTak I hope that the design of the V2 version will be a community choice, with Seeed just helping to make it happen.

HinTak commented 3 years ago

@Pillar1989 FWIW, if you can't/won't fix problems with V1, then from the customers' point of view it is hard to justify buying v2. To put it bluntly: I may not even want to spend time on it, even if you send v2 to me free!

Pillar1989 commented 3 years ago

We will definitely fix this before V2 is released, because V2 has this problem too: our hardware will have it whenever we use 8-channel PCM signals. @HinTak

HinTak commented 3 years ago

@Pillar1989 by the way, what's your estimate of the timescale for arrival of v2 (prototype or shipped product)? 3 months, 6 months, 1 year, 2 years?

HinTak commented 3 years ago

Thanks to one of you side-tracking me onto berryboot, I managed to track down the specific kernel config which interacts badly with the respeaker driver to cause kernel panics and crashes. The bad news is that, unfortunately, it is the default on Ubuntu (both 32-bit and 64-bit) and on 64-bit Raspbian. So, out of the 4 combinations of ubuntu / raspbian x 32-bit / 64-bit, the only way to avoid the crash without using a non-distro kernel is to stay with raspbian 32-bit. cc @j1nx @younes-professor @Daenara @tomh05 @joshuajaharwood @h4de5 @faaafo @Tom-Lu @lxne

On the other hand, I really want to have a non-crashing ubuntu 64-bit instance, so I made a largely-compatible set of ubuntu kernel packages for the most recent ubuntu focal (20.04); instructions and downloads are at https://github.com/HinTak/RaspberryPi-Dev/releases/tag/Ubuntu-raspi-5.4.0-1028.31 . I am likely to upgrade to Ubuntu groovy (20.10) 64-bit soon, so a set of Ubuntu groovy 64-bit kernel packages will likely be made at some point, but I am only going to keep two SD cards ("current" ubuntu 64-bit and raspbian 32-bit). Those of you who want to use ubuntu 32-bit or raspbian 64-bit without being crashed by the seeed respeaker driver are a bit out of luck. I could be persuaded to do a set of ubuntu 32-bit kernel packages, especially if you click donate at https://hintak.github.io ... but even the raspbian people admit that trying to build the raspbian kernel package their way (with small tweaks) is hard, so I'd need a lot of persuasion to attempt to build "largely compatible" 64-bit raspbian kernel packages...

@turmary @Pillar1989 How are Seeed Studio staff getting on with fixing this bug, and with the V2 hardware and driver? This basically confirms that the respeaker driver causes crashes on everything except raspbian 32-bit. That's quite limiting, especially given that people want to use the more recent / more powerful Pis in 64-bit mode...

HinTak commented 3 years ago

The earlier tagged release was just the previous attempt - https://github.com/HinTak/RaspberryPi-Dev/tree/Ubuntu-raspi-5.4.0-1026.29 - enough time had passed between my attempts that the Ubuntu people had released another kernel. The first one took over 5 hours of build time with two interruptions; the 1028.31 build ran continuously for over 4 hours. So I'm not likely to do it too often (except a one-off after I upgrade to groovy).

Daenara commented 3 years ago

I still have trouble with my respeaker4 even on 32-bit default raspbian. Right now I can use one of 4 channels reliably; the rest produce various kinds of noise. I know that it still works with the really old kernel version the official drivers used to enforce, so something is still wrong with the respeaker4 drivers. Right now I am waiting on my new esp32 audio development board to get some kind of mic working for my project, because at the moment the respeaker4 is barely usable.

HinTak commented 3 years ago

@Daenara That's a strange problem - are you sure it is not faulty circuitry/electronics? The 7th/8th channels on my 6-mics don't come back as entirely zeros either, but it is more like a very low level of white noise. I am supposed to have a 2-mics device and a 4-mics device any time soon - got a gift voucher from Seeed Studio, enough to order those two months ago. (They could have just sent them to me free, I guess, which would be faster...)

Daenara commented 3 years ago

It works whenever I use the old driver with the kernel downgrade, so I do not think it is a hardware issue. When I first started posting here about it, my father and I even checked for hardware issues by measuring every open contact we could find. We couldn't get everywhere because of the way it is built, but we could not detect anything wrong in what we could measure.

HinTak commented 3 years ago

Which channel gives you useful output, and what sort of noise do you get on the other channels?

Daenara commented 3 years ago

Useful output is on the first channel, everything else just sounds terrible. I do have a sample with all kinds of noises in it, though that was before I updated to your newest drivers, where not even the first channel was usable. It seems that if I use more than one channel, not even the first one is much use, so I just turned down all but the first with alsamixer. (Attachment: audio_sample.zip) The sample is in a zip because wav isn't on the supported upload list and I didn't want to try converting it to mp4 or mov, because I have no idea what that does to my 4 separate channels.

My English is not good enough to accurately describe the noise, so I thought sending over a sample is the best I can do.
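One rough way to put numbers on "which channel is usable" is to compare per-channel levels in the recording. The sketch below is an illustration only: it assumes a 16-bit PCM WAV with interleaved channels (the actual sample format of the attached recording is not stated in this thread), and `channel_rms` / `wav_channel_rms` are hypothetical helper names. An RMS level cannot tell speech from noise by itself; it only shows which channels are silent, quiet, or unexpectedly loud.

```python
import array
import math
import sys
import wave


def channel_rms(frames_bytes, n_channels):
    """Return the RMS level of each channel.

    frames_bytes: interleaved signed 16-bit little-endian PCM samples.
    """
    samples = array.array("h")
    samples.frombytes(frames_bytes)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    levels = []
    for ch in range(n_channels):
        chan = samples[ch::n_channels]  # every n-th sample is this channel
        if len(chan) == 0:
            levels.append(0.0)
        else:
            levels.append(math.sqrt(sum(s * s for s in chan) / len(chan)))
    return levels


def wav_channel_rms(path):
    """Read a whole WAV file and return per-channel RMS (16-bit PCM only)."""
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        return channel_rms(w.readframes(w.getnframes()), w.getnchannels())
```

For example, `wav_channel_rms("audio_sample.wav")` (a hypothetical file name) would return a list with one level per channel, which can be quoted in a report alongside the recording.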

HinTak commented 3 years ago

@Daenara I listened to each of the channels - that's so broken. As I mentioned, I should have had one of each of the 2-mics and the 4-mics arriving about 3 weeks ago (it was supposed to have arrived first week of Jan...). There is not much I can do without the actual device... I am fairly sure --compat-kernel has stopped working for some time already, and in general, I would prefer to stay up to date, and recommend the same for others, and wish seeed studio staff makes the effort to keep things up to date... But I could talk you through downgrading the kernel manually and get the older kernel working etc if you like. Write to my email address at the top of git log (inside the repo) if you want some help on that.

Daenara commented 3 years ago

Since just using one channel works for now, I don't think I need to downgrade, I only need one channel for voice recognition anyway. I know you can't do much without the device, I don't expect you to. I just wanted it noted here that there definitely are still issues on 32 bit, not only on 64. Just having the mic running at least somewhat helps me with my tinkering and I am looking for an alternative, at least until this mess of a driver is fixed. (seeed really should get working on it, there is already a decently sized community of a voice assistant that try to tell ppl to not get anything but the 2-mic version because of driver issues, with that kind of a reputation going on for that long, they won't sell many new products)

HinTak commented 3 years ago

Downgrading is not hard or time-consuming, nor is preparing a fresh raspbian install that has everything up to date except the kernel; I just prefer not to document the process in the open, as it shouldn't be encouraged.

I already wrote above that, seeing how it is, I don't particularly want to look at v2 product, even if it is free. Tinkering with not-working-correctly software/hardware is only rewarding up to a point - or rather, not. Especially when documentation is not available. I refuse to register myself at their supplier's web site to obtain the doc myself.

Daenara commented 3 years ago

It might not be hard or time consuming, but I am pretty scared of breaking something on my testing system which is also my productive system (I use it to switch my wifi sockets via rf remote, since they don't have a remote and wrecking the system basically leaves me without an easy way to turn off and on my lights). I don't have a good track record with linux, being a windows user normally, and whenever I try something a bit more intrusive than just installing software or updating, I break it. Actually, I managed to fry a pi installation once by installing a package, compiled for exactly that version, via the official repository. So I feel safer just settling for my system that works right now until my esp32 audio hardware arrives and I can safely stream the audio from it to my pi for my voice assistant.

HinTak commented 3 years ago

@Daenara okay... I have had about 25 years with linux on the desktop/laptop, so mostly pi is just smaller and different hardware for me; if I make a mistake or otherwise want to try dangerous things, I just take the sd card out, insert it into my laptop, look at the logs, make some change and put it back into the pi. I can imagine things are a bit different if you don't/can't fix things by playing with the sd card's content elsewhere. But if you do want to downgrade to have a more functional system, let me know. Hopefully the new card arrives soon and/or seeed staffers shape up...

HinTak commented 3 years ago

@Daenara forgot to ask: while you are recording (I mean comparing before and after), is there anything unusual in dmesg? You likely have a few "i2s errors" too. I am also interested in how often they happen, so the timestamps in front of them are useful too, say, while you are recording for perhaps 30s / a minute.
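Collecting the timestamps asked for above can be scripted. This is a minimal sketch that assumes the usual `[seconds.micros] message` prefix of dmesg output; the exact wording of the driver's error message is not quoted in this thread, so the default search string `"i2s"` and the helper names `error_timestamps` and `gaps` are assumptions to adjust.

```python
import re

# Matches the default dmesg prefix, e.g. "[  123.456789] some message"
_TS_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def error_timestamps(dmesg_text, needle="i2s"):
    """Return the timestamps (seconds since boot) of lines containing needle."""
    stamps = []
    for line in dmesg_text.splitlines():
        m = _TS_LINE.match(line)
        if m and needle.lower() in m.group(2).lower():
            stamps.append(float(m.group(1)))
    return stamps


def gaps(stamps):
    """Return the time in seconds between consecutive errors."""
    return [round(b - a, 6) for a, b in zip(stamps, stamps[1:])]
```

Saving the output of `dmesg` to a file right after a recording and passing its contents to `error_timestamps` gives both the count and the spacing of the errors, which is what distinguishes "one error per recording" from "bursts of 5-6".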

Daenara commented 3 years ago

@HinTak I just got around to testing this evening. With only one channel working I got exactly 1 sync error per recording. I did 3 recordings of roughly 15 min each (needed to record stuff for voice ai training anyway) and one of roughly a minute. When the error happens seems to be random: for one file it was at the start, for another it was shortly before I stopped the recording, and the short one had it in the middle. Here is that part of the log: (screenshot attached)

I did not test with all channels working, if you think that can help debug then I can do that this weekend also.

HinTak commented 3 years ago

@Daenara thanks. That's interesting - on the 6-mics, I get a lot more i2s errors, and they always come in groups of 5-6, not just from recording but also from running aplay -l/arecord -l to query the devices. I don't think a new recording with all channels is needed - you only turned down the volume, right? It is curious, as the data is just a packed group of 4 (i.e. you don't have access to individual mics).

My order of a 2-mics and 4-mics was placed just over 2 months ago... Until/unless it arrives, I wonder if I can use just half of the 6-mics device with the 4-mics driver? Cc @turmary @Pillar1989 regarding the order and the possibility of driving half of the 6-mics device with the 4-mics driver.

Daenara commented 3 years ago

Yes, I just turned the volume down. But even one channel still seems to be hit and miss: I just trained a new model, wanted to test it, and now I have noise on my one channel also. Let's see how many reboots it needs before it works again.

HinTak commented 3 years ago

I mentioned earlier that I'd try to make a work-around kernel for Ubuntu groovy (20.10), so here it is: https://github.com/HinTak/RaspberryPi-Dev/releases/tag/Ubuntu-raspi-5.8.0-1013.16 . Sorry, I know 1015.18 / 1016.19 are already out...

cc @j1nx @younes-professor @Daenara @tomh05 @joshuajaharwood @h4de5 @faaafo @Tom-Lu @lxne

I'll write about how it was built at some point. Known differences against the genuine 5.8.0-1013.16 from Ubuntu (besides not crashing / kernel-panicking from this bug in the respeaker driver) are: the CEC GPIO driver, ZFS, and the experimental BPF-based packet filtering framework. They are either incompatible with the work-around (CEC GPIO and BPF), or out-of-tree (ZFS). The CEC driver seems to be for doing GPIO on special HDMI hardware and most people don't need it; the BPF packet filter is supposed to be still experimental, so the real loss is losing ZFS... but I'm a lot happier to be operational on ubuntu and not be crashed by the respeaker driver. I can be persuaded to try to make ZFS work, if some of you need it...

j1nx commented 3 years ago

@HinTak Thx for the ping. Have been busy with other stuff but see if I can dig into this stuff soon. You are doing great work, so deserve the feedback and the (still by me forgotten) donation.

Without having a deep look as of yet, are the kernel patches in there or do I have to take a look at your repo?

HinTak commented 3 years ago

@j1nx the full story is complicated. The seeed studio repo is up to date to v5.4 . However, raspbian moved from v5.4 to v5.10 at the beginning of February (2-3 weeks ago). Ubuntu 20.04 LTS is v5.4 based, but ubuntu 20.10 is v5.8 based. So you still need my repo: the v5.8 branch for v5.8, and the v5.9 branch for v5.9/v5.10 kernels, for up-to-date distributions. Ubuntu 20.04 LTS is still "current" as it is an LTS (long term support) release, so it looks like that might be the best choice for not worrying about misc routine upgrades breaking things.

The ubuntu kernel builds I posted above are for working around this rather serious bug; the actual fix would involve a serious rewrite of part of the driver's logic, which I don't have the time for, and I also lack the chip/IC documentation (I don't want to register at their supplier's to get it, as a matter of principle...). @Daenara, on the 4-mics device, seems to have a rather serious issue with newer kernels too - I am supposed to have such a device real soon - I put in the order just over 2 months ago, but until/unless it arrives, there is not much I can do.

HinTak commented 3 years ago

@AIWintermuteAI btw, you probably should know that the 4-mics/6-mics respeaker devices crash 64-bit Raspberry Pi OS.

I have a work around for 64-bit ubuntu, but requires rebuilding the kernel: https://github.com/HinTak/RaspberryPi-Dev/releases/tag/Ubuntu-raspi-5.8.0-1013.16 .

Rebuilding the 64-bit kernel package in a way compatible with Raspberry Pi OS is hard / undocumented. (You can build the kernel all right, but it is very difficult to build the Raspberry Pi OS kernel deb packages the official way.)

HinTak commented 3 years ago

@AIWintermuteAI I am commenting on this since you updated the Readme to say "64-bit Raspberry Pi OS" is officially supported :(

ghost commented 2 years ago

@HinTak , This action was performed automatically. Please describe the issue according to bug template - if the issue was resolved, ignore this message. The issue will be marked as closed in 7 days if inactive.

HinTak commented 2 years ago

@AIWintermuteAI I already pinged you on this- the driver code will cause 64-bit raspbian to crash. (and 32-bit / 64-bit ubuntu too). Workaround further up. Just try it on the 64-bit raspbian with anything except the 2-mics device and you'll see.

AIWintermuteAI commented 2 years ago

Hello @HinTak ! Yes, thank you for the reminder. Currently we're working through the issues starting from the most recent ones and then going down to the older ones. We will get to the 64-bit issue in time as well. This is how resources are currently allocated to tech support on the reSpeaker project.

HinTak commented 2 years ago

@AIWintermuteAI btw, many of the issues (including quite specifically this one) are related to the X-power chips / code, so you should test with the 4-mics/6-mics devices.

As noted in a few other reports, the two mics device has a different vendor and is even supposed to work with upstream latest kernel out-of-the-box without needing to compile and install anything from here.

hellow554 commented 2 years ago

Sadly this is still an issue on the 64-bit version of raspberry pi OS along with the respeaker 4-mic array.

@AIWintermuteAI you removed your assignment on this, does it mean this isn't likely to be fixed? :/

HinTak commented 2 years ago

Yes, PREEMPT_VOLUNTARY is needed for the driver code, so it is a bug... it is not too painful to build your own kernel though...
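To see which preemption model a given kernel was built with, its build configuration can be inspected. The sketch below only parses configuration text; where that text comes from is an assumption that varies by distribution (commonly a `config-<version>` file under `/boot`, or `/proc/config.gz` when the kernel exposes it). `preemption_model` is a hypothetical helper name, and on newer kernels that support changing the preemption mode at boot, the build-time option may not tell the whole story.

```python
# The mutually exclusive preemption choices this sketch looks for.
MODELS = (
    "CONFIG_PREEMPT_NONE",
    "CONFIG_PREEMPT_VOLUNTARY",
    "CONFIG_PREEMPT",
    "CONFIG_PREEMPT_RT",
)


def preemption_model(config_text):
    """Return the enabled preemption option name, or None if none is found."""
    enabled = set()
    for raw in config_text.splitlines():
        line = raw.strip()
        # Skip comments such as "# CONFIG_PREEMPT is not set"
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    # Exact-name lookup, so helper options like CONFIG_PREEMPT_COUNT
    # are never mistaken for the model itself.
    for name in MODELS:
        if name in enabled:
            return name
    return None
```

Per the comment above, a result of `CONFIG_PREEMPT_VOLUNTARY` is the setting the driver code is said to need; any other result is a hint that the crash described in this issue may apply with the 4-mics/6-mics devices.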

thetravellor commented 8 months ago

The 6.5 branch installs and works with the current 64 bit bullseye kernel for raspbian 11, using 2 mic pi hat.

Linux openvpn 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr 3 17:24:16 BST 2023 aarch64 GNU/Linux