raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/
Other
11.17k stars 5.01k forks source link

RPi3: not enough DMA channels for SPI+I2S #1327

Closed HiassofT closed 7 years ago

HiassofT commented 8 years ago

I received a report that the Cirrus audio card, which needs both SPI and I2S, doesn't work on the RPi3: http://openelec.tv/forum/124-raspberry-pi/69870-wolfson-audio-card-support?start=270#159249

Looking at dmesg http://sprunge.us/Fjdc it seems like we have the same problem again like last summer (issue #1110) - the kernel is short of one DMA channel and requesting the second DMA channel for bcm2835-i2s fails

[    4.551417] bcm2708-i2s 3f203000.i2s: Missing dma channel for stream: 1

On the RPi1/2 when SPI+I2S is used only one DMA channel is free, and since on RPi3 mmc-bcm2835 seems to want 2 DMA channels as well we have a problem.

As a quick workaround using spi-bcm2708 (which doesn't use DMA) instead of spi-bcm2835 on kernel 4.1 seems to work fine, but it looks we don't have a similar option on kernel 4.4 (or for people who need DMA-supported SPI).

Ideally we'd have at least one more DMA channel available, that would solve this problem.

I don't have my RPi3 yet so currently can't test myself but I think the issue should be easily reproducible with "dtparam=spi=on,i2s=on".

ping @popcornmix @pelwell @msperl @notro

msperl commented 8 years ago

there is one patch that went into the foundation kernels with regards to more dma channels. On top there is now a recently posted patch to dma-engine for upstream, that: a) releases one more dma channel for use (channel 2 if I remember correctly) b) allows us to use dma channel 11 to 14 correctly (by using the shared irq)

HiassofT commented 8 years ago

@msperl Thanks! Could you please post a pointer to your latest patches, they don't seem to have hit the linux-rpi-kernel list

msperl commented 8 years ago

it has not hit the rpi-list yet (probably because of moderation), but it is on the arm list... http://archive.arm.linux.org.uk/lurker/message/20160305.105211.40e3c813.en.html

And there patch2 and patch3: http://archive.arm.linux.org.uk/lurker/message/20160305.105213.f91facb2.en.html http://archive.arm.linux.org.uk/lurker/message/20160305.105214.9c11d4d0.en.html

msperl commented 8 years ago

Note that the main reason seems to be that as of now on a RPI3 we have 4 dma channels already used:

The patchset for upstream gives you instead a grand total of 11 dma channels - leaving 7 for free use. Still there may be some advantage of disabling the Wifi card if it is not used - this may also disable SDHCI...

HiassofT commented 8 years ago

@msperl thanks a lot, your channel 12-14 patches seem to be working fine!

Here's my tree (rpi-4.4.y plus your patches): https://github.com/HiassofT/rpi-linux/tree/dma-channels-12-14

I'll do some more testing and then create a PR.

Just a minor note: in the commit message of the DMA patch and in the DT docs example you have brcm,dma-channel-shared-mask at 0x0780 instead of 0x7800. I copied the DT settings from there and then was wondering for a short moment why it didn't work :)

Having a possibility to disable SDIO/BT/WiFi would certainly be nice.

msperl commented 8 years ago

ok - a typo...

Bt can get disabled via an overlay!

msperl commented 8 years ago

probably sdhost could as well get disabled via a device tree overlay - something along those lines:

/dts-v1/;
/plugin/;

/{
    compatible = "brcm,bcm2708", "brcm,bcm2709", "brcm,bcm2710";

    fragment@0 {
        target = <&sdhost>;
        __overlay__ {
            status = "disabled";
        };
    };
};
msperl commented 8 years ago

Looking at the reason behind this: probably a different option would be disabling dma for spi - especially as - I assume - spi is only used rarely and then only for configuration... (And probably only for transfer sizes that will not be handled via dma anyway)

HiassofT commented 8 years ago

Looking at the reason behind this: probably a different option would be disabling dma for spi - especially as - I assume - spi is only used rarely and then only for configuration... (And probably only for transfer sizes that will not be handled via dma anyway)

I fully agree. Having SPI DMA disabled by default plus a spi-dma overlay like we had some time ago (and is still present in the 4.1 tree as a relic) would be the best solution IMHO. People who want to have DMA on SPI could manually enable it and overlays absolutely requiring SPI DMA could just include the overlay in the dts.

DMA channels are a scarce resource, better not waste them...

Also an overlay/dtparam to disable mmc/sdio on RPi3 when it's not needed would be nice.

@pelwell what do you think?

pelwell commented 8 years ago

Actually it already exists - dtoverlay=sdhost.

HiassofT commented 8 years ago

Actually it already exists - dtoverlay=sdhost.

Thanks @pelwell I completely missed that :(

What's your opinion on the DMA topic?

msperl commented 8 years ago

unfortunately it is not as simple - people with tft displays WANT spi to use DMA. I guess it would require an overlay to make it optional for people with "more complex" needs, who hopefully know what they do...

msperl commented 8 years ago

@pelwell : your overlay is slightly different - use sdhost, but not disable sdhost only while still using mmc for the sdcard (but it - mostly - achives a similar effect)

pelwell commented 8 years ago

Let's enable as many as we can (thanks, @msperl - we should back-port your patch set) and use as few as possible.

Something else to bear in mind - the sdhost driver has a PIO-only mode. Add dtparam=sd_force_pio (once you've grabbed and built the very latest source - it was missed out of the initial Pi3 DTB).

msperl commented 8 years ago

just wait until it gets accepted upstream...

pelwell commented 8 years ago

@msperl I think dtoverlay=sdhost does what @HiassofT asked for ("an overlay/dtparam to disable mmc/sdio on RPi3 when it's not needed"). If instead you would rather use the original mmc interface for SD card and no SDIO then use dtoverlay=mmc.

And yes, we are waiting for the patches to be accepted upstream.

HiassofT commented 8 years ago

unfortunately it is not as simple - people with tft displays WANT spi to use DMA.

Sure, in these cases the tft overlay could simply include the spi-dma overlay - just like sdio is doing with sdhost. dtc seems to handle that well

https://github.com/raspberrypi/linux/blob/rpi-4.1.y/arch/arm/boot/dts/overlays/sdio-overlay.dts#L3

But now that I think about it a use_dma property for the spi driver together with a dtparam might actually be the better solution.

Edit: I meant to write use_dma, not force_pio

pelwell commented 8 years ago

It seems like a shocking waste that the mmc and sdhost drivers each use two channels when data flow is always in one direction at a time; the two channels even use the same DREQ. There has to be a better way.

msperl commented 8 years ago

@pelwell : there is that as well - maybe the same channel can get used for both rx/tx

The other approach would be to totally rewrite the DMA engine so that we see all "normal" and all "lite" channels as a single pool and we dispatch the control-blocks to any of those - then we could overcommit on dma's...

But getting that upstream may still be more complex...

pelwell commented 8 years ago

It looks to be straightforward - just allocate one channel and use dma_slave_config before dmaengine_slave_sg. There shouldn't be a performance penalty - it amounts to a few procedure calls and a memcpy of a small structure for each transfer.

Watch this space.

msperl commented 8 years ago

a quick look at bcm2835_mmc.c shows that - afaik - it should be possible to fix bcm2835_mmc_transfer_dma to use only one channel by fixing dma_chan to a single channel. The biggest problem here is that we then have to: a) reconfigure the channel every time we use it (dmaengine_slave_config and the code before) b) we have to set up dreq manually and not via the device tree - unless there is a way...

As for the second: as an idea: we could:

in the driver right now the slave id should not be set - that is the task of the device-tree...

pelwell commented 8 years ago

Why can't we use DT for the slave id/DREQ? Just have a single entry in dmas - call it "rx-tx". There are precedents.

msperl commented 8 years ago

simply because Ah forgot: both dma channels use the same DREQ - not different ones - then it becomes easier and you can forget my 2nd point. Only thing left is point a) where you have to reconfigure the channel on every use

pelwell commented 8 years ago

The reconfiguration only amounts to a small memcpy, so I don't see it as a problem.

See https://github.com/raspberrypi/linux/pull/1329 (a PR against rpi-4.4.y)

Here's what I got from two runs of iozone (iozone -e -I -a -s 50M -r 4k -r 512k -r 16M -i 0 -i 1 -i 2):

2 DMA channels:
                                                              random    random
              kB  reclen    write  rewrite    read    reread    read     write
           51200       4     2319     2687     8791     8783     7851     1146
           51200     512     8554     7627    32477    32405    32424     4113
           51200   16384    16254    14633    33464    33514    33295    14781

1 DMA channel:
                                                              random    random
              kB  reclen    write  rewrite    read    reread    read     write
           51200       4     2212     2273     8806     8841     8657     1199                                              
           51200     512     9285     9993    32490    32559    32246     5011
           51200   16384    17153    14446    33450    33500    33441    14868

I think any real difference is drowned in the noise. I doubt you'll be able to measure any performance difference.

msperl commented 8 years ago

look good - the performance hit is negligible and will drown out in noise.

pelwell commented 8 years ago

By the way, those figures are with sd_overclock=80 on a Pi3.

edo1 commented 8 years ago

I think any real difference is drowned in the noise.

Wouldn't be difference in multi-threaded mode?

msperl commented 8 years ago

@popcornmix @pelwell : one Idea: I2s for lots of usecases only requires 1 channel: playback.

So we would only need one channel by default (assuming the driver allows it).

That would further reduce the number of dma channels for lots of people.

This could easily get implemented via DT overlays.

Just one thought.

As for the patch to enable channels 11-14 there is now hopefully a (almost) final patch that has some ack from 2 Device Tree maintainers, so it seems much more likley that this gets accepted...

What the patch essentially does is apply a "interrupts-name" property with "dma0" ... "dma14" and for DMA channels 11 to 14 we use the same irq - the "shared" irq is automatically detected.

pelwell commented 8 years ago

We've already gained two channels from the MMC driver changes, and we are about to gain three more with channels 12-14. If there are simple changes to the DT and/or overlays that save more channels then lets make them, but I suspect we now have enough for most use cases.

msperl commented 8 years ago

Note that there is now a second iteration of a patch that may now get included upstream. It got acks for everything already by Eric and Marc Rutland (for arm) as well as from Rob Herring for the DT Portion - which is a requirement by Vinod to get merged.

So this should finally allow us to use channels 11-14 it in 4.7. Let us see when this gets really added...

lategoodbye commented 7 years ago

Any suggestions how this should be handled upstream? Gerd Hoffmann wants to submit bcm2835-sdhost and this seems to be related.

msperl commented 7 years ago

AFAIK the DMA driver downstream is (almost) identical to upstream. So it should be possible to use the same principles there as well.

Also we now have 10 or 11 dma channels available so right now there should not be an issue with the amount of channels available.

Afair upstream right now just uses dma for:

So just adding the following should be sufficient:

                mmc@7e300000 {
                        dmas = <&dma 11>;
                        dma-names = "rx-tx";
                };
                sdhost@7e202000 {
                        dmas = <&dma 13>;
                        dma-names = "rx-tx";
                };

Note that downstream seems to have a patch to only use a single dma channel not 2 for sdhost! the submitted driver possibly still requires 2 channels for both directions (at least as far as I read the dts patch)

lategoodbye commented 7 years ago

Thanks. The sdhci in upstream (mmc in downstream) has no DMA channels defined in DT and the sdhci-iproc driver isn't capable of external DMA. So i think we better leave the sdhci part as it is and change only the sdhost entry in the patch series (incl. requesting only 1 channel at a time in the driver code).

JamesH65 commented 7 years ago

@pelwell Have read the comment trail, still no idea whether this can be closed!

pelwell commented 7 years ago

As Martin said:

Also we now have 10 or 11 dma channels available so right now there should not be an issue with the amount of channels available.

So I think we're good.

Note that downstream seems to have a patch to only use a single dma channel not 2 for sdhost! the submitted driver possibly still requires 2 channels for both directions (at least as far as I read the dts patch)

My shared "rxtx" dma patch has been incorporated into the upstream driver, as of rpi-4.12.y, so that saves another channel. I think we can close it.