raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/

ARCH_BCM270x: Moving more devices to DT #936

Closed notro closed 9 years ago

notro commented 9 years ago

This is a list of the devices in arch/arm/mach-bcm2709/bcm2709.c that aren't loaded through Device Tree:

I have made a patch for the following devices: bcm2708_fb_device, bcm2708_usb_device, bcm2708_alsa_devices[0-7], bcm2835_thermal_device Patch: https://gist.github.com/notro/882b88430626269d0572 (the patch is for Pi2 against rpi-4.0.y)

Audio: Should we use 2708 in the compatibility string to avoid a possible clash with a future mainline driver?

    audio@0 {
        compatible = "brcm,bcm2835-audio";
        compatible = "brcm,bcm2708-audio";
    };

Are these in use: bcm2708_systemtimer_device, bcm2708_powerman_device ?

I can make a PR if this is useful. Comments are welcome.

notro commented 9 years ago

I now have the dwc_otg driver working (no FIQ) with ARCH_BCM2835, well, partly: #937. It will be interesting to see how many of the other drivers I can get working.

popcornmix commented 9 years ago

Sounds good to me.

notro commented 9 years ago

How do you want it:

  1. One commit per driver and one commit for all the devices
  2. One commit per driver and one commit per device

popcornmix commented 9 years ago

Not too bothered. I'll say one commit per device as it's easier to squash than split.

notro commented 9 years ago

AFAICT I now have the major 2708 drivers (mostly) working on ARCH_BCM2835! This was really fun to pull through. I have tried a couple of times over the last year to do this, but each time dwc_otg would hang on reset and I gave up. I haven't done much testing, but all drivers have signs of "life".

dwc_otg

No FIQ support which leads to broken low speed device support. Network OK:

$ ping -c 1 raspberrypi.org
PING raspberrypi.org (149.126.74.185) 56(84) bytes of data.
64 bytes from 149.126.74.185.ip.incapdns.net (149.126.74.185): icmp_req=1 ttl=51 time=33.5 ms

--- raspberrypi.org ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 33.592/33.592/33.592/0.000 ms

When trying to boot directly from VC without u-boot, it breaks:

[    1.827405] dwc_otg: version 3.00a 10-AUG-2012 (platform bus)
[    2.033412] Core Release: 2.80a
[    2.036697] Setting default values for core params
[    2.041570] Finished setting default values for core params
[    2.060618] WARN::dwc_otg_core_reset:5109: dwc_otg_core_reset() HANG! Soft Reset GRSTCTL=80000001
[    2.060618]
[    2.183995] WARN::dwc_otg_core_reset:5109: dwc_otg_core_reset() HANG! Soft Reset GRSTCTL=80000001
[    2.183995]

bcm2708_dma

Copied to drivers/dma/bcm2708-dma.c

bcm2835-mmc

$ sync; time dd if=/dev/zero of=~/test.tmp bs=500K count=1024
1024+0 records in
1024+0 records out
524288000 bytes (524 MB) copied, 36.123 s, 14.5 MB/s

real    0m36.143s
user    0m0.020s
sys     0m15.160s

bcm2708_vcio

Copied to drivers/mailbox/bcm2708-vcio.c

bcm2835-thermal

$ cat /sys/class/thermal/thermal_zone0/temp
35780

bcm2708_fb

HDMI is ok. Not sure if the DMA part is tested by just starting X windows. Based on work by Lubomir Rintel.

vchiq

$ vcgencmd measure_temp
temp=35.8'C

Based on work by Lubomir Rintel.

bcm2835_AUD0

omxplayer crashed the kernel (8 blinks on the green led). aplay gives me mono sound:

$ aplay /usr/share/sounds/alsa/Front_Center.wav
Playing WAVE '/usr/share/sounds/alsa/Front_Center.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Mono

vc_mem

Copied to drivers/char/broadcom/vc_mem.c

$ sudo vcdbg log msg 2>&1 | grep Loading
000913.584: Loading 'kernel.img' from SD card

Here's the diff: https://gist.github.com/notro/dbb63a021d8ea575ddb5 Kernel boot messages: https://gist.github.com/notro/142ac6e5caf1631cc639

@popcornmix are you interested in pulling something like this?

popcornmix commented 9 years ago

Sounds awesome! Yes, this is something I'm very interested in. Pi1 only for now I guess? What are you using as the .config? bcm2835_defconfig?

"[ 0.272560] DMA: preallocated 256 KiB pool for atomic coherent allocations" looks a bit low for vchiq (and possibly dwc).

Any thoughts on why uboot is required? Is uboot doing some reset/init/power/clock setup of dwc that is not occurring with 2835 arch? Would be good to get this solved. Is the FIQ not working because another init step is missing, or because it is not supported in the interrupt driver?

notro commented 9 years ago

Pi1 only for now I guess?

Yes. I see that Eric Anholt is working on getting Pi2 support into mainline, so when that happens, I guess this should work on the Pi2 as well.

I don't know why this address bit must be set, though; it might be different on the Pi2. In bcm2708_fb:

#ifdef CONFIG_ARCH_BCM2835
#define TO_VC_PHYS(a)   (0x40000000 | (a))
    bcm_mailbox_write(MBOX_CHAN_FB, TO_VC_PHYS(fb->dma));
#else
    bcm_mailbox_write(MBOX_CHAN_FB, fb->dma);
#endif

What are you using as the .config? bcm2835_defconfig?

Yes. Should we make changes to it or should we make a new config for this work? I'm planning to add all the modules from bcmrpi_defconfig when I'm done with these drivers.

"[ 0.272560] DMA: preallocated 256 KiB pool for atomic coherent allocations" looks a bit low for vchiq (and possibly dwc).

Yes, I got an allocation failure in vchiq at first:

    g_slot_mem = dma_alloc_coherent(NULL, g_slot_mem_size + frag_mem_size,
        &g_slot_phys, GFP_ATOMIC);

I couldn't see why this needed to be atomic (during init/probe), so I changed it to GFP_KERNEL. It worked fine after that. Is 256 KiB too small for dwc_otg?

Any thoughts on why uboot is required? Is uboot doing some reset/init/power/clock setup of dwc that is not occurring with 2835 arch? Would be good to get this solved.

I didn't do arch/arm/mach-bcm2708/power.c, because of this statement: "This driver provides a mailbox interface to power up/down the domains that notionally belong to the ARM, however currently this is not used by any of the peripheral drivers." But looking at it, I see:

#ifdef CONFIG_USB
#define BCM_POWER_ALWAYS_ON (BCM_POWER_USB)
#endif

static int __init bcm_power_init(void)
{
[...]
#if defined(BCM_POWER_ALWAYS_ON)
    if (BCM_POWER_ALWAYS_ON) {
        bcm_power_open(&always_on_handle);
        bcm_power_request(always_on_handle, BCM_POWER_ALWAYS_ON);
    }
#endif

So USB power seems to be the missing piece. I'll try it.

Is the FIQ not working because another init step is missing, or because it is not supported in the interrupt driver?

Not supported in the irq driver. @P33M made this comment:

Regarding FIQ support - the mach-bcm2835 armctrl IRQchip needs expanding to write to the FIQ select register. FIQ lines in ARM linux are just IRQ numbers >= FIQ_START. Some sort of refcounting would be required to ensure that the FIQ is not enabled from different sources simultaneously - the hardware can only support a single, nominated IRQ line to promote to the FIQ input.
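
A minimal sketch of the refcounting idea in the quoted comment, written as plain userspace C rather than irqchip code. The struct fiq_select type and the fiq_claim()/fiq_release() names are hypothetical and do not exist in the kernel; the only behaviour taken from the thread is that a single nominated IRQ line can be promoted to the FIQ input at a time.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: exactly one IRQ line may own the FIQ input. */
struct fiq_select {
    bool claimed;      /* true while some IRQ line is routed to FIQ */
    unsigned int irq;  /* the line that currently owns the FIQ input */
    unsigned int refs; /* number of users that claimed that same line */
};

/* Claim the FIQ input for 'irq'; -EBUSY if a different line owns it. */
static int fiq_claim(struct fiq_select *fs, unsigned int irq)
{
    if (fs->claimed && fs->irq != irq)
        return -EBUSY;
    fs->claimed = true;
    fs->irq = irq;
    fs->refs++;
    /* A real irqchip would program the FIQ select register on the first claim. */
    return 0;
}

/* Drop one reference; the FIQ input is free again when the count hits zero. */
static void fiq_release(struct fiq_select *fs, unsigned int irq)
{
    assert(fs->claimed && fs->irq == irq && fs->refs > 0);
    if (--fs->refs == 0)
        fs->claimed = false; /* a real irqchip would disable the FIQ here */
}
```

In this model a second source asking for a different line is refused until every user of the current line has released it.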

popcornmix commented 9 years ago

#ifdef CONFIG_ARCH_BCM2835
#define TO_VC_PHYS(a)   (0x40000000 | (a))
    bcm_mailbox_write(MBOX_CHAN_FB, TO_VC_PHYS(fb->dma));
#else
    bcm_mailbox_write(MBOX_CHAN_FB, fb->dma);
#endif

Looks wrong for 2836. I suspect this offset should be: https://github.com/raspberrypi/linux/blob/rpi-3.18.y/arch/arm/mach-bcm2709/include/mach/memory.h#L39

so wants to be 0xC0000000 (bypassing the GPU caches) on 2836.
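
To make the two aliases concrete, here is a small userspace sketch of the address arithmetic; it is not driver code. The helper name to_vc_bus() and the BUS_ALIAS_* macro names are made up for illustration. The constants 0x40000000 and 0xC0000000 are the ones discussed above, and the sketch assumes RAM physical addresses below 0x40000000.

```c
#include <assert.h>
#include <stdint.h>

#define BUS_ALIAS_MASK   0xC0000000u /* top two address bits select the alias */
#define BUS_ALIAS_GPU_L2 0x40000000u /* alias used in the 2835 snippet above */
#define BUS_ALIAS_NO_L2  0xC0000000u /* alias suggested above for 2836 */

/* Combine an ARM physical address with the chosen alias bits. */
static uint32_t to_vc_bus(uint32_t phys, uint32_t alias)
{
    assert((phys & BUS_ALIAS_MASK) == 0);  /* physical address must fit below the alias bits */
    assert((alias & ~BUS_ALIAS_MASK) == 0); /* alias must only use the top two bits */
    return alias | phys;
}
```

For a physical address of 0x1b617000 this gives 0x5b617000 with the 0x40000000 alias and 0xdb617000 with the 0xC0000000 alias.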

I think modifying the upstream bcm2835_defconfig is okay. We are modifying other upstream files, so trying to build a clean upstream kernel from this tree is not advisable.

notro commented 9 years ago

bypassing the GPU caches

Then why is this not needed on ARCH_BCM2708?

Isn't there some dma sync function that we can use instead of bypassing the cache?

notro commented 9 years ago

Is USB the only peripheral that the firmware doesn't power on? If so, why can't it turn on that as well?

popcornmix commented 9 years ago

Read this message for details on the different cache aliases: http://lists.denx.de/pipermail/u-boot/2015-March/208201.html

The bcm2708_fb driver on Pi2 reports: [ 1.552500] BCM2708FB: allocated DMA memory e7c00000

and on Pi1 reports: [ 1.377893] BCM2708FB: allocated DMA memory 4fc00000

so it is allocating a bus address suitable for use by the GPU. We convert this back into an ARM physical address here: https://github.com/raspberrypi/linux/blob/rpi-3.18.y/drivers/video/fbdev/bcm2708_fb.c#L316

popcornmix commented 9 years ago

Originally I believe MMC and USB required powering from the arm side. I think the linux driver powered "on demand" which slowed down the MMC driver and meant if the GPU locked up you lost access to MMC which was unfortunate, so that got removed.

Removing power control of USB and making it always on and powered by the GPU may make sense. It rules out a (very small) power saving possible from the arm side, but we're not taking advantage of that. @pelwell @P33M any objection to enabling USB power from GPU by default?

P33M commented 9 years ago

I agree that the use case where we would want USB turned off is of marginal benefit - the power consumption versus connectivity tradeoff doesn't really make sense.

As it removes some complexity (for now) from the ARM side code then it seems reasonable to just change the default state in firmware, but in the future we would want the state to be communicated to whatever PM/power domain driver that exists in the upstream world.

pelwell commented 9 years ago

No objection from me.

popcornmix commented 9 years ago

@notro can you post your config? I suspect you are not using an untouched bcm2835_defconfig as that doesn't have CONFIG_MAILBOX enabled.

notro commented 9 years ago

Sorry I forgot that, I set the config in a script. Here's the diff: https://gist.github.com/notro/5db4d2c30996c1b58270 I don't know why some of the options are removed though, the only one I actively disabled was FB_SIMPLE.

popcornmix commented 9 years ago

@notro I just get "Uncompressing Linux... done, booting the kernel." and nothing else. I'm just trying to get the direct (no uboot) case to work. Do you have any config.txt/cmdline.txt settings? Did you run mkknlimg --dtok on the kernel?

notro commented 9 years ago

There is something wrong with that diff I gave, because some of the options are not disabled in the kernel:

pi@raspberrypi:~$ zgrep "CONFIG_RD_" /proc/config.gz
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_RD_LZ4=y

I'll have to look into this.

notro commented 9 years ago

These are the options I set/change in my script:

  VAR['LINUX_DEFCONFIG'] = 'bcm2835_defconfig'

  config 'DYNAMIC_DEBUG', :enable
  config ['CONFIG_IKCONFIG', 'CONFIG_IKCONFIG_PROC'], :enable
  config 'PROC_DEVICETREE', :enable

  config ['DMADEVICES', 'DMA_BCM2708'], :enable
  config 'MMC_BCM2835', :enable
  config 'MMC_BCM2835_DMA', :enable

  config 'USB_DWCOTG', :enable

  config 'THERMAL', :enable
  config 'THERMAL_BCM2835', :enable

  config 'MAILBOX', :enable
  config 'BCM2708_MBOX', :enable

  config 'FB_BCM2708', :enable
  config 'FB_SIMPLE', :disable

  config 'SOUND', :enable
  config 'SND', :enable #:module
  config 'SND_BCM2835', :enable #:module

  config 'BRCM_CHAR_DRIVERS', :enable
  config 'BCM2708_VCMEM_2', :enable

popcornmix commented 9 years ago

Can you run make ARCH=arm savedefconfig and post the defconfig file?

Do you have any config.txt/cmdline.txt settings? Did you run mkknlimg --dtok on the kernel?

notro commented 9 years ago

Am I doing something wrong here:

$ grep "CONFIG_RD_" arch/arm/configs/bcm2835_defconfig
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y

$ grep "CONFIG_RD_" .config
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_RD_LZ4=y

$ ARCH=arm make savedefconfig
$ grep "CONFIG_RD_" defconfig
$ #nothing

defconfig: https://gist.github.com/notro/62767303672ebba49161 .config: https://gist.github.com/notro/5fabf78966d91eeec205

Do you have any config.txt/cmdline.txt settings?

No

Did you run mkknlimg --dtok on the kernel?

Yes when I tested booting directly without uboot.

notro commented 9 years ago

Do you have any config.txt/cmdline.txt settings?

Sorry, /boot/config.txt: device_tree=bcm2835-rpi-b-plus.dtb

popcornmix commented 9 years ago

Am I doing something wrong here:

The defconfigs only list non-default options, so it's expected to have more options in .config than in the original defconfig. Now if a newly created defconfig has even fewer options, then that suggests the original defconfig had some invalid (or already default) options in it that have been removed.

notro commented 9 years ago

How do we treat the arch modules: dma.c, vc_mem.c, vcio.c? vc_mem can be just moved as is without any changes. It gets its config from the kernel command line. The other two, should we move them or make a copy just for ARCH_BCM2835? vcio.c for instance hardcodes mem and irq, so if we move it that would have to be taken care of for 270x. dma.c hardcodes its irqs so that would also need to be fixed for 270x.

If we don't move, the depending drivers would need different includes for ARCH_BCM2835.

What we choose depends on how long until we reach ARCH_BCM2835, and the expected future changes to these drivers. If we have copies, both would need updating. From a purist point of view having 2 almost identical drivers is an abomination, but we are quite pragmatic in this remote part of the kernel world :-) And it is a transitional phase.

So what do you think?

notro commented 9 years ago

It doesn't sound right to make copies of the drivers... sigh. Maybe I just have to take the time to do it "right". It's the testing that takes so much time, 2 kernels x 2 modes + 2835 kernel. While taking care not to break anything in between these commits...

dma PR with 7 commits:

pelwell commented 9 years ago

I think we need to move to using platform devices to configure the existing hardcoded drivers. We should also - either at the same time or later - remove the 2708/2709 duplication.

notro commented 9 years ago

Yes, perhaps I should first make a PR that fixes the hardcoding and then do one for ARCH_BCM2835. That will make it easier on me :-)

  - BCM270x: Add irq resources to dmaman device
  - BCM270x: dma: Use irq resources instead of hardcoded values
  - BCM270x: dma: Add Device Tree support
  - BCM270x: Add dma Device Tree node and use it

  - dma: Add bcm2708 legacy driver (plain copy of dma.c)
  - BCM270x: Use drivers/dma/bcm2708-dma.c
  - BCM270x: dma: Remove driver
  - bcm2835: Add legacy DMA DT node
  - bcm2835: Enable DMA in defconfig

Another solution is to merge dma.c with bcm2708-dmaengine.c:

  - BCM270x: Add memory and irq resources to dmaengine device and DT
  - dmaengine: bcm2708: Merge with arch dma.c driver and disable dma.c
  - BCM270x: Remove dmaman device
  - BCM270x: dma: Remove driver

  - dmaengine: bcm2708: Add depends on ARCH_BCM2835
  - bcm2835: Enable DMA_BCM2708 in defconfig

That will give us just one device/driver for the DMA controller. Maybe this one is better?

notro commented 9 years ago

Patch merging dma.c with bcm2708-dmaengine.c: https://gist.github.com/notro/656e0c483089a0bf4138 Please have a look and tell me what you think.

Changes to the dma.c code during merging:

  - Fix a couple of whitespace issues.
  - Cut down some comments to one line.
  - Add mutex to vc_dmaman and use this, since the dev lock is locked during probing of the engine part.
  - Add global g_dmaman variable since drvdata is used by the engine part.
  - Restructure bcm_dma_chan_alloc() to simplify error handling.
  - Use device irq resources instead of hardcoded bcm_dma_irqs table.
  - Remove dev_dmaman_register() and code it directly.
  - Remove dev_dmaman_deregister() and code it directly.
  - Get dmachans from DT if available.
  - Use dev_* instead of printk.
  - Keep 'dma.dmachans' module property name for backwards compatibility.

I'm uncertain about dmachans. The firmware only sets the module parameter and not the DT property. Should I skip DT support, or should we add support in firmware?

popcornmix commented 9 years ago

I guess firmware should set dmachans in DT. @pelwell?

pelwell commented 9 years ago

Let's leave this in as a default to aid the transition, then I'll patch it from the firmware.

pelwell commented 9 years ago

The rest of the patch looks great. I just have a few trivial comments, most of which apply to the original code:

162: "shotcut"

228: bcm_dma_wait_idle is extern but has no EXPORT_SYMBOL_GPL

237: bcm_dma_start's EXPORT_SYMBOL_GPL does not immediately follow bcm_dma_start

250: Tab in comment

365: Can we have a new variable - unsigned int chan, say - that can be assigned the result of a successful vc_dmaman_chan_alloc and used on subsequent lines instead of rc? It probably won't change the generated code, but it would make the rest of the code clearer.

431: Let's move the assignment of the fallback dmachans default (DEFAULT_DMACHAN_BITMAP) to this "else if" clause, then we can remove the conditional on line 462 (and its "else" clause).
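
The suggestion for line 365 can be sketched roughly as follows. This is standalone C, not the driver: stub_chan_alloc() is a hypothetical stand-in for vc_dmaman_chan_alloc() (assumed here to return a channel index or a negative errno), and claim_channel() is an illustrative caller rather than the real bcm_dma_chan_alloc().

```c
#include <assert.h>
#include <errno.h>

#define NUM_CHANS 16u /* assumed channel count for this sketch */

/* Hypothetical allocator: lowest free channel index, or a negative errno. */
static int stub_chan_alloc(unsigned int free_bitmap)
{
    for (unsigned int i = 0; i < NUM_CHANS; i++)
        if (free_bitmap & (1u << i))
            return (int)i;
    return -ENOMEM;
}

/* 'rc' carries only status codes; 'chan' carries only the channel number. */
static int claim_channel(unsigned int free_bitmap, unsigned int *out_chan)
{
    unsigned int chan;
    int rc;

    rc = stub_chan_alloc(free_bitmap);
    if (rc < 0)
        return rc;

    chan = (unsigned int)rc;
    assert(chan < NUM_CHANS);
    *out_chan = chan; /* later lines would index per-channel state with 'chan' */
    return 0;
}
```

Keeping the two roles in separate variables means a later reader never has to wonder whether rc still holds an error code or a channel index.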

notro commented 9 years ago

Thanks for your comments Phil.

popcornmix commented 9 years ago

BTW, we couldn't think why vchiq allocation needed to be atomic, so I've changed that.

notro commented 9 years ago

Mailbox

I'm planning to change the arch/arm/mach-bcm2708/vcio.c in place first, then move it unchanged. This way I can do the changes in steps making it easier to review (keep checkpatch fixes in a commit of its own for instance).

Please have a look and see if something obvious jumps out: https://gist.github.com/notro/bc1a204f5d3fe11454a6 (no checkpatch fixes yet)

Is this macro name and assignment ok?

#define ARMCTRL_0_MAIL0_BASE     (ARMCTRL_0_SBM_BASE + 0x80)  /* User 0 (ARM)'s Mailbox 0 */

I access the mail1_wr register from the mail0 base. Hope that is ok.

Does anyone see what the purpose of this is? This is the only place it's referenced except for teardown.

    vcio_class = class_create(THIS_MODULE, DRIVER_NAME);
    device_create(vcio_class, NULL, MKDEV(MAJOR_NUM, 0), NULL, "vcio");

The returned device structure is just thrown away.

notro commented 9 years ago

I see that the class_create, device_create pattern is used elsewhere in the kernel as well.

I'm wondering if this should be in the module init function:

    ret = register_chrdev(MAJOR_NUM, DEVICE_FILE_NAME, &fops);
    vcio_class = class_create(THIS_MODULE, DRIVER_NAME);
    device_create(vcio_class, NULL, MKDEV(MAJOR_NUM, 0), NULL, "vcio");

Because an ioctl on this file ends up in bcm_mailbox_write() which checks mbox_dev, and returns -ENODEV if the driver hasn't been probed. In that case, the teardown was correctly placed in module exit().

notro commented 9 years ago

According to http://stackoverflow.com/questions/5970595/create-a-device-node-in-code, device_create() gives udev the information needed to create the device node.

notro commented 9 years ago

Am I correct in assuming that this relates to code that has been stripped out, and is no longer necessary?

#if defined(CONFIG_SERIAL_BCM_MBOX_CONSOLE) && defined(CONFIG_MAGIC_SYSRQ)
#define SUPPORT_SYSRQ
#endif

#include <linux/console.h>
#include <linux/serial.h>
#include <linux/serial_core.h>
#include <linux/sysrq.h>

A google search for SERIAL_BCM_MBOX_CONSOLE yields nothing.

popcornmix commented 9 years ago

SERIAL_BCM_MBOX_CONSOLE was a hacky debug feature that provided a virtual serial device that came out through the mailbox and was output from GPU logging. It hasn't been used since hardware uarts were made to work, so please feel free to remove.

The device node creation came from a PR (not accepted at the time, but cherry-picked into a newer linux branch). Feel free to move/edit if you believe it is an improvement (as long as /dev/vcio is still created).

notro commented 9 years ago

bcm2708_fb

I read your reply to the u-boot ML. It was very detailed, but I lack knowledge about the basics in this area.

To get the driver working on ARCH_BCM2835, I have to do this:

static int bcm2708_fb_set_par(struct fb_info *info)
{

    /* inform vc about new framebuffer */
pr_info("fb->dma=%pad\n", &fb->dma);
#ifdef CONFIG_ARCH_BCM2835
#define TO_VC_PHYS(a)   (0x40000000 | (a))
pr_info("TO_VC_PHYS(fb->dma)=0x%lx\n", (unsigned long)TO_VC_PHYS(fb->dma));
    bcm_mailbox_write(MBOX_CHAN_FB, TO_VC_PHYS(fb->dma));
#else
    bcm_mailbox_write(MBOX_CHAN_FB, fb->dma);
#endif

If I skip that offset, it stops working.

See the difference in offset between the dma addresses:

MACH_BCM2708
[    1.376474] fb->dma=0x5bc10000

ARCH_BCM2835
[    0.986179] fb->dma=0x1b617000
[    0.989267] TO_VC_PHYS(fb->dma)=0x5b617000

This is where fb->dma is assigned:

static int bcm2708_fb_register(struct bcm2708_fb *fb)
{
    mem =
        dma_alloc_coherent(NULL, PAGE_ALIGN(sizeof(*fb->info)), &dma,
                   GFP_KERNEL);

        fb->info = (struct fbinfo_s *)mem;
        fb->dma = dma;

2835: first 32 pages map physical addresses 0x00000000-0x1fffffff to bus addresses 0x40000000-0x5fffffff.

It seems dma_alloc_coherent() sets a wrong bus address on ARCH_BCM2835?

From https://www.kernel.org/doc/Documentation/DMA-API.txt:

void *
dma_alloc_coherent(struct device *dev, size_t size,
                 dma_addr_t *dma_handle, gfp_t flag)
[...]
It also returns a <dma_handle> which may be cast to an unsigned integer the
same width as the bus and given to the device as the bus address base of
the region.

notro commented 9 years ago

This is why I get a physical address instead of a bus address (CONFIG_NEED_MACH_MEMORY_H is not defined): arch/arm/include/asm/memory.h

#ifdef CONFIG_NEED_MACH_MEMORY_H
#include <mach/memory.h>
#endif

[...]

/*
 * Virtual <-> DMA view memory address translations
 * Again, these are *only* valid on the kernel direct mapped RAM
 * memory.  Use of these is *deprecated* (and that doesn't mean
 * use the __ prefixed forms instead.)  See dma-mapping.h.
 */
#ifndef __virt_to_bus
[...]
#define __pfn_to_bus(x) __pfn_to_phys(x)

So by default on ARCH_BCM2835, bus addresses are the same as physical addresses.

After wading through the kernel source for some time, I thought I found a way using the dma-ranges DT property. But I could not get it to work.

Finally I discovered why it didn't work: Revert "ARM: dma: Use dma_pfn_offset for dma address translation"

I will continue tomorrow and revert that for ARCH_BCM2835 and see if dma-ranges can help with my missing offset.

notro commented 9 years ago

Now I get the correct bus address.

Added dma-ranges property:

diff --git a/arch/arm/boot/dts/bcm2835.dtsi b/arch/arm/boot/dts/bcm2835.dtsi
index 06cba29..c51ecd4 100644
--- a/arch/arm/boot/dts/bcm2835.dtsi
+++ b/arch/arm/boot/dts/bcm2835.dtsi
@@ -14,6 +14,7 @@
                #address-cells = <1>;
                #size-cells = <1>;
                ranges = <0x7e000000 0x20000000 0x02000000>;
+               dma-ranges = <0x40000000 0x0 0x40000000>;

                timer@7e003000 {
                        compatible = "brcm,bcm2835-system-timer";

Call chain that sets up the offset: of_platform_populate() -> of_platform_bus_create() -> of_platform_device_create_pdata() -> of_dma_configure() -> of_dma_get_range() The offset is stored in: dev->dma_pfn_offset
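
The arithmetic behind that offset can be sketched in userspace C. This is an illustration under assumptions, not the kernel code: it assumes a 32-bit unsigned long, 4 KiB pages, that of_dma_configure() computes the offset as PFN_DOWN(cpu address - dma address), and that the dma-ranges triple is read as <dma-address cpu-address size>. The struct and function names are local to the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* One dma-ranges entry, in the order <dma-address cpu-address size>. */
struct dma_range {
    uint32_t dma_addr;
    uint32_t cpu_addr;
    uint32_t size;
};

/* Assumed form of the offset: PFN_DOWN(cpu - dma), truncated to 32 bits. */
static uint32_t dma_pfn_offset(const struct dma_range *r)
{
    uint64_t diff = (uint64_t)r->cpu_addr - (uint64_t)r->dma_addr;

    return (uint32_t)(diff >> PAGE_SHIFT);
}

/* Same shape as pfn_to_dma(): subtract the offset, then shift to an address. */
static uint32_t pfn_to_dma(uint32_t pfn, uint32_t offset)
{
    pfn -= offset; /* unsigned wraparound: for this range it adds 0x40000 pages */
    return pfn << PAGE_SHIFT;
}

/* Convenience wrapper that keeps the offset within the page. */
static uint32_t phys_to_dma(const struct dma_range *r, uint32_t phys)
{
    assert(phys - r->cpu_addr < r->size); /* address must lie inside the range */
    return pfn_to_dma(phys >> PAGE_SHIFT, dma_pfn_offset(r)) | (phys & 0xfffu);
}
```

With the dma-ranges value above, the physical address 0x1b617000 logged earlier maps to bus address 0x5b617000, the same value TO_VC_PHYS() produced by hand.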

Provide dma_alloc_coherent() with the device so we get the bus offset applied:

diff --git a/drivers/video/fbdev/bcm2708_fb.c b/drivers/video/fbdev/bcm2708_fb.c
index 345c15e..f6ac7da 100644
--- a/drivers/video/fbdev/bcm2708_fb.c
+++ b/drivers/video/fbdev/bcm2708_fb.c
@@ -628,7 +625,7 @@ static int bcm2708_fb_register(struct bcm2708_fb *fb)
        void *mem;

        mem =
-           dma_alloc_coherent(NULL, PAGE_ALIGN(sizeof(*fb->info)), &dma,
+           dma_alloc_coherent(&fb->dev->dev, PAGE_ALIGN(sizeof(*fb->info)), &dma,
                               GFP_KERNEL);

        if (NULL == mem) {

This is what my arch/arm/include/asm/dma-mapping.h looks like:

#ifndef __arch_pfn_to_dma
static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
{
    if (IS_ENABLED(CONFIG_ARCH_BCM2835) && dev)
        pfn -= dev->dma_pfn_offset;
    return (dma_addr_t)__pfn_to_bus(pfn);
}

static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
{
    unsigned long pfn = __bus_to_pfn(addr);

    if (IS_ENABLED(CONFIG_ARCH_BCM2835) && dev)
        pfn += dev->dma_pfn_offset;

    return pfn;
}

static inline void *dma_to_virt(struct device *dev, dma_addr_t addr)
{
    if (IS_ENABLED(CONFIG_ARCH_BCM2835) && dev) {
        unsigned long pfn = dma_to_pfn(dev, addr);

        return phys_to_virt(__pfn_to_phys(pfn));
    }

    return (void *)__bus_to_virt((unsigned long)addr);
}

static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)
{
    if (IS_ENABLED(CONFIG_ARCH_BCM2835) && dev)
        return pfn_to_dma(dev, virt_to_pfn(addr));

    return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
}

Maybe using #ifdef is the canonical way of doing this, I don't know.

@popcornmix can you fix up your commit to stay clear of ARCH_BCM2835?

What kind of problem did your patch solve on MACH_BCM270x?

And I wonder about another thing, why could the mmc driver work using DMA on ARCH_BCM2835 when it got physical and not bus addresses? The same goes for the SPI DMA work Martin is doing; it also works without dma-ranges.

pelwell commented 9 years ago

And I wonder about another thing, why could the mmc driver work using DMA on ARCH_BCM2835 when it got physical and not bus addresses?

The BCM2835_VCMMU_SHIFT macro holds the offset between the two.

notro commented 9 years ago

Isn't that the hardware device's bus address? I was thinking about the bus address for the dma buffers used to copy to/from that device.

popcornmix commented 9 years ago

Have another read of: https://www.raspberrypi.org/wp-content/uploads/2012/02/BCM2835-ARM-Peripherals.pdf and also my u-boot ML post. Bus addresses use the top two address bits to specify the GPU caching modes.

Pi1/2835 uses the GPU's L2 cache, so bus addresses are always 0x4 alias (i.e. from 0x40000000-0x5fffffff). 0x40000000 will correspond with ARM physical address 0x00000000.

Pi2/2836 does not use the GPU's L2 cache, so bus addresses are always 0xC alias (i.e. from 0xC0000000-0xDfffffff). 0xC0000000 will correspond with ARM physical address 0x00000000.

On mach-bcm2708/mach-bcm2709, _REAL_BUS_OFFSET (arch/arm/mach-bcm2708/include/mach/memory.h) contains the offset needed to convert from physical to bus addresses.
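
As a rough illustration of the aliasing described above, the following userspace C splits a bus address into its alias bits and the ARM physical address. decode_bus() and struct bus_addr are hypothetical names; the mask assumes, as stated above, that the top two address bits select the alias.

```c
#include <assert.h>
#include <stdint.h>

#define BUS_ALIAS_MASK 0xC0000000u /* top two address bits */

/* A bus address split into the alias selector and the ARM physical address. */
struct bus_addr {
    uint32_t alias; /* 0x40000000 or 0xC0000000 for the cases in this thread */
    uint32_t phys;  /* ARM physical address */
};

static struct bus_addr decode_bus(uint32_t bus)
{
    struct bus_addr d;

    d.alias = bus & BUS_ALIAS_MASK;
    d.phys = bus & ~BUS_ALIAS_MASK;
    assert((d.alias | d.phys) == bus); /* the two parts must recombine losslessly */
    return d;
}
```

Applied to the framebuffer addresses logged earlier in this thread, 0x4fc00000 (Pi1) decodes to alias 0x40000000 and physical 0x0fc00000, while 0xe7c00000 (Pi2) decodes to alias 0xC0000000 and physical 0x27c00000.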

notro commented 9 years ago

I'm starting to wrap my head around this now.

Pi2/2836 does not use the GPU's L2 cache, so bus addresses are always 0xC alias

I understand this also now as I see that BCM2708_NOL2CACHE is the default in arch/arm/mach-bcm2709/Kconfig as opposed to 2708.

notro commented 9 years ago

I'm going to put together a PR for ARCH_BCM2835 support in bcm2708_fb.

To do this I need to fix this revert so it doesn't apply on ARCH_BCM2835: Revert "ARM: dma: Use dma_pfn_offset for dma address translation"

@popcornmix Do you want me to apply a patch on top of this commit, or do you want me to revert it and make a new commit? This way you can end up with only one commit for this change, instead of two, the next time you rebase.

If you want me to revert it, what reason should I add in the following commit? The revert commit doesn't give a reason for why it was necessary.

popcornmix commented 9 years ago

It would be nice to understand what the correct fix for "ARM: dma: Use dma_pfn_offset for dma address translation" is. This is what happened:

Kernel bump to 3.16.y

However, although usb works (keyboard working), I get no IP address. Also it hangs after a minute or two (no panic). SysRq shows it is always here:

[<c0388424>] (__skb_clone) from [<c02f59b4>] (smsc95xx_rx_fixup+0x8c/0x250)
[<c02f59b4>] (smsc95xx_rx_fixup) from [<c02f7e64>] (usbnet_bh+0xe8/0x26c)
[<c02f7e64>] (usbnet_bh) from [<c0022e30>] (tasklet_action+0x74/0xcc)
[<c0022e30>] (tasklet_action) from [<c00230ac>] (__do_softirq+0xa4/0x1f8)
[<c00230ac>] (__do_softirq) from [<c0023474>] (irq_exit+0x98/0xf0)

I'm suspecting a commit like:
http://patchwork.ozlabs.org/patch/358917/

but reverting that didn't help. Feels like an upstream bug.

I tried disabling FIQ with no change in behaviour. Similar network failures on 2836.

Reverted all commits between 3.15.y and 3.16.y in drivers/usb/net with no effect.

I spent a little while finding a small set of patches that were enough to show the bug (the main port, the dwc one, the fiq one, and a couple of 2836 ones).
I think 6 were enough to boot and see the problem (or the success). I squashed that to a single patch.

The scheme is at each stage:
git am -3 ~/hide/2836-mini.patch
[fix conflicts]
make ARCH=arm -j10 zImage
[fix build errors]
cp  arch/arm/boot/zImage  /media/dc4/0C50-6143/kernel7.img
git reset --hard HEAD~
git bisect good/bad

Seems to be working. Through about 6 bisects (of 14!). Hopefully it will converge...

I've enabled "rerere", although I'm getting more grief from api changes that give a build error than merge conflicts

Aha! Looks like this is the culprit: http://permalink.gmane.org/gmane.linux.kernel.commits.head/452918

So 3.16 made network unusable due to that commit. Reverting it fixes the problem, but I suspect the correct fix is to make dev->dma_pfn_offset do the same thing as __pfn_to_bus(pfn) (so having the commit in or reverted would make no difference), but I haven't dug deep enough to find out what the correct behaviour here is.

notro commented 9 years ago

My findings so far:

pfn_to_dma() and dma_to_pfn() don't have to be touched, because dev->dma_pfn_offset is only set if DT property 'dma-ranges' exists.

In dma_to_virt() and virt_to_dma() I added a test that would log a message when the returned address would differ between the two versions. Only dma_to_virt() triggered and only for usb devices. Then I printed the first one with a stack trace:

[    2.490090] Freeing unused kernel memory: 344K (c07c6000 - c081c000)
[    2.517552] usb 1-1: new high-speed USB device number 2 using dwc_otg
[    2.526709] dma_to_virt(1-1): p1=daf0a000, p2=daf0ab00
[    2.534344] CPU: 0 PID: 19 Comm: kworker/0:1 Not tainted 4.0.2+ #3
[    2.543207] Hardware name: BCM2708
[    2.549490] Workqueue: usb_hub_wq hub_event
[    2.556627] [<c0015e60>] (unwind_backtrace) from [<c0012c38>] (show_stack+0x20/0x24)
[    2.569208] [<c0012c38>] (show_stack) from [<c05692f4>] (dump_stack+0x20/0x28)
[    2.579911] [<c05692f4>] (dump_stack) from [<c04036f0>] (dwc_otg_urb_enqueue+0x280/0x37c)
[    2.594272] [<c04036f0>] (dwc_otg_urb_enqueue) from [<c03d8104>] (usb_hcd_submit_urb+0xc8/0x984)
[    2.608421] [<c03d8104>] (usb_hcd_submit_urb) from [<c03d9c48>] (usb_submit_urb+0x2dc/0x4e0)
[    2.622110] [<c03d9c48>] (usb_submit_urb) from [<c03da3b0>] (usb_start_wait_urb+0x54/0xcc)
[    2.635512] [<c03da3b0>] (usb_start_wait_urb) from [<c03da4f8>] (usb_control_msg+0xd0/0x108)
[    2.648907] [<c03da4f8>] (usb_control_msg) from [<c03d2648>] (hub_port_init+0x438/0xb68)
[    2.663201] [<c03d2648>] (hub_port_init) from [<c03d46d8>] (hub_event+0x5dc/0xee8)
[    2.676040] [<c03d46d8>] (hub_event) from [<c003aee8>] (process_one_work+0x13c/0x490)
[    2.689073] [<c003aee8>] (process_one_work) from [<c003be74>] (worker_thread+0x178/0x52c)
[    2.702507] [<c003be74>] (worker_thread) from [<c0040374>] (kthread+0xe0/0xfc)
[    2.713030] [<c0040374>] (kthread) from [<c000e8b0>] (ret_from_fork+0x14/0x24)

My dma_to_virt()

static inline void *dma_to_virt(struct device *dev, dma_addr_t addr)
{

    if (dev) {
        unsigned long pfn = dma_to_pfn(dev, addr);
        void *p1, *p2;
        static bool dumped;

        p1 = phys_to_virt(__pfn_to_phys(pfn));
        p2 = (void *)__bus_to_virt((unsigned long)addr);
        if (p1 != p2) {
            if (!dumped) {
                pr_info("%s(%s): p1=%p, p2=%p\n", __func__, dev_name(dev), p1, p2);
                dumped = true;
                dump_stack();
            }
        }
    }

    return (void *)__bus_to_virt((unsigned long)addr);
}
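
One possible reading of the p1/p2 difference in the log is that the pfn-based path drops the offset within the page: p1 is page aligned, while p2 keeps the low 12 bits (0xb00). A userspace sketch of that effect follows. The linear mapping constants (bus offset 0x40000000, PAGE_OFFSET 0xC0000000), the function names, and the input bus address 0x5af0ab00 (back-computed from p2) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT  12
#define BUS_OFFSET  0x40000000u /* assumed bus alias of physical address 0 */
#define PAGE_OFFSET 0xC0000000u /* assumed start of the kernel linear map */

/* Direct translation: keeps every bit of the address. */
static uint32_t bus_to_virt(uint32_t bus)
{
    assert(bus >= BUS_OFFSET);
    return bus - BUS_OFFSET + PAGE_OFFSET;
}

/* Translation through a page frame number: the low 12 bits are lost. */
static uint32_t dma_to_virt_via_pfn(uint32_t bus)
{
    uint32_t pfn;

    assert(bus >= BUS_OFFSET);
    pfn = (bus - BUS_OFFSET) >> PAGE_SHIFT; /* sub-page offset dropped here */
    return (pfn << PAGE_SHIFT) + PAGE_OFFSET;
}
```

For the assumed input 0x5af0ab00 the direct path gives 0xdaf0ab00 and the pfn path gives 0xdaf0a000, the same pair as p2 and p1 in the log.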

notro commented 9 years ago

dwc_otg_urb_enqueue()

    buf = urb->transfer_buffer;
    if (hcd->self.uses_dma) {
        /*
         * Calculate virtual address from physical address,
         * because some class driver may not fill transfer_buffer.
         * In Buffer DMA mode virual address is used,
         * when handling non DWORD aligned buffers.
         */
        //buf = phys_to_virt(urb->transfer_dma);
        // DMA addresses are bus addresses not physical addresses!
        buf = dma_to_virt(&urb->dev->dev, urb->transfer_dma);
    }

notro commented 9 years ago

Then we have this comment in arch/arm/include/asm/dma-mapping.h

/*
 * dma_to_pfn/pfn_to_dma/dma_to_virt/virt_to_dma are architecture private
 * functions used internally by the DMA-mapping API to provide DMA
 * addresses. They must not be used by drivers.
 */