raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/
Other
10.87k stars 4.89k forks source link

RPI3/CM3: SPI clock dividers seem to be fixed to power of 2 when running at 400MHz vpu #2286

Closed msperl closed 6 years ago

msperl commented 6 years ago

Hi!

Is this a "known" feature of the SOC that the spi clock divider is limited to a power of 2 on a raspberry 3/CM3 (and actually rounding up the value)?

Observed frequencies are 12.5MHz when requesting 20MHz (effective divider is 32) and 6.75 when requesting 10MHz (effective divider is 64).

Surprisingly this is not the case when configuring: core_freq=250 in boot.config. In this case I get 17.85MHz and 9.6MHz SPI Clock rate respectively.

Or is it maybe firmware related?

Thanks, Martin

pelwell commented 6 years ago

The clock infrastructure in Pi3/CM3 (BCM2710) is no different to the other Pi SoCs. Which kernel version are you running?

msperl commented 6 years ago

4.9.57 - I did check the spi driver for any customization from upstream (I know it quite well ;) and there is no difference I can see.

Again: the divider with only power of 2 comes into play only on 400MHZ VPU clock rates and I can control it with the config.txt parameter core_freq=250, which is also confirmed via cat /sys/kernel/debug/clk/vpu/clk_speed.

So I am wondering if something else is moving the HW into a powerOf2 mode (which is the only officially supported mode, but which has (so far) proven to be not reflecting reality) - my guess would be firmware is changing some parameters when running at higher clock speeds that are only available after signing a NDA...

msperl commented 6 years ago

Just now I am building a 4.14.0 upstream only kernel and will give it a try there...

pelwell commented 6 years ago

Hang on - it's obvious.

20 250 / 400 = 12.5 10 250 / 400 = 6.25

The problem is nothing to do with powers-of-2 dividers - it's that the wrong frequency is being used when calculating the divisor. Both the SPI and I2C drivers should calculate the divisors for every transfer, but this appears not to be working. core_freq=250 stops the core clock from changing, avoiding the problem.

I suggest you instrument the SPI driver to report the value returned from clk_get_rate in bcm2835_spi_transfer_one, then run your test on a quiet system and on one with a shell running:

$ while true; do true; done
msperl commented 6 years ago

OK, that calculation sounds reasonable.

But there is ONE big issue with what you are proposing for the spi and i2C drivers: Assume we start a long 64 (DMA) transfer at VPU=250MHz and then in the middle of the transfer the firmware decides to switch to to 400MHz.

That would increase the effective frequency by a factor of 1.6 bringing the spi clock out of specs with the device.

The whole situation is tricky and would require more interaction between the SPI device and the firmware (or finally get the linux kernel to control the clocks for power states with a fallback in the firmware...).

So that also goes for the AUX devices (SPI1/2, uart), I2C, SPI, sdhost, but to some extend also to the PL11 serial port and dpi (both have two clocks defined).

For all practical purposes it seems that there is a requirement to keep the VPU fixed when using any of those ports outside of the firmware...

Or what do you recommend instead?

pelwell commented 6 years ago

First of all, this is not a proposal - it's what the drivers do now. Secondly, the firmware will never spontaneously (i.e without the ARM asking) increase the core frequency, but it will decrease it in event of an over-temperature or under-voltage situation.

I think we need to do the test I outlined to understand why this (slightly flawed) mechanism isn't working as well as expected, but in order to avoid overclocking the busses there would need to be an interlock between the cpufreq or cprman driver and drivers for blocks slaved to the core clock to prevent untimely frequency increases.

pelwell commented 6 years ago

But ultimately, freezing the core clock is the easiest and most reliable workaround which also works for the 8250 UART (not the PL011 - that has its own clock source), which is why enable_uart=1 while the 8250 mini UART is the primary/console UART locks the core clock to 250MHz.

msperl commented 6 years ago

At least for the spi driver clk_get_rate(bs->clk);gets called in every spi_transfer_onecall right at the start.

So the driver is doing the best it can...

On the 4.14.0 upstream I added debugging, but I always seem to get:

root@raspcm3:~# dmesg -C; cat /sys/kernel/debug/clk/vpu/clk_rate ; /usr/bin/cansend can0 "155#555555"; sleep 1; dmesg
250000003
[  473.988858] mcp2517fd spi0.0: clock_rate= 250000003
[  473.988928] mcp2517fd spi0.0: clock_rate= 250000003
[  473.989107] mcp2517fd spi0.0: clock_rate= 250000003
[  473.989204] mcp2517fd spi0.0: clock_rate= 250000003
[  473.989243] mcp2517fd spi0.0: clock_rate= 250000003
[  473.989288] mcp2517fd spi0.0: clock_rate= 250000003
root@raspcm3:~#

even when I explicitly set core_clock=400! - this works on a foundation kernel!

Fondation kernel looks like this:

root@raspcdmesg -C; cat /sys/kernel/debug/clk/vpu/clk_rate ; /usr/bin/cansend can0 "155#555555"; sleep 1; dmesg; cat /sys/kernel/debug/clk/vpu/clk_rate
400000000
[  141.917415] mcp2517fd spi0.0: clock_rate= 400000000
[  141.917467] mcp2517fd spi0.0: clock_rate= 400000000
[  141.917631] mcp2517fd spi0.0: clock_rate= 400000000
[  141.917701] mcp2517fd spi0.0: clock_rate= 400000000
[  141.917739] mcp2517fd spi0.0: clock_rate= 400000000
[  141.917772] mcp2517fd spi0.0: clock_rate= 400000000
[  141.917844] NOHZ: local_softirq_pending 08
400000000
msperl commented 6 years ago

I believe the worsted case will always be the uart - you can not control when you receive input, so any change will mean some potential headache... Spi and i2c could add clock notifications (and afair could potentially veto a clock change while a transfer occurrs)

pelwell commented 6 years ago

On the upstream kernel it would be interesting to know if the clock rate is wrong or the reporting of it - what does vcgencmd measure_clock core report?

msperl commented 6 years ago

I have extended logging now to see how things work in finer detail (dumping requested speed, clock, effective speed and divider) on a downstream 4.9.57 kernel:

root@raspcm3:~dmesg -C; cat /sys/kernel/debug/clk/vpu/clk_rate ; /usr/bin/cansend can0 "155#555555"; sleep 1; dmesg; cat /sys/kernel/debug/clk/vpu/clk_rate
400000000
[   83.847556] mcp2517fd spi0.0: clock/requested/used_hz/div= 400000000/12500000/12500000/32
[   83.847613] mcp2517fd spi0.0: clock/requested/used_hz/div= 400000000/12500000/12500000/32
[   83.847774] mcp2517fd spi0.0: clock/requested/used_hz/div= 400000000/12500000/12500000/32
[   83.847844] mcp2517fd spi0.0: clock/requested/used_hz/div= 400000000/12500000/12500000/32
[   83.847883] mcp2517fd spi0.0: clock/requested/used_hz/div= 400000000/12500000/12500000/32
[   83.847917] mcp2517fd spi0.0: clock/requested/used_hz/div= 400000000/12500000/12500000/32
400000000

In this case we even have a power of 2 divider - the breadboard cabling i use does not allow the use of 20MHz reliably...

But the logic analyzer measures 1.15us for 9 bits, which is equal to: 7826087Hz: capture (note that it is a known fact that for programmed IO the SPI controller will add 1 idle cycle between transfers - this does not happen with DMA transfers - hence 9 cycles)

If you instead take 250MHz and apply the divider you get: 7.8125MHz...

This is very close to what we see.

But you also see that the clock rates dumped as expected to 400MHz - so where is the issue?

msperl commented 6 years ago

here for the foundation kernel:

root@raspcm3:~# uname -a;vcgencmd measure_clock core; cat /sys/kernel/debug/clk/vpu/clk_rate
Linux raspcm3.intern.sperl.org 4.9.57ms7-v7+ #62 SMP Wed Nov 15 22:03:51 UTC 2017 armv7l GNU/Linux
frequency(1)=250000000
400000000
msperl commented 6 years ago

So there is a mismatch there as well!

But I remember that the clock driver assumes stable frequencies ONLY changed by itself.

Only in that case will it update the cached values - or at least it only takes the registers of the clock itself into account but does not ask for changes up the clock tree...

msperl commented 6 years ago

I remember we already had a discussion around sending notifications from the firmware to the linux kernel for clock changes - we never came to a conclusion!

pelwell commented 6 years ago

we never came to a conclusion!

That's because it's hard to come up with a scheme that achieves the goals without sacrificing either performance or safety (in the case of thermal limiting if the ARM kernel is malfunctioning).

In the downstream tree the cprman driver has special case code for the core clock (see https://github.com/raspberrypi/linux/commit/b76c8d538204e7d932afa8707070936f53d697f6) that asks the firmware for the maximum clock rate and reports that rather than the actual current rate.

msperl commented 6 years ago

That is really bad from a design perspective - is there any way we can fix that?

For completeness here the registers for the clocks Upstream 4.14:

root@raspcm3:~# uname -a;head /sys/kernel/debug/clk/vpu/clk_rate /sys/kernel/debug/clk/vpu/regdump /sys/kernel/debug/clk/plla_core/clk_rate /sys/kernel/debug/clk/plla_core/regdump  /sys/kernel/debug/clk/plla/clk_rate /sys/kernel/debug/clk/plla/regdump ; vcgencmd measure_clock core
Linux raspcm3.intern.sperl.org 4.14.0msu7+ #168 Fri Nov 24 12:51:16 UTC 2017 armv7l GNU/Linux
==> /sys/kernel/debug/clk/vpu/clk_rate <==
250000003

==> /sys/kernel/debug/clk/vpu/regdump <==
ctl = 0x00000245
div = 0x00004000

==> /sys/kernel/debug/clk/plla_core/clk_rate <==
1000000012

==> /sys/kernel/debug/clk/plla_core/regdump <==
cm = 0x0000028a
a2w = 0x00000002

==> /sys/kernel/debug/clk/plla/clk_rate <==
2000000024

==> /sys/kernel/debug/clk/plla/regdump <==
cm_ctrl = 0x0000028a
a2w_ctrl = 0x00021034
frac = 0x00015556
ana0 = 0x00000000
ana1 = 0x00144000
ana2 = 0x00000000
ana3 = 0x00000100
VCHI initialization failed

Unfortunately the vchiq driver does not load to use the logging command to dump the config.

Downstream (with your patch):

root@raspcm3:~# uname -a;head /sys/kernel/debug/clk/vpu/clk_rate /sys/kernel/debug/clk/vpu/regdump /sys/kernel/debug/clk/plla_core/clk_rate /sys/kernel/debug/clk/plla_core/regdump  /sys/kernel/debug/clk/plla/clk_rate /sys/kernel/debug/clk/plla/regdump ; vcgencmd measure_clock core
Linux raspcm3.intern.sperl.org 4.9.57ms7-v7+ #62 SMP Wed Nov 15 22:03:51 UTC 2017 armv7l GNU/Linux
==> /sys/kernel/debug/clk/vpu/clk_rate <==
400000000

==> /sys/kernel/debug/clk/vpu/regdump <==
ctl = 0x00000245
div = 0x00004000

==> /sys/kernel/debug/clk/plla_core/clk_rate <==
1000000012

==> /sys/kernel/debug/clk/plla_core/regdump <==
cm = 0x0000028a
a2w = 0x00000002

==> /sys/kernel/debug/clk/plla/clk_rate <==
2000000024

==> /sys/kernel/debug/clk/plla/regdump <==
cm_ctrl = 0x0000028a
a2w_ctrl = 0x00021034
frac = 0x00015556
ana0 = 0x00000000
ana1 = 0x00144000
ana2 = 0x00000000
ana3 = 0x00000100
frequency(1)=250000000

I would need to go thru the code to calculate the actual values based on those live regdump values - these are the actual register values read (no caching)...

But for most practical purposes we have a divider of 4 on the VPU relative to the parent. Parent being index 5, which is plla_core.

Sothe core_clock has to be set to 250 whenever i2c, spi or uart is used?

msperl commented 6 years ago

As for a safe design here maybe a proposal:

So I believe all the system protection cases by the firmware can first get delegated to the kernel and if it does not react in time (or things become critical before the kernel can react), then the FW can act as it does now: clock down, disable power to USB,... without kernel consent (but a possible notification of the fact)

Does this solve the concerns?

pelwell commented 6 years ago

I advise setting the core frequency to 250MHz if using any of the core-clock-dependent peripherals if you care about getting close to the frequency you ask for. Pi 3 users who have enabled the default UART are already in this position, so I don't see it as a big price to pay.

pelwell commented 6 years ago

We're not going to change the clocking strategy without very good reason.