hoglet67 / RGBtoHDMI

Bare-metal Raspberry Pi project that provides pixel-perfect sampling of Retro Computer RGB/YUV video and conversion to HDMI
GNU General Public License v3.0

Use PLLA or PLLD as the GPCLK source #9

Closed hoglet67 closed 5 years ago

hoglet67 commented 5 years ago

Seems this should be possible.

Advantage is that the core clock speed can remain at 400MHz.

Not clear that it can be set via the firmware property interface.

So would need to include some code to set the PLL registers directly.

Possible code to do this here: https://github.com/F5OEO/librpitx/blob/8d0ff49fa7d7dc5f98c83a9876000254ae9edbaf/src/gpio.cpp#L139

hoglet67 commented 5 years ago

Another thought: this would probably overcome the issue preventing us using the latest firmware: https://github.com/raspberrypi/firmware/issues/1022

hoglet67 commented 5 years ago

I've made an attempt at this, in the end using PLLD.

I believe that:

It would be good to make sure this is the case!

I picked PLLD as it's enabled by default, and gives a GPCLK source of 500MHz (prior to the GPCLK divider). I believe the PLL is actually running at 1000MHz in this case.

Rather arbitrarily, I've allowed the PLL range to be 1000MHz to 1500MHz, giving a GPCLK source of 500MHz to 750MHz. That should be sufficient range, given we can set the GPCLK divider to pretty much any integer value we like.

I couldn't find a mailbox set_clock call that worked for either PLLA or PLLD, so in the end I re-used the same code we've been using to set PLLH by writing to the registers.

I'll push this to dev now, and if you spot any issues Ian then please let me know. I've tested all the profiles, and none of them hang, but that's as far as I've been able to test.

There are two main benefits to making this change:

  1. We can now set the core clock to 400MHz independently, which might help in some of the capture loops
  2. We can now use the latest firmware (I did just quickly try this, and it's fine).
hoglet67 commented 5 years ago

Some links about the PLLs and/or clock tree: https://patchwork.kernel.org/patch/8455241/ https://github.com/miegl/PiFmAdv/blob/master/src/pi_fm_adv.c#L409 https://raspberrypi.stackexchange.com/questions/83583/what-are-the-vcgencmd-measure-clock-26-and-50?rq=1

IanSB commented 5 years ago

This seems to work OK so far with the PC-based profiles on the Pi Zero. It also seems to work a lot better on the Pi 3, and now works on the 3+ with new firmware: Mode 0 double height no longer has glitches, and I can even select double height and width with no glitches, but strangely double width on its own has glitches. On the PC side, CGA mode (14.3MHz clock) is glitch-free in double width and in double height & width, but has glitches in double height only, which is different behaviour to BBC mode 0.

I can even get glitch-free EGA (640x350) in double height & width on the Pi 3, which doesn't work on the Pi Zero, although the Pi 3 still won't work at VGA resolutions. There is definitely something strange about the behaviour of the Pi 3 when some lower-bandwidth things don't work properly and some higher-bandwidth things do. I tried setting the core to 500MHz and that helped a bit with VGA, but CGA double height had exactly the same glitches, so perhaps there is still some dynamic switching feature on the Pi 3 that needs to be disabled.

Mode 7 on the Pi3 is still a bit glitchy but less so and I suspect some reworking of the capture loop might fix that so maybe I'll look at that someday. Overall there seems to be more headroom as I suppose the core memory speed is no longer reduced to match the sample clock. The Pi3 still won't run at the highest VGA speeds but it certainly has more potential to work with all the standard TV and PC CGA/EGA modes in some form or other if mode 7 can be fixed. Seems like it's worth updating the firmware to the latest one now as well.

hoglet67 commented 5 years ago

Good to hear you've not hit any issues with this.

I'm still slightly nervous about what else might be using PLLD by default. I do wonder if it's being used by the SDRAM, and by changing it from 500MHz to (up to) 750MHz we are inadvertently overclocking it quite significantly.

Also, I suspect the PLL is actually running at 4x this rate. The documented range of the PLLs is 600MHz-2400MHz, so we should probably use 400MHz->600MHz, not 500MHz->750MHz.

The highest clock we need to generate is 192MHz isn't it? What other highish ones are there?

I'm going to investigate this some more today, and see if I can somehow dump the whole clock tree.

So in the mean time, hold off making any other changes that rely on this being in place.

IanSB commented 5 years ago

The highest PC clocks are:
VGA text: 28.322MHz x 6 = 169.932MHz
VGA graphics: 25.175MHz x 6 = 151.05MHz

I noticed that with the old code, the highest pixel clocks didn't necessarily produce the highest PLL clock speed, due to the dividers selected, so that may still be true.

Some other pixel clocks used by PCs: 16MHz, 16.257MHz, 14.318181MHz, 14.161MHz, 14.171MHz, 18.013483MHz

hoglet67 commented 5 years ago

There's good evidence that PLLD is used as the parent clock source for the SDRAM PLL:

    plld                                  3            3  2000000024          0 0
       plld_dsi1                          0            0     7812501          0 0
       plld_dsi0                          0            0     7812501          0 0
       plld_per                           3            3   500000006          0 0
          gp1                             1            1    25000000          0 0
          hsm                             0            0   163682866          0 0
          uart                            1            2    47999625          0 0
       plld_core                          2            2   500000006          0 0
          sdram 

(https://raspberrypi.stackexchange.com/questions/83583/what-are-the-vcgencmd-measure-clock-26-and-50?rq=1)

So I've been investigating using PLLA instead.

My understanding is the PLLA is generally only used when you specify GPU clocks values that cannot be generated from the Core PLL (PLLC), and we have no need to do that.

These are the initial PLLA register values:

DEBUG: PLLA: 1200.000000
DEBUG: PLLA: PDIV=1 NDIV=62 CTRL=0002103e FRAC=524288 DSI0=256 CORE=2 PER=256 CCP2=256
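The 1200.000000 figure in the log above can be reproduced from the PDIV, NDIV and FRAC values. The following is only a sketch: the 19.2MHz crystal reference, the field positions inside the CTRL word and the 20-bit fraction are assumptions (they happen to match the logged numbers), and the function names are hypothetical rather than taken from the project source.

```c
#include <stdint.h>

/* Assumption: 19.2MHz crystal reference feeding the PLL. */
#define XOSC_HZ 19200000.0

/* Assumed CTRL layout: NDIV in bits 9..0, PDIV in bits 14..12. */
static unsigned pll_ndiv(uint32_t ctrl) { return ctrl & 0x3ffu; }
static unsigned pll_pdiv(uint32_t ctrl) { return (ctrl >> 12) & 0x7u; }

/* FRAC is treated as a 20-bit fraction, so 524288 / 2^20 = 0.5. */
static double pll_freq_hz(uint32_t ctrl, uint32_t frac)
{
   unsigned pdiv = pll_pdiv(ctrl);
   if (pdiv == 0) {
      return 0.0; /* avoid dividing by zero on an unprogrammed PLL */
   }
   return XOSC_HZ * ((double) pll_ndiv(ctrl) + (double) frac / 1048576.0) / (double) pdiv;
}
```

With CTRL=0002103e and FRAC=524288 this gives 19.2MHz x 62.5 = 1200MHz, in line with the DEBUG output.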

There are four dividers following PLLA: DSI0, CORE, PER and CCP2.

A value of 256 means the divider is disabled.

This explains why trying to use PLLA_PER as a GPCLK source has not worked for people - it's off by default.

So, I've been messing around with some code that will do two things:

It's a bit fiddly to do this, because it's all rather undocumented, and there are various controls to prevent accidentally writing to the registers. The code to do this (configure_plla) is not long though: https://github.com/hoglet67/RGBtoHDMI/blob/dev/src/rgb_to_hdmi.c#L839

Anyway, the result is:

DEBUG: PLLA: 1200.000000
DEBUG: PLLA: PDIV=1 NDIV=62 CTRL=0002103e FRAC=524288 DSI0=256 CORE=256 PER=4 CCP2=256

So CORE is now disabled, and PER enabled with a divider of 4.
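As a rough illustration of how the divider fields in these logs can be read: a value of 256 corresponds to bit 8 being set, which the thread describes as "disabled", while the low 8 bits hold the divide ratio. The macro and function names below are hypothetical, and treating a zero ratio as "no output" is an assumption made to keep the sketch safe.

```c
#include <stdint.h>

#define PLL_CHANNEL_DISABLE  0x100u /* bit 8: shows up as 256 in the logs */
#define PLL_CHANNEL_DIV_MASK 0x0ffu /* assumed: divide ratio in bits 7..0 */

static int channel_enabled(uint32_t div_reg)
{
   return (div_reg & PLL_CHANNEL_DISABLE) == 0;
}

/* Returns the channel output in Hz, or 0 if the channel is disabled. */
static uint64_t channel_freq_hz(uint64_t pll_hz, uint32_t div_reg)
{
   uint32_t div = div_reg & PLL_CHANNEL_DIV_MASK;
   if (!channel_enabled(div_reg) || div == 0) {
      return 0; /* assumption: treat a zero ratio as no usable output */
   }
   return pll_hz / div;
}
```

Applied to the clock tree quoted earlier, a 2000000024Hz PLLD with a divider of 4 gives the 500000006Hz shown for plld_per.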

With these values, our system will run nicely using PLLA as the source.

The PLLA_PER clock range is currently 400MHz-600MHz - this should allow any clock we need up to 200MHz to be generated accurately with an integer GPCLK divider of >= 2.
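To make the divider argument concrete, here is a minimal sketch that searches for the smallest integer GPCLK divider (>= 2) that places the required source frequency inside the 400MHz-600MHz window. The function name and constants are hypothetical, and the real code may choose the divider differently.

```c
#include <stdint.h>

#define PER_MIN_HZ 400000000ull /* lower end of the PLLA_PER range above */
#define PER_MAX_HZ 600000000ull /* upper end of the PLLA_PER range above */

/* Returns the smallest integer divider >= 2 such that
 * gpclk_hz * divider lies within [PER_MIN_HZ, PER_MAX_HZ],
 * or 0 if no such divider exists. */
static unsigned choose_gpclk_divider(uint64_t gpclk_hz)
{
   if (gpclk_hz == 0) {
      return 0;
   }
   for (unsigned div = 2; gpclk_hz * div <= PER_MAX_HZ; div++) {
      if (gpclk_hz * div >= PER_MIN_HZ) {
         return div;
      }
   }
   return 0;
}
```

For the two VGA sampling clocks mentioned earlier (169.932MHz and 151.05MHz) this picks a divider of 3, and for exactly 200MHz it picks 2. Because the windows for consecutive dividers overlap once the divider is 2 or more, every target up to 200MHz has a valid divider.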

I've just pushed this.

Dave

IanSB commented 5 years ago

This new version breaks high PC clock rates. The 28MHz clock is unusable, and the screen is more glitches than video. The 25MHz clock has a few minor glitches. Some log snippets:

INFO: Error adjusted clock = 28321714 Hz
INFO: GPCLK Divisor = 3
INFO: Target PLL frequency = 1019581728 Hz
INFO: Actual PLL frequency = 1019581728 Hz
INFO: Core clock freq = 400000000 Hz
INFO: Lines per frame = 449, (448.998)

INFO: Error adjusted clock = 25174627 Hz
INFO: GPCLK Divisor = 3
INFO: Target PLL frequency = 906286590 Hz
INFO: Actual PLL frequency = 906286590 Hz
INFO: Core clock freq = 400000000 Hz
INFO: Lines per frame = 525, (524.997)

hoglet67 commented 5 years ago

I can look at that now.

Which profile exactly are these?

hoglet67 commented 5 years ago

I've just tried the TSENG ET3000AX/VGA Text.

The correct (ish) frequency clock is being output, i.e. ~170MHz, so I've not made a stupid error with the dividers.

So this has to be something more subtle.

Forget for now yesterday's PLLD version, as the SDRAM could well have been very overclocked.

Did these profiles work perfectly in the previous version that varied the Core Clock?

IanSB commented 5 years ago

Yes, although it would be nice to use the direct register code with the core clock PLL as that would allow the new firmware to be used and also confirm there is no other issue with the new PLL code.

hoglet67 commented 5 years ago

I've hacked an Atom profile to use a 28.63MHz clock, and that seems to work correctly for me:

sampling=1,1,1,1,1,1,1,0,6,1,0,0
geometry=268,11,1216,240,304,240,0,8,28636363,1824,0,262,4,0
palette=8

On the scope I can see it using a 171.8MHz clock, and there is no jitter/noise at all.

Each pixel is being sampled 4 times, so the horizontal capture width is 1216 pixels.

If I enable Double Height, I see a tiny amount of jitter. Double Width, and Double Height + Double Width seem better.

Let me see if I can make a version that uses the Core Clock PLL, but varies it with direct register access.

IanSB commented 5 years ago

This is what I get with the profile that worked with the core clock:

[Image: capture29]

The first few pixels on each line are good but as it progresses across the line, more and more PSYNC edges are missed.

hoglet67 commented 5 years ago

Which firmware are you using?

Can you include this part of the debug boot log:

INFO: Profile = Atom_V6
INFO: 
INFO: 
INFO: **********     Raspberry Pi RGB to HDMI Convertor     **********
INFO: 
INFO: 
INFO:     FIRMWARE_VERSION : 5c9b9ad8 
INFO: 
INFO:          BOARD_MODEL : 00000000 
INFO: 
INFO:       BOARD_REVISION : 009000c1 
INFO: 
INFO:    BOARD_MAC_ADDRESS : baeb27b8 
INFO:    BOARD_MAC_ADDRESS : 5f38074b 
INFO: 
INFO:         BOARD_SERIAL : eaba4b07 
INFO:         BOARD_SERIAL : 00000000 
INFO: 
INFO:            EMMC_FREQ :    200.000 MHz    200.000 MHz    200.000 MHz state=1
INFO:            UART_FREQ :     48.000 MHz   1000.000 MHz   1000.000 MHz state=1
INFO:             ARM_FREQ :   1000.000 MHz   1000.000 MHz   1000.000 MHz state=1
INFO:            CORE_FREQ :    400.000 MHz    400.000 MHz    400.000 MHz state=1
INFO:             V3D_FREQ :    300.000 MHz    300.000 MHz    300.000 MHz state=1
INFO:            H264_FREQ :    300.000 MHz    300.000 MHz    300.000 MHz state=1
INFO:             ISP_FREQ :    300.000 MHz    300.000 MHz    300.000 MHz state=1
INFO:           SDRAM_FREQ :    450.000 MHz    450.000 MHz    450.000 MHz state=1
INFO:           PIXEL_FREQ :    148.500 MHz  -1894.967 MHz  -1894.967 MHz state=1
INFO:             PWM_FREQ :      0.000 MHz    500.000 MHz    500.000 MHz state=0
INFO:            CORE TEMP :  42.77 °C
INFO:         CORE VOLTAGE :   1.35 V
INFO:      SDRAM_C VOLTAGE :   1.20 V
INFO:      SDRAM_P VOLTAGE :   1.20 V
INFO:      SDRAM_I VOLTAGE :   1.20 V
INFO:             CMD_LINE : bcm2708_fb.fbwidth=1920 bcm2708_fb.fbheight=1080 bcm2708_fb.fbdepth=16 bcm2708_fb.fbswap=1 dma.dmachans=0x7f35 bcm2708.boardrev=0x9000c1 bcm2708.serial=0xeaba4b07 bcm2708.uart_clock=48000000 bcm2708.disk_led_gpio=47 smsc95xx.macaddr=B8:27:EB:BA:4B:07 vc_mem.mem_base=0x1fa00000 vc_mem.mem_size=0x20000000  sampling06=3 sampling7=0,2,2,2,2,2,2,0,8,5 info=1 palette=0 deinterlace=6 scanlines=0 mux=0 elk=0 vsync=0 vlockmode=0 vlockline=5 nbuffers=2 debug=0 m7disable=0 keymap=123233 return=1
IanSB commented 5 years ago

I was using the most recent firmware that also worked with the PI3+ but I just reverted back to the standard version and that made no difference. Here's the log:

INFO: **********     Raspberry Pi RGB to HDMI Convertor     **********
INFO:
INFO:
INFO:     FIRMWARE_VERSION : 5a9d7465
INFO:
INFO:          BOARD_MODEL : 00000000
INFO:
INFO:       BOARD_REVISION : 00900093
INFO:
INFO:    BOARD_MAC_ADDRESS : a6eb27b8
INFO:    BOARD_MAC_ADDRESS : 5f38b64e
INFO:
INFO:         BOARD_SERIAL : 3fa64eb6
INFO:         BOARD_SERIAL : 00000000
INFO:
INFO:            EMMC_FREQ :    250.000 MHz    250.000 MHz    250.000 MHz state=1
INFO:            UART_FREQ :     48.000 MHz   1000.000 MHz   1000.000 MHz state=1
INFO:             ARM_FREQ :   1000.000 MHz   1000.000 MHz   1000.000 MHz state=1
INFO:            CORE_FREQ :    400.000 MHz    400.000 MHz    400.000 MHz state=1
INFO:             V3D_FREQ :    300.000 MHz    300.000 MHz    300.000 MHz state=1
INFO:            H264_FREQ :    300.000 MHz    300.000 MHz    300.000 MHz state=1
INFO:             ISP_FREQ :    300.000 MHz    300.000 MHz    300.000 MHz state=1
INFO:           SDRAM_FREQ :    450.000 MHz    450.000 MHz    450.000 MHz state=1
INFO:           PIXEL_FREQ :    148.500 MHz  -1894.967 MHz  -1894.967 MHz state=1
INFO:             PWM_FREQ :      0.000 MHz    500.000 MHz    500.000 MHz state=0
INFO:            CORE TEMP :  41.16 °C
INFO:         CORE VOLTAGE :   1.35 V
INFO:      SDRAM_C VOLTAGE :   1.20 V
INFO:      SDRAM_P VOLTAGE :   1.20 V
INFO:      SDRAM_I VOLTAGE :   1.20 V
INFO:             CMD_LINE : bcm2708_fb.fbwidth=1920 bcm2708_fb.fbheight=1080 bcm2708_fb.fbdepth=16 bcm2708_fb.fbswap=1 dma.dmachans=0x7f35 bcm2708.boardrev=0x900093 bcm2708.serial=0x3fa64eb6 bcm2708.uart_clock=48000000 bcm2708.disk_led_gpio=47 smsc95xx.macaddr=B8:27:EB:A6:4E:B6 vc_mem.mem_base=0x1fa00000 vc_mem.mem_size=0x20000000  console=ttyAMA0,115200 kgdboc=ttyAMA0,115200 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 rootwait
hoglet67 commented 5 years ago

I've created a test build in branch pllc_direct_access.

This uses PLLC with direct register access.

Changes are quite small: 1e909110

There are a few UART glitches when the frequency changes, but just ignore that.

It would be great if you could test this, with and without this line commented:

   // Enable the PLLA_PER divider
   //configure_plla(PLL_PER_DIVIDER);

(it should make no difference to the use of PLLC, but who knows....)

IanSB commented 5 years ago

OK, I found the problem: I had merged in my recent pull request when building, and the problem is caused by uncommenting // #define USE_PROPERTY_INTERFACE_FOR_FB. When the FB is created by the property interface call I get glitches; when it is created by the deprecated method it works.

(This is with PLLA)

hoglet67 commented 5 years ago

Great, that explains why I couldn't reproduce it at this end.

Shall I merge the updated pull request?

hoglet67 commented 5 years ago

I've just compared logs with and without USE_PROPERTY_INTERFACE_FOR_FB. The only significant difference I can see is in the alignment of the frame buffer returned:
(1) Framebuffer address: DF7FD000 (without USE_PROPERTY_INTERFACE_FOR_FB)
(2) Framebuffer address: DE79AEAC (with USE_PROPERTY_INTERFACE_FOR_FB)

With my Atom VX Test Profile (28MHz Pixel Clock, 6bpp sampling, Normal FM Size), I do see a big difference between them. (1) is perfect, (2) is definitely not: [Images: capture1, capture2]

Ignore the aspect ratio, the frame buffer is 1920x270

hoglet67 commented 5 years ago

If you force an alignment, the issue goes away!
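The difference between the two framebuffer addresses reported above is easy to see by extracting the lowest set bit of each address. This is a small sketch with hypothetical helper names; it only inspects the addresses and says nothing about why alignment affects capture timing.

```c
#include <stdint.h>

/* Largest power of two that divides addr (0 for addr == 0). */
static uint32_t natural_alignment(uint32_t addr)
{
   return addr & (~addr + 1u);
}

/* alignment must be a power of two. */
static int is_aligned(uint32_t addr, uint32_t alignment)
{
   return (addr & (alignment - 1u)) == 0;
}
```

DF7FD000 (the working case) is aligned to 0x1000, i.e. a 4KB boundary, whereas DE79AEAC (the glitchy case) is only aligned to 4 bytes.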

IanSB commented 5 years ago

Shall I merge the updated pull request?

Yes, it's only using USE_PROPERTY_INTERFACE_FOR_FB on the Pi 3 now, and I suspect a lot of the performance issues are down to that. However, the latest PLLA build won't run at all on the Pi 3; it just hangs during boot at:

INFO: RGB to HDMI booted

Any ideas what's causing that?

It does run with the PLLC build so maybe the Pi3 is using something extra in PLLA

If you force an alignment, the issue goes away!

I recall the early ARMs used paged mode access to RAM for a speed improvement (they had no cache) so maybe the unaligned buffer is causing a slowdown in RAM access to the uncached screen.

If we can get it running on the PI3 again I'd be interested in the performance difference with a forced alignment

hoglet67 commented 5 years ago

OK, I've added a 64KB Alignment, and gone back to USE_PROPERTY_INTERFACE_FOR_FB for all models. That's now working with my Atom VX test case with no jitter.

This is pushed now to dev.

Regarding the Pi 3 issue, has it ever worked with a build that uses PLLA?

Is your log in Debug mode? I'm surprised you don't get a few more lines, e.g. the Cache being enabled.

INFO: ***********************RESET***********************
INFO: RGB to HDMI booted
DEBUG: enable_MMU_and_IDCaches
DEBUG: PLLA: 1200.000000
DEBUG: PLLA: PDIV=1 NDIV=62 CTRL=0002103e FRAC=524288 DSI0=256 CORE=2 PER=256 CCP2=256
DEBUG: PLLA: 1200.000000
DEBUG: PLLA: PDIV=1 NDIV=62 CTRL=0002103e FRAC=524288 DSI0=256 CORE=256 PER=4 CCP2=256
DEBUG: Setting up divisor
DEBUG: A GP_CLK1_DIV = 00000000
DEBUG: B GP_CLK1_CTL = 00000200
DEBUG: C GP_CLK1_CTL = 00000004
DEBUG: D GP_CLK1_CTL = 00000004
DEBUG: E GP_CLK1_CTL = 00000004
DEBUG: F GP_CLK1_CTL = 00000014
DEBUG: G GP_CLK1_CTL = 00000094
DEBUG: H GP_CLK1_DIV = 00006000
DEBUG: Done setting up divisor
INFO: CPLD  Design: Normal

I suspect disabling the PLLA_CORE divider is causing the problem. It's possible that it's used on the Pi 3 for the UART.

I don't have a functional Pi 3 to test with.

Can you try commenting out these lines:

   // Disable PLLA_CORE divider (to check it's not being used!)
   CM_PLLA            = CM_PASSWORD | (((CM_PLLA) & ~CM_PLLA_LOADCORE) | CM_PLLA_HOLDCORE);
   gpioreg[PLLA_CORE] = CM_PASSWORD | (A2W_PLL_CHANNEL_DISABLE);

Could you post a full debug log, regardless of whether this helps or not.

IanSB commented 5 years ago

No, it's never worked with the PLLA build, just the PLLC and PLLD builds

Here's a debug log of the PI3 hang:

INFO: RGB to HDMI booted
DEBUG: enable_MMU_and_IDCaches
DEBUG: PLLA: 1200.000000
DEBUG: PLLA: PDIV=1 NDIV=62 CTRL=0002103e FRAC=524288 DSI0=256 CORE=2 PER=256 CCP2=256
DEBUG: PLLA: 1200.000000
DEBUG: PLLA: PDIV=1 NDIV=62 CTRL=0002103e FRAC=524288 DSI0=256 CORE=256 PER=4 CCP2=256
DEBUG: Setting up divisor
DEBUG: A GP_CLK1_DIV = 00014000
DEBUG: B GP_CLK1_CTL = 00000096
DEBUG: C GP_CLK1_CTL = 00000084

EDIT:

Can you try commenting out these lines:

That didn't work but it looks like it's hanging up in init_gpclk which is before that anyway

hoglet67 commented 5 years ago

Yes, I agree it's hanging in init_gpclk. So this line is interesting:

DEBUG: B GP_CLK1_CTL = 00000096

(bits 7, 4, 2, 1) That's the initial value of the GPCLK1 control register before we do anything.

On a Pi Zero it's:

DEBUG: B GP_CLK1_CTL = 00000200

(bits 8)

The bits are:

3..0 are the clock source (0=disabled, 6=PLLD)

4 - CM_GP1CTL_ENAB
7 - CM_GP1CTL_BUSY
8 - CM_GP1CTL_BUSYD

So it looks like on the Pi 3 the GPCLK1 is running by default. I think that might be because it's used for the Ethernet Phy clock, so the firmware has already configured it.
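The bit fields listed above can be pulled apart with a few masks. The sketch below uses hypothetical names and decodes only the fields described in this comment (source in bits 3..0, ENAB in bit 4, BUSY in bit 7); any other bits are ignored.

```c
#include <stdint.h>

#define GPCLK_CTL_SRC_MASK 0x0fu      /* bits 3..0: clock source */
#define GPCLK_CTL_ENAB     (1u << 4)  /* bit 4: enable */
#define GPCLK_CTL_BUSY     (1u << 7)  /* bit 7: clock generator running */

struct gpclk_ctl {
   unsigned src;  /* 0 = disabled, 6 = PLLD (per the list above) */
   int      enab;
   int      busy;
};

static struct gpclk_ctl decode_gpclk_ctl(uint32_t value)
{
   struct gpclk_ctl ctl;
   ctl.src  = value & GPCLK_CTL_SRC_MASK;
   ctl.enab = (value & GPCLK_CTL_ENAB) != 0;
   ctl.busy = (value & GPCLK_CTL_BUSY) != 0;
   return ctl;
}
```

Decoding the Pi 3 value 00000096 gives source 6 (PLLD) with ENAB and BUSY both set, i.e. a clock that is already running, whereas the Pi Zero value 00000200 gives source 0 with neither flag set.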

Possibly I need to cleanly stop it first.

But that wouldn't explain why it starts up when using PLLC and PLLD.

So more likely, I think, is that PLLA_PER is not running (that caused me a hang, but later on, after the log line marked F).

I'll have a think.

IanSB commented 5 years ago

OK, I've added a 64KB Alignment, and gone back to USE_PROPERTY_INTERFACE_FOR_FB for all models.

I just tried adding the 64K alignment to the PLLC code to see how that ran on the PI3 and it doesn't work, it behaves exactly the same way as using the deprecated call on the PI3 so I assume that call is actually working but setting up a similar alignment which for some reason doesn't work properly on the PI3. Using either the deprecated call or the official call with alignment results in a full initialisation and blank screen but the code then calls rgbtofb and hangs up in that so I think it must have an incorrect address for the screen buffer at least.

hoglet67 commented 5 years ago

I'm confused - I thought you had the Pi 3 "sort-of" working yesterday. What's different now?

IanSB commented 5 years ago

I'm confused - I thought you had the Pi 3 "sort-of" working yesterday. What's different now?

I was using the PLLC build and changed:

     RPI_PropertyAddTag(TAG_ALLOCATE_BUFFER);
     to
     RPI_PropertyAddTag(TAG_ALLOCATE_BUFFER, 0x10000);

and that's enough to stop it working on the Pi3 with the same symptoms as using the deprecated call

hoglet67 commented 5 years ago

Try a smaller value of alignment, 0x1000 or 0x20 for example.

What framebuffer addresses are coming back?

hoglet67 commented 5 years ago

I've just pushed a change to dev that might get slightly further through gpclk_init() on the Pi 3.

Could you give it a try, and post a debug log please?

IanSB commented 5 years ago

What framebuffer addresses are coming back.

FD8B57E8 - no alignment
FF884000 - using 0x10, 0x100 or 0x1000 alignment
FF880000 - using 0x10000
FF800000 - using 0x100000

Only the unaligned one works. The hang appears to be in rgbtofb, as that is called but never returns, even when keys are pressed. I'm not really sure what would cause a hang in there, unless writing to non-existent screen RAM is causing an unhandled exception. What happens when you write to unallocated memory space?

IanSB commented 5 years ago

I've just pushed a change to dev that might get slightly further through gpclk_init() on the Pi 3

It gets a little further:

INFO: RGB to HDMI booted
DEBUG: enable_MMU_and_IDCaches
DEBUG: PLLA: 1200.000000
DEBUG: PLLA: PDIV=1 NDIV=62 CTRL=0002103e FRAC=524288 DSI0=256 CORE=2 PER=256 CCP2=256
DEBUG: PLLA: 1200.000000
DEBUG: PLLA: PDIV=1 NDIV=62 CTRL=0002103e FRAC=524288 DSI0=256 CORE=256 PER=4 CCP2=256
DEBUG: Setting up divisor
DEBUG: A GP_CLK1_DIV = 00014000
DEBUG: B GP_CLK1_CTL = 00000096
DEBUG: C GP_CLK1_CTL = 00000086
DEBUG: D GP_CLK1_CTL = 00000006
DEBUG: E GP_CLK1_CTL = 00000004
DEBUG: F GP_CLK1_CTL = 00000014
hoglet67 commented 5 years ago

I wonder if PLLA is completely disabled on the Pi 3. Could you add the following to the start of gpclk_init():

log_debug("XOSC_CTRL = %08x", gpioreg[XOSC_CTRL]);

Bit 6 is A2W_XOSC_CTRL_PLLAEN

Bit 18 is A2W_XOSC_CTRL_PLLAOK

On the Pi Zero, I think this defaults to 0x000ff0ff

On the frame buffer alignment issue, that's really weird. I'll need to do some more digging.

Not having a functional Pi 3 is going to make this a pain to debug!

IanSB commented 5 years ago

Could you add the following to the start of gpclk_init():

INFO: ***********************RESET***********************
INFO: RGB to HDMI booted
DEBUG: enable_MMU_and_IDCaches
DEBUG: PLLA: 1200.000000
DEBUG: PLLA: PDIV=1 NDIV=62 CTRL=0002103e FRAC=524288 DSI0=256 CORE=2 PER=256 CCP2=256
DEBUG: PLLA: 1200.000000
DEBUG: PLLA: PDIV=1 NDIV=62 CTRL=0002103e FRAC=524288 DSI0=256 CORE=256 PER=4 CCP2=256
DEBUG: Setting up divisor
DEBUG: XOSC_CTRL = 000ff0ff
DEBUG: A GP_CLK1_DIV = 00014000
DEBUG: B GP_CLK1_CTL = 00000096
DEBUG: C GP_CLK1_CTL = 00000086
DEBUG: D GP_CLK1_CTL = 00000006
DEBUG: E GP_CLK1_CTL = 00000004
DEBUG: F GP_CLK1_CTL = 00000014
hoglet67 commented 5 years ago

Hmm, thanks. That looks fine. I'm out of suggestions for now.

Tomorrow I'll dig out a Pi 2 that I have somewhere, and see if that also has the same problem.

hoglet67 commented 5 years ago

It seems I can actually use the Pi 2 to work on both of these issues.

So first the frame buffer address/alignment issue, when using the property tag interface.

I can confirm that frame buffer addresses that start with 0xFF are definitely bogus, because that maps to where the hardware registers are placed. Hence the nasty crash on clear_screen.
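One way to sanity-check a returned framebuffer address is sketched below. It rests on two assumptions not stated in this thread: that the returned value is a bus address whose top two bits are an alias and can be masked off, and that the Pi 2/3 peripheral registers start at ARM physical address 0x3F000000. The names are hypothetical.

```c
#include <stdint.h>

#define BUS_ALIAS_MASK  0x3fffffffu /* assumption: strip the bus alias bits */
#define PERIPHERAL_BASE 0x3f000000u /* assumption: Pi 2/3 peripheral base */

static uint32_t bus_to_arm(uint32_t bus_addr)
{
   return bus_addr & BUS_ALIAS_MASK;
}

/* True if the framebuffer would start inside the register window. */
static int fb_overlaps_peripherals(uint32_t bus_addr)
{
   return bus_to_arm(bus_addr) >= PERIPHERAL_BASE;
}
```

Under those assumptions, the aligned addresses reported earlier (FF884000, FF880000, FF800000) all land in the register window, while the unaligned FD8B57E8 does not, which is consistent with only the unaligned one working.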

I can also confirm that when no parameter is specified to TAG_ALLOCATE_BUFFER call, then a random value is actually used, due to the way rpi-mailbox-interface.c is written. In my case, that seemed to be 0x02140704, hence the weird alignment you get. So a value definitely should be passed.

Based on this, I tried passing in 0x02000000 (32MB) and that actually worked. The frame buffer then started at 0xFE000000.

I also tried passing in 0x04000000 (64MB) and that worked as well. The frame buffer then started at 0xFC000000.

It would be interesting to see if you get the same results on the Pi 3.

This is definitely very weird behaviour, and no one else seems to be seeing it.

IanSB commented 5 years ago

It would be interesting to see if you get the same results on the Pi 3.

Yes that works on the Pi3 as well and there is definitely an improvement in performance once aligned with double height etc working in modes 0-6 and PC resolutions. Mode 7 is still glitchy with that stall we noted previously but overall a significant improvement in performance.

hoglet67 commented 5 years ago

I've pushed some more bits and pieces, mostly to do with making it easier to use different PLLs in the builds for each of the Pi models.

I've been doing some testing on the Pi 2, with PLLC and PLLD, using the Atom V6 profile. With both, I'm getting some very inconsistent results. There seems to be an issue with the code that measures the time for 100 lines.

Sometimes it works correctly:

INFO:         clkinfo.clock = 7159090 Hz
INFO:      clkinfo.line_len = 456
INFO:     clkinfo.clock_ppm = 0 ppm
INFO:     Nominal 100 lines = 6369500 ns
INFO:      Actual 100 lines = 6370742 ns
INFO:           Clock error = 194 PPM
INFO:  Error adjusted clock = 7157694 Hz
INFO:         GPCLK Divisor = 10
INFO:  Target PLL frequency = 1145231080 Hz
INFO:  Actual PLL frequency = 1145231080 Hz
INFO:       Lines per frame = 262, (261.999)
INFO: Actual frame time = 16691352 ns (non-interlaced), line time = 63707 ns

And sometimes it's way out:

INFO:         clkinfo.clock = 7159090 Hz
INFO:      clkinfo.line_len = 456
INFO:     clkinfo.clock_ppm = 0 ppm
INFO:     Nominal 100 lines = 6369500 ns
INFO:      Actual 100 lines = 3822446 ns
INFO:           Clock error = -399882 PPM
INFO:  Error adjusted clock = 11929488 Hz
INFO:         GPCLK Divisor = 6
INFO:  Target PLL frequency = 1145230848 Hz
INFO:  Actual PLL frequency = 1145230848 Hz
INFO:       Lines per frame = 262, (261.999)
INFO: Actual frame time = 10014818 ns (non-interlaced), line time = 38224 ns

What's weird is that when it's out, it's always out by the same ratio (6/10), so it thinks the source rate is 100MHz rather than 60MHz.

I'm guessing the ARM clock being used for the time measurements is somehow being modified.
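The figures in the two logs above can be reproduced with simple integer arithmetic, which makes the 6/10 ratio easy to check. This is a sketch with hypothetical function names; it matches the logged values but is not necessarily how the project computes them.

```c
#include <stdint.h>

/* Clock error in parts per million, truncated towards zero. */
static int64_t clock_error_ppm(int64_t nominal_ns, int64_t actual_ns)
{
   if (nominal_ns == 0) {
      return 0;
   }
   return (actual_ns - nominal_ns) * 1000000 / nominal_ns;
}

/* Scale the nominal pixel clock by nominal/actual line time. */
static int64_t error_adjusted_clock(int64_t clock_hz, int64_t nominal_ns, int64_t actual_ns)
{
   if (actual_ns == 0) {
      return 0;
   }
   return clock_hz * nominal_ns / actual_ns;
}
```

With a nominal 6369500ns and a measured 6370742ns this gives 194 PPM and an adjusted clock of 7157694Hz, as in the good log. With the bad measurement of 3822446ns it gives -399882 PPM, and 3822446 / 6370742 is almost exactly 0.6.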

What I don't understand is why it's not consistent. I have genlock off by default. Usually, just pressing the genlock button will fix it.

It's hard to debug, because it never happens with the debug build.

IanSB commented 5 years ago

It's hard to debug, because it never happens with the debug build.

I found this comment about the Pi2: Regular dynamic downclocking of the CPU can occur due to USB power supply/cable issue

BTW I found the debug builds to be unusable with autoswitching, because the debug messages printed during recalculate_hdmi_clock (which is called indirectly from the assembler frame loop) cause the measurement of a frame to vary. That results in a drop-out of frame capture and repeated attempts to select another profile. I think all debug messages in recalculate_hdmi_clock, and in your new PLL code relating to the HDMI clock, need to be optionally suppressed using a #define.

One other thing: there is an error in the BBC_Micro.txt profile:
sampling=2,2,2,2,2,2,2,0,6,5,0
should be
sampling=2,2,2,2,2,2,2,0,6,0,5,0
due to the new mux setting.

EDIT: I just tried this build with a Pi 1 Model B+ and that worked OK as well as the Pi 3 and 3+ so it looks like this project will now work reasonably well with any Pi except the original Pi1 (due to lack of GPIO). Pi zero is still the best so there's no point in buying anything else but if you have one of the others lying around it can be put to use. The 3 and 3+ still have some minor glitches with double height but as noted earlier they disappear when using double height & width which is very strange. I wonder if having core, sdram and cpu clocks with non-simple ratios causes some sort of synchronisation delay on the pi2 & 3. (this would be the case using PLLC or PLLD but not PLLA)

hoglet67 commented 5 years ago

I've fixed the BBC_Micro profile issue, thanks for spotting that.

Yes, it was indeed a power supply issue causing ARM throttling from 1000MHz down to 600MHz. I'm running the Pi 2 off a separate supply now and it's fine.

I've just updated the Pi firmware blob to the latest version, and that seems fine as well.

The TSENG ET3000AX/VGA Text suffers extreme jitter on the Pi 2, but that's to be expected I guess.

IanSB commented 5 years ago

Very high pixel clock rates like the TSENG ET3000AX/VGA Text & VGA Graphics only work on the Pi zero but the 50 & 60 Hz TV standards + EGA can be made to work on the others. The only exception is mode 7 deinterlace which still doesn't work on the Pi3 (and I presume the Pi2) although I have some ideas to improve that which I might look at later as those speed up ideas would be needed for a further improvement to mode 7 deinterlace to eliminate the remaining 1 frame of 'fizz' when a character changes.

hoglet67 commented 5 years ago

Closing this issue, as I think everything I intended to do has been done.

The only outstanding part was I never managed to get PLLA to work as a GPCLK source on a Pi 2/3. I tried pretty much everything I could think of, but to no avail.

hoglet67 commented 4 years ago

See #134 for why PLLA didn't work on the Pi 2/3 (Thanks Ian)