Test GPU (VisionTek Radeon 5450 1GB)

geerlingguy commented 3 years ago

I want to see if an AMD card works out of the box with the drivers built into Linux, as everyone on the Internet seems to say. For X86 Linux, that definitely seems to be the case, but will it work on ARM32? ARM64?

I settled on the Radeon HD 5450 1GB PCIe 2.1 card, mostly because it's available at a local retailer for $35.

Phoronix did a pretty extensive article on this board, and while it's no screamer... or even that fast... it is a simple, fanless, low-power board, and that might just be perfect for the Pi CM4. Here's that article: ATI Radeon HD 5450 On Linux.

I don't expect it to be fast, or amazing, but I do expect to get it to work. Maybe.

Related links:

Gist: Increase the BAR memory address space for PCIe devices on CM4
Gist: Setting up the Nvidia GeForce GT 710 on Raspberry Pi Compute Module 4
Pi Forum Topic: BAR space for PCIe allocation on CM4?

Coreforge commented 3 years ago

Since I can't get the debugger to work, I'm using printk to check where it crashes. It probably crashes when I copy the BIOS buffer into a temporary buffer to print it, which I'm doing to check whether it's a read error or something else. I'm not too great with C, so it's probably a pretty easy error.

AtomBIOS seems to be a BIOS in bytecode that gets interpreted by a specific interpreter, which is in the PCI ROM for x86 or in the driver for other architectures. It doesn't seem to be exclusive to evergreen, so I don't know why that check is in evergreen.c, but it makes sense that it's using it. Removing the check would probably result in an error further down the line when the BIOS is being interpreted, so that's not really a good option.

One way to get around this could be to read out the BIOS with a PC, copy it over to the Pi, and modify the driver so that it uses the image instead of reading it out. The fglrx driver apparently does that when it can't read the BIOS correctly, but that one isn't available for ARM. The code for copying parts of the BIOS:

void *testbios = kmalloc(128,GFP_KERNEL);
    memcpy(testbios,rdev->bios_header_start,128);
    printk("first 128 bytes after BIOS header start: %*ph",testbios);
    memcpy(testbios,rdev->bios,128);
    printk("first 128 bytes from bios: %*ph",testbios);
    kfree(testbios);

My guess is that either testbios isn't being properly allocated, or one of the addresses is NULL.

Coreforge commented 3 years ago

Got it working. The first 64 bytes after the start of the header are:

00 00 01 01 41 54 01 01 41 54 c3 03 74 01 c3 03 74 01 22 04 00 00 22 04 00 00 04 a0 04 02 04 a0 04 02 a0 00 00 00 a0 00 00 00 02 10 79 67 02 10 79 67 00 00 00 03 00 00 00 03 00 00 00 00 00 00

The ATOM magic starts at the 5th byte, so the AT is there, but the second half is missing.

The start of the BIOS is:

55 aa 80 e9 55 aa 80 e9 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 04 02 00 00 04 02 00 00 4d 44 00 00 4d 44 00 00 00 00 00 00 00 00 00 00 20 37 36 31 20 37 36 31 32 30 00 00 32 30 00 00

I'll dump the full BIOS on the Pi to see what's going on. I'll also try to remove the check and see what's happening.

Coreforge commented 3 years ago

Removing the ATOM magic check gets it further (obviously), but it just fails later, as expected.


[   75.096398] [drm:radeon_atombios_init [radeon]] *ERROR* Unable to find PCI I/O BAR; using MMIO for ATOM IIO
[   75.096409] Invalid ATI magic
[   75.096419] radeon 0000:01:00.0: Fatal error during GPU init
[   75.096428] [drm] radeon: finishing device.
[   75.096434] [TTM] Memory type 2 has not been initialized
[   75.104133] radeon: probe of 0000:01:00.0 failed with error -12

Since the ATI magic is invalid too, I don't think just removing all these checks is going to work well. File I/O doesn't seem too easy in kernel modules, so I haven't dumped the BIOS yet. A BIOS image can just be put in a header file, however, so I'll try that next.

Coreforge commented 3 years ago

It's getting further with the BIOS image now. It's just a rom from techpowerup since I didn't want to go through the trouble of reading it out right now. The pi crashes before though and locks up shortly after, but it still prints for a bit in dmesg. It doesn't react to anything though (neither serial nor ssh).

[   68.275242] ------------[ cut here ]------------
[   68.275244] field width 1152254592 too large
[   68.275245] WARNING: CPU: 1 PID: 724 at lib/vsprintf.c:2506 set_field_width+0x94/0xa0
[   68.275247] Modules linked in: radeon(+) i2c_algo_bit ttm sha256_generic cfg80211 rfkill 8021q garp stp llc vc4 cec v3d drm_kms_helper gpu_sched drm bcm2835_v4l2(C) bcm2835_codec(C) bcm2835_isp(C) v4l2_mem2mem bcm2835_mmal_vchiq(C) videobuf2_vmalloc videobuf2_dma_contig videobuf2_memops snd_soc_core videobuf2_v4l2 videobuf2_common videodev drm_panel_orientation_quirks raspberrypi_hwmon snd_compress snd_bcm2835(C) snd_pcm_dmaengine snd_pcm mc snd_timer vc_sm_cma(C) snd syscopyarea sysfillrect sysimgblt rpivid_mem fb_sys_fops backlight uio_pdrv_genirq uio i2c_dev ip_tables x_tables ipv6
[   68.275299] CPU: 1 PID: 724 Comm: insmod Tainted: G         C        5.10.2-v8+ #2
[   68.275300] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[   68.275302] pstate: 60000085 (nZCv daIf -PAN -UAO -TCO BTYPE=--)
[   68.275303] pc : set_field_width+0x94/0xa0
[   68.275304] lr : set_field_width+0x94/0xa0
[   68.275305] sp : ffffffc01200b3f0
[   68.275307] x29: ffffffc01200b3f0 x28: ffffffc0090aff5c 
[   68.275310] x27: ffffffc0114a084a x26: 0000000000000020 
[   68.275313] x25: ffffffc0090aff5c x24: 00000000000003d0 
[   68.275316] x23: 00000000ffffffd0 x22: ffffffc010b45070 
[   68.275319] x21: ffffffc01200b780 x20: ffffffc01200b490 
[   68.275323] x19: 0000000044ae0280 x18: 0000000000000001 
[   68.275326] x17: ffffff8041d7f0a0 x16: 0000000000000000 
[   68.275329] x15: ffffff804360e020 x14: ffffffffffffffff 
[   68.275332] x13: ffffff807fb8b8e8 x12: ffffff807fb8993b 
[   68.275335] x11: 0000000005f5e100 x10: abcc77118461cefd 
[   68.275338] x9 : ffffffc0100e95dc x8 : 000000000000000b 
[   68.275341] x7 : 0000000000000064 x6 : ffffff807fb8994f 
[   68.275344] x5 : 00000000fffffff2 x4 : ffffff807fb898f8 
[   68.275347] x3 : 0000000000000000 x2 : 0000000000000023 
[   68.275350] x1 : 0000000000000000 x0 : 0000000000000000 
[   68.275353] Call trace:
[   68.275354]  set_field_width+0x94/0xa0
[   68.275355]  vsnprintf+0x1b8/0x730
[   68.275357]  vscnprintf+0x30/0x68
[   68.275358]  vprintk_store+0x78/0x230
[   68.275359]  vprintk_emit+0xfc/0x328
[   68.275361]  vprintk_default+0x40/0x50
[   68.275362]  vprintk_func+0xfc/0x308
[   68.275363]  printk+0x68/0x90
[   68.275364]  radeon_get_bios+0xdcc/0xed0 [radeon]
[   68.275366]  evergreen_init+0x20/0x358 [radeon]
[   68.275367]  radeon_device_init+0x4c4/0xa68 [radeon]
[   68.275369]  radeon_driver_load_kms+0x74/0x170 [radeon]
[   68.275370]  drm_dev_register+0xe8/0x220 [drm]
[   68.275371]  radeon_pci_probe+0x118/0x188 [radeon]
[   68.275373]  pci_device_probe+0xc0/0x190
[   68.275374]  really_probe+0xf0/0x4d0
[   68.275375]  driver_probe_device+0xfc/0x168
[   68.275376]  device_driver_attach+0x7c/0x88
[   68.275378]  __driver_attach+0xac/0x178
[   68.275379]  bus_for_each_dev+0x78/0xd0
[   68.275380]  driver_attach+0x2c/0x38
[   68.275381]  bus_add_driver+0x14c/0x230
[   68.275383]  driver_register+0x6c/0x128
[   68.275384]  __pci_register_driver+0x4c/0x58
[   68.275385]  radeon_init+0x84/0x1000 [radeon]
[   68.275387]  do_one_initcall+0x4c/0x2d0
[   68.275388]  do_init_module+0x60/0x248
[   68.275389]  load_module+0x2204/0x2970
[   68.275391]  __do_sys_finit_module+0xbc/0x128
[   68.275392]  __arm64_sys_finit_module+0x28/0x38
[   68.275393]  el0_svc_common.constprop.0+0x84/0x1e8
[   68.275395]  do_el0_svc+0x2c/0x98
[   68.275396]  el0_svc+0x20/0x30
[   68.275397]  el0_sync_handler+0xb0/0xb8
[   68.275398]  el0_sync+0x174/0x180
[   68.275400] ---[ end trace 240e9a4e757dda1f ]---
[   68.275471] first 64 bytes after BIOS header start: 24 00 01 01 41 54 4f 4d 00 c0 c3 03 75 01 29 02 cf 00 22 04 00 00 00 00 4b 17 04 a0 04 02 aa a7 4e a8 a0 00 50 43 49 52 02 10 79 67 00 00 18 00 00 00 00 03 80 00 0c 0d 00 00 00 00 41 4d 44 20
[   68.275476] ATOM magic true
[   68.275482] testing if BIOS is AtomBIOS
[   68.275487] Made it past AtomBIOS check
[   68.275551] [drm:radeon_atombios_init [radeon]] *ERROR* Unable to find PCI I/O BAR; using MMIO for ATOM IIO
[   68.275566] ATOM BIOS: R5
[   68.275649] [drm] GPU not posted. posting now...
[   68.278927] radeon 0000:01:00.0: VRAM: 2048M 0x0000000000000000 - 0x000000007FFFFFFF (2048M used)
[   68.278936] radeon 0000:01:00.0: GTT: 1024M 0x0000000080000000 - 0x00000000BFFFFFFF
[   68.278942] [drm] Detected VRAM RAM=2048M, BAR=256M
[   68.278946] [drm] RAM width 64bits DDR
[   68.279144] [TTM] Zone  kernel: Available graphics memory: 947156 KiB
[   68.279150] [TTM] Initializing pool allocator
[   68.279170] [TTM] Initializing DMA pool allocator
[   68.279238] [drm] radeon: 2048M of VRAM memory ready
[   68.279248] [drm] radeon: 1024M of GTT memory ready.
[   68.279286] [drm] Loading CAICOS Microcode
[   68.284583] [drm] Internal thermal controller without fan control
[   68.289796] [drm] radeon: dpm initialized
[   68.295016] [drm] GART: num cpu pages 262144, num gpu pages 262144
[   68.297634] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0

The crash probably happens somewhere in my code that replaces the bios in memory with the rom.

    kfree(rdev->bios);
    printk("freed old BIOS buffer");
    rdev->bios = kzalloc(__222524_rom_len, GFP_KERNEL);
    printk("allocated new BIOS buffer for image");
    memcpy(rdev->bios,&__222524_rom,__222524_rom_len);
    printk("copied BIOS image into buffer");

6by9 commented 3 years ago

Had you noticed that the documentation for %*ph says it is for buffers up to 64 bytes long? https://elixir.bootlin.com/linux/latest/source/Documentation/core-api/printk-formats.rst#L257

field width 1152254592 too large is just a warning. The kernel clamps the maximum size - https://elixir.bootlin.com/linux/latest/source/lib/vsprintf.c#L2507

I'd suspect that you're running quite a long way past that WARN, and it's initialised a load more stuff before blowing up. Where exactly have you replaced the BIOS?

Coreforge commented 3 years ago

The printk was the issue. Removed that and now that warning is gone. I'm loading the rom directly after this line: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/radeon/radeon_bios.c#L689 which should be directly after it was read (I could probably take the reading part out as it just gets replaced anyways). It still locks up though.

Coreforge commented 3 years ago

The same thing happens with the BIOS I dumped with my PC, so that's not the problem anymore. There's nothing happening at the HDMI output of the card, so the issue probably isn't that I don't have a monitor connected. Enabling DRM debug messages would probably be a good way to get an idea where it hangs, but I haven't been able to do that. If I could get the debugger working that could help too, but so far, I haven't gotten it to work.

6by9 commented 3 years ago

DRM debug is easiest to do by adding drm.debug=0x3f to /boot/cmdline.txt (don't add any carriage returns). I suspect it'll be blowing up within the probe/bind though, so that may not be so useful as it is framework-level logging.

Coreforge commented 3 years ago

Drm debug didn't have any useful information. I've tracked down where it hangs using more printk, so it might not be totally accurate, but I've placed msleep(100) after the printks so that there's a bit more time for the data to get out. It seems to get stuck on this line https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/radeon/evergreen.c#L5057 The output I get from dmesg is:

[   57.331338] radeon 0000:01:00.0: WB enabled
[   57.331346] radeon_wb_init done. r=0
[   57.351303] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00
[   57.351311] radeon_fence_driver_ring GFX done. r=0
[   57.459306] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c

Interestingly, these only got out after I added more printks (I added the second and fourth one), so it's probably not actually in this line, but somewhere later. My guess is the IRQ stuff gets stuck somewhere.

6by9 commented 3 years ago

Are you remembering to add a \n to the end of all your printk lines? The console only flushes on a \n

Coreforge commented 3 years ago

I didn't on all. Some after the IRQ section had it, so it didn't get past that, but I didn't have it on some in that section.

Coreforge commented 3 years ago

It's now getting to evergreen_uvd_start. Somewhere in there it apparently gets stuck, but I still suspect interrupts to be the issue.

Coreforge commented 3 years ago

So this is interesting: I just accidentally made evergreen_uvd_start return nearly immediately (those misleadingly indented ifs, just noticed it now), and the module loaded successfully, though it gave another error. Gonna have to try it with a screen attached now.

[   60.024775] [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
[   60.024783] DEBUG: Passed evergreen_startup 5165 
[   60.135309] evergreen_startup done. r=-22
[   60.155310] radeon 0000:01:00.0: disabling GPU acceleration
[   60.175728] r600_dma_fini done
[   60.196378] r600_irq_fini done
[   60.215337] radeon_ib_pool_fini done
[   60.235449] radeon_irq_kms_fin done
[   60.256936] evergreen_pcie_gart_fini done
[   60.275308] acceleration disabled
[   60.295317] acceleration stuff done
[   60.315308] evergreen_init done
[   60.337017] [drm] Radeon Display Connectors
[   60.337031] [drm] Connector 0:
[   60.337035] [drm]   HDMI-A-1
[   60.337039] [drm]   HPD2
[   60.337046] [drm]   DDC: 0x6440 0x6440 0x6444 0x6444 0x6448 0x6448 0x644c 0x644c
[   60.337050] [drm]   Encoders:
[   60.337054] [drm]     DFP1: INTERNAL_UNIPHY1
[   60.337059] [drm] Connector 1:
[   60.337063] [drm]   DVI-D-1
[   60.337067] [drm]   HPD4
[   60.337072] [drm]   DDC: 0x6460 0x6460 0x6464 0x6464 0x6468 0x6468 0x646c 0x646c
[   60.337076] [drm]   Encoders:
[   60.337081] [drm]     DFP2: INTERNAL_UNIPHY
[   60.337085] [drm] Connector 2:
[   60.337089] [drm]   VGA-1
[   60.337094] [drm]   DDC: 0x6430 0x6430 0x6434 0x6434 0x6438 0x6438 0x643c 0x643c
[   60.337098] [drm]   Encoders:
[   60.337103] [drm]     CRT1: INTERNAL_KLDSCP_DAC1
[   60.367699] radeon 0000:01:00.0: [drm] Cannot find any crtc or sizes
[   60.368364] [drm] Initialized radeon 2.50.0 20080528 for 0000:01:00.0 on minor 2
[   61.395677] radeon 0000:01:00.0: [drm] Cannot find any crtc or sizes

Edit: With a display connected to either HDMI or VGA, it just hangs and the last 3 lines are missing, probably because it can get size information from the display.

Coreforge commented 3 years ago

DVI reacts a little bit differently. The dmesg output is the same and the pi still hangs, but it wakes the monitor up shortly. Since the crtc/sizes error is from drm, I don't know if it hangs in the radeon driver or somewhere in drm.

Coreforge commented 3 years ago

It seems to get stuck in this function: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/radeon/radeon_fb.c#L330 So something goes wrong setting up the framebuffer, which explains why it doesn't hang when no display is connected. I'll leave the UVD stuff disabled for now as that is just for video decoding and shouldn't be necessary to get the card working.

Coreforge commented 3 years ago

I got a bit further with kgdb (but not much). If I have the console and kgdb on ttyAMA0 in cmdline.txt, the pi hangs after the 2nd stage bootloader. If I use different ports for each (ttyAMA0 for console, ttyAMA1 for kgdboc), it boots up, or waits for the debugger if I add kgdbwait. The IO driver registers fine, but if I try to connect with gdb from a second pi, it just times out. I set the baudrate to 115200 both in cmdline and gdb, so that shouldn't be the issue. If I swap console and kgdboc, the console works fine, so the wiring is correct. I've tried to go back to pretty much stock image and kernel (with debug info), but that didn't change anything. ttyAMA0 isn't taken up by the wifi module, I don't have one.

Coreforge commented 3 years ago

I enabled VGA again as disabling it just caused it to hang when iterating over the modes for the HDMI monitor, so now it does that successfully, but hangs at VGA again. If I don't have a monitor connected to VGA, it hangs in radeon_vga_detect at load detection. If I connect a monitor, it does the same thing as HDMI and hangs at the end of the loop (I don't know if it would exit the loop or continue, as it iterates over pointers and doesn't just increment up to a set value). If I disable destructive probing so it doesn't do load detection, it locks up like it does with VGA completely disabled. Since I still haven't gotten the debugger to work, it's really annoying to find where it locks up, because it locks up in callback functions that are hard to track down. KGDB triggers at Oopses though, so it's present, it just doesn't respond on a serial port. I have an RX460, so I might try some stuff with that instead as it should work better with the newer driver. Since I use that card though I won't do as much testing.

6by9 commented 3 years ago

I've spent a few pennies and picked up an XFX HD6450 HD-645X-ZQ 1GB card. Same situation as you were in

[   38.647578] [drm:radeon_device_init [radeon]] *ERROR* Unable to find PCI I/O BAR
[   38.679665] radeon 0000:01:00.0: Expecting atombios for evergreen GPU
[   38.679674] radeon 0000:01:00.0: Fatal error during GPU init
[   38.679681] [drm] radeon: finishing device.
[   38.679688] [TTM] Memory type 2 has not been initialized
[   38.687155] radeon: probe of 0000:01:00.0 failed with error -22

Remove that check and it adds

[  122.837227] [drm:radeon_atombios_init [radeon]] *ERROR* Unable to find PCI I/O BAR; using MMIO for ATOM IIO
[  122.837240] Invalid ATI magic

which is understandable. https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/radeon/atom.c#L1287

@Coreforge Could you post your captured BIOS somewhere? Otherwise I'll just have to stuff this card in an x86 machine and grab it for myself. I don't know if it is card specific or not.

valpackett commented 3 years ago

Has anyone tried dumping the vbios both on a working machine and on the pi, and comparing it? You need to examine how the read data was corrupted. If you just try to patch stuff in like the hardcoded vbios, the corruption will pop up again and again in the next thing the driver does.

6by9 commented 3 years ago

I do have a suspicion that it's the MMIO read functions used when there's no I/O BAR that may be at fault. https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/radeon/radeon_device.c#L990 It'd be interesting to force the driver through that path on an x86 box to see what happens.

There also appears to be a total dependency on ACPI in https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/radeon/radeon_bios.c#L188, which is going to be an issue on a DT-based ARM system.

Unfortunately I haven't got huge amounts of time to spend playing with this sort of stuff :-(

valpackett commented 3 years ago

There also appears to be a total dependency on ACPI

No, it's just the first alternative of a bunch of methods it tries: https://elixir.bootlin.com/linux/v5.10.4/source/drivers/gpu/drm/radeon/radeon_bios.c#L674

Most systems do not have anything vbios related in ACPI. radeon_read_bios would be the most commonly used one. This one seems to be only for laptops with discrete GPUs (as it has a check against integrated GPUs).

MMIO read functions used when there's no I/O BAR that may be at fault

It is odd that there's no I/O BAR (again, nothing x86 specific about these, e.g. my MACCHIATObin has no problems with these) but that's not how vbios is read. IIUC, PCI ROM is always assumed to be in MMIO space, not legacy I/O register space. And it's specifically completely unrelated to that warning: radeon_read_bios does not touch rdev->rio_mem or any atom_card_info stuff — note that that stuff is for the ATOM interpreter that would run whatever we read from the ROM.

Someone really needs to examine in what way exactly the read data is corrupted. The vbios passes some magic number checks at first before it gets to "is this atombios".

6by9 commented 3 years ago

Fair enough - misread the code.

I'm busy for the next few days, but will try to get both an x86 and the Pi working with this card (on similar kernel builds) to see the differences in what and how it reads information.

Coreforge commented 3 years ago

I tried dumping the BIOS from the Pi, but I haven't found a good way to do it: it's 128 KB, so just printing it is out, and file I/O doesn't seem ideal in kernel modules. I dumped the first 64 bytes of the header though:

00 00 01 01 41 54 01 01 41 54 c3 03 74 01 c3 03 74 01 22 04 00 00 22 04 00 00 04 a0 04 02 04 a0 04 02 a0 00 00 00 a0 00 00 00 02 10 79 67 02 10 79 67 00 00 00 03 00 00 00 03 00 00 00 00 00 00

The magic starts at byte 5, and the AT part is there, but the OM is just 0x01 0x01 instead, with the AT again after it. The correct result should be

24 00 01 01 41 54 4F 4D 00 C0 C3 03 74 01 29 02 CE 00 22 04 00 00 00 00 4B 17 04 A0 04 02 AA A7 4E A8 A0 00 00 00 50 43 49 52 02 10 79 67 00 00 18 00 00 00 00 03 80 00 0C 0D 00 00 00 00 41 4D

if I got the address right, but it seems to match up for the most part.

My BIOS dump won't help you much as it's for a 2 GB card and would likely cause issues with a 1 GB one. This one should match for the most part though: https://www.techpowerup.com/vgabios/103804/xfx-hd6450-1024-110609 Otherwise dumping isn't too big of a deal either. The BIOS shouldn't be too card specific, but clocks, voltages, memory size and chip should match. A 1 GB BIOS might work on a 2 GB card, but I doubt the other way around.

pelwell commented 3 years ago

We know that the RC in 2711 can't do 64-bit accesses, and the repetition in the above dump shows the same pattern as you get from 64-bit reads.

Coreforge commented 3 years ago

I tried debugging with OpenOCD today with not much more success than I had over serial. I'm using a pi3 as a probe with the bcm2835gpio driver.

The script for OpenOCD is

adapter driver bcm2835gpio
bcm2835gpio_jtag_nums 11 25 10 9
bcm2835gpio_trst_num 7
reset_config trst_only
transport select jtag

set _CHIPNAME bcm2711
set _DAP_TAPID 0x4ba00477

adapter_khz 10

#transport select jtag
#reset_config trst_and_srst

telnet_port 4444

# create tap
jtag newtap auto0 tap -irlen 4 -expected-id $_DAP_TAPID

# create dap
dap create auto0.dap -chain-position auto0.tap

set CTIBASE {0x80420000 0x80520000 0x80620000 0x80720000}
set DBGBASE {0x80410000 0x80510000 0x80610000 0x80710000}

set _cores 4

set _TARGETNAME $_CHIPNAME.a72
set _CTINAME $_CHIPNAME.cti
set _smp_command ""

for {set _core 0} {$_core < $_cores} { incr _core} {
    cti create $_CTINAME.$_core -dap auto0.dap -ap-num 0 -ctibase [lindex $CTIBASE $_core]

    set _command "target create ${_TARGETNAME}.$_core aarch64 \
                    -dap auto0.dap  -dbgbase [lindex $DBGBASE $_core] \
                    -coreid $_core -cti $_CTINAME.$_core"
    if {$_core != 0} {
        set _smp_command "$_smp_command $_TARGETNAME.$_core"
    } else {
        set _smp_command "target smp $_TARGETNAME.$_core"
    }

    eval $_command
}

eval $_smp_command
targets $_TARGETNAME.0

and the output from OpenOCD is

Open On-Chip Debugger 0.11.0-rc1+dev-00010-gc69b4deae-dirty (2021-01-06-03:07)
Licensed under GNU GPL v2
For bug reports, read
    http://openocd.org/doc/doxygen/bugs.html
DEPRECATED! use 'adapter speed' not 'adapter_khz'
Warn : DEPRECATED! use '-baseaddr' not '-ctibase'
Warn : DEPRECATED! use '-baseaddr' not '-ctibase'
Warn : DEPRECATED! use '-baseaddr' not '-ctibase'
Warn : DEPRECATED! use '-baseaddr' not '-ctibase'
Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Info : BCM2835 GPIO JTAG/SWD bitbang driver
Info : clock speed 10 kHz
Error: JTAG scan chain interrogation failed: all ones
Error: Check JTAG interface, timings, target power, etc.
Error: Trying to use configured scan chain anyway...
Error: auto0.tap: IR capture error; saw 0x0f not 0x01
Warn : Bypassing JTAG setup events due to errors
Error: Invalid ACK (7) in DAP response
Error: JTAG-DP STICKY ERROR

It's probably something relatively simple, but I'm not familiar with the whole JTAG thing. The script is essentially this one plus adapter config: https://gist.github.com/tnishinaga/46a3380e1f47f5e892bbb74e55b3cf3e

Coreforge commented 3 years ago

Got JTAG working now. I had to additionally set GPIO22 to alt4 in config.txt and tie it to 3.3 V with a jumper. If I can now get it to cooperate with Eclipse, debugging this should be a lot easier.

Coreforge commented 3 years ago

I got a bit further stepping through with gdb, but only once. It makes sense that the printks don't all get out before the Pi hangs, but it's weird that it locks up at different points if I step through it with the debugger. I once got it up to setting up the framebuffer, where it hung, but it usually locks up before that. For some reason the line information is missing for some functions, so I can't debug them as easily. When the Pi locks up, the JTAG interface also seems to not fully work anymore, as OpenOCD prints multiple "target not halted" errors if I try halting it. I might try disabling optimization to see if that fixes the issues with no line information being present, and it might also get it further, as I'd guess it's some kind of timing issue since it gets further if I single step everything.

Coreforge commented 3 years ago

I managed to get to this line again where it locks up: https://elixir.bootlin.com/linux/v5.10.2/source/drivers/gpu/drm/radeon/radeon_fb.c#L98

I did this by setting a breakpoint at the function drm_helper_probe_detect (the line information is missing for this one) and continuing the first time it gets hit to get to vga. After that, I single step to the second breakpoint at radeon_detect_vga (it might lock up if I just continue). When it hits that breakpoint, I step through the function with nexti and through drm_client_modeset_probe after it returns. I then step through __drm_fb_helper_initial_config_and_unlock which eventually gets to radeonfb_create which calls radeon_align_pitch where it locks up. I do the single stepping for two reasons:

The pi otherwise locks up earlier (it's not just printks not getting out, if I step through it too differently, it locks up somewhere)
Once it locks up, the JTAG interface doesn't work anymore so I can't just backtrace. When stepping (either manually or letting gdb do it when there is no line information), I can get the address where it last was.

I'll have to look into why it gets further when stepping, but for now, I'll try to get it further than it is currently. If anyone wants the gdb output: https://gist.github.com/Coreforge/6236bcbcae4ed3e2cff7c13befcdc756 The line numbers are probably not accurate due to the printks I added. The missing line numbers are probably due to some of the source files being newer than the binaries, but I haven't really changed anything. I'll keep it this way for now though as it helps with stepping through (I tried with just si 500 which didn't work).

geerlingguy commented 3 years ago

@Coreforge - Thanks for picking up the ball on the debugging! I've had to drop work on this lately as I just had a baby and am starting some work with a new company this month, but I do hope to get back to testing, also with Nouveau on Nvidia cards. It seems like we're all bumping into some similar issues, and that entire lockup smells to me like some sort of memory access issue.

Maybe a bug in the Broadcom PCI-E driver?

Coreforge commented 3 years ago

I got something working. I noticed a bit ago that if I disable fbdev emulation, the module loads fine, but the screen doesn't get initialized or detected at all. I remembered that Linux doesn't always play nice with hotplugging monitors, and since I had the module blacklisted and loaded it afterwards, that might have caused issues. If I don't blacklist the module and have the monitor plugged in from the beginning, it does detect the monitor and displays noise. I'd guess it just shows noise because there's nothing in the framebuffer, since fbdev emulation is disabled. It could also be due to ring 0 not working, as this error is still there:

[ 16.513993] [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)

I'll see if I can access the framebuffer to write some test data, otherwise I'll have to look into how the fbdev gets set up. This also explains why it got further when I was stepping through with the debugger, as systemd-udevd always got stopped by the watchdog. The driver gets past the setup of the fbdev though, so it could also be that accessing it doesn't work properly and results in it hanging.

Edit: The Pi still locks up, but acts differently over JTAG. I can still connect to it, but I can't really get any useful data from it (gdb complains about the architecture being arm instead of aarch64, so it probably gets reset, and if I connect without having vmlinux loaded, it says the program isn't running). I could add a delay so I have more time to connect with the debugger and see what's going on, but I have no idea where to look. I also couldn't find anything to easily access the framebuffer from looking over the driver (I'm not too familiar with it though). There's nothing useful on the console either (a bunch of IOCTLs from Xorg, the last three being DRM_IOCTL_GET_CAP and two times DRM_IOCTL_MODE_LIST_LESSEES).

Coreforge commented 3 years ago

The issue seems to be caused in some way by X11. If I disable it, it doesn't hang (but I don't get anything on the screen either: no console, and the monitor isn't detected at all). It always hangs after the last DRM_IOCTL_MODE_LIST_LESSEES call, which itself works properly. I'll have to look into X11 initialization to see where it hangs, as I don't have anything else useful to trace. I got part of an IOCTL call after the last one once, but the name was missing, so nothing useful there. I also got a different pattern if I connected a second monitor to VGA, and the left column of pixels is purple. Not really useful, but kinda interesting.

Coreforge commented 3 years ago

I found one place where it locked up in Xorg. When accessing r600_shadow_fb in this line https://github.com/freedesktop/xorg-xf86-video-ati/blob/master/src/radeon_kms.c#L2632 the pi just locks up completely (if I add a debug statement earlier in the function it locks up at that statement). To get around this I've replaced it with just if(1) for now to get further. It now crashes with /usr/local/bin/Xorg: symbol lookup error: /usr/local/lib/xorg/modules/drivers/radeon_drv.so: undefined symbol: exaMoveInPixmap, which I'm not really sure why I get this error, as exaMoveInPixmap is defined in exa.c. I've compiled xf86-video-ati and xserver both directly on the pi. exaMoveInPixmap gets called by drmmode_create_bo_pixmap, which gets called by drmmode_crtc_scanout_create, which gets called by RADEONLeaveVT_KMS, so this kinda explains the image I got when the pi locked up, as it was setting up framebuffer stuff.

PixlRainbow commented 3 years ago

undefined symbol can also occur even if the function is present, if the signature (e.g. parameters or return type) is different

scarburato commented 3 years ago

undefined symbol can also occur even if the function is present, if the signature (e.g. parameters or return type) is different

In C there's no name mangling; the name of the symbol is the same as the name of the function. Also, in C++ the return type is not part of the signature.

/usr/local/bin/Xorg: symbol lookup error: /usr/local/lib/xorg/modules/drivers/radeon_drv.so: undefined symbol: exaMoveInPixmap, which I'm not really sure why I get this error, as exaMoveInPixmap is defined in exa.c

if you run objdump -T /usr/local/lib/xorg/modules/drivers/radeon_drv.so , can you see exaMoveInPixmap in the table? If not, can you see any other of the functions defined in exa.h?

Coreforge commented 3 years ago

The function and a few more are there if I grep for them

0000000000000000      D  *UND*  0000000000000000              exaWaitSync
0000000000000000      D  *UND*  0000000000000000              exaDriverAlloc
0000000000000000      D  *UND*  0000000000000000              exaDriverInit
0000000000000000      D  *UND*  0000000000000000              exaDriverFini
0000000000000000      D  *UND*  0000000000000000              exaMoveInPixmap
0000000000000000      D  *UND*  0000000000000000              exaMarkSync
0000000000000000      D  *UND*  0000000000000000              exaGetPixmapPitch
0000000000000000      D  *UND*  0000000000000000              exaGetPixmapDriverPrivate

exa.c is part of xserver though, but I could try to put it into xf86-video-ati to get it compiled in.

scarburato commented 3 years ago

According to the manual, the fact that the first address is 0 and the section is *UND* means that the symbol is not defined in that file (in this case the function's machine code, I suppose).
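As a rough illustration of reading those objdump -T lines, here is a minimal C sketch that flags an entry as undefined when its section column is *UND* and pulls out the symbol name (assumed to be the last whitespace-separated field). The function names objdump_line_is_undefined and objdump_line_symbol are made up for this example and are not part of binutils.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if an `objdump -T` line marks its symbol as undefined,
 * i.e. the section column reads *UND*. */
bool objdump_line_is_undefined(const char *line)
{
    assert(line != NULL);
    return strstr(line, "*UND*") != NULL;
}

/* Copies the last whitespace-separated field of the line (assumed to be
 * the symbol name) into out. Returns false for a blank line or if the
 * name does not fit into out_size bytes. */
bool objdump_line_symbol(const char *line, char *out, size_t out_size)
{
    assert(line != NULL && out != NULL && out_size > 0);
    const char *start = NULL;
    const char *end = NULL;
    bool prev_space = true;

    for (const char *p = line; *p != '\0'; p++) {
        bool is_space = (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r');
        if (!is_space) {
            if (prev_space)
                start = p;      /* a new field begins here */
            end = p + 1;        /* one past the last non-space character */
        }
        prev_space = is_space;
    }
    if (start == NULL)
        return false;

    size_t len = (size_t)(end - start);
    if (len >= out_size)
        return false;
    memcpy(out, start, len);
    out[len] = '\0';
    return true;
}
```

Feeding it the radeon_drv.so line for exaMoveInPixmap reports an undefined symbol, while the libexa.so line (section .text, non-zero address) does not.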

Maybe the linker is not linking exa.o to the shared object; or maybe the functions in exa.o are in another shared object which is not loaded, in this case you may try to load it with LD_PRELOAD if you find it, or maybe the xserver is hiding the symbol.

On my amd64 machine with Ubuntu 20.04 the exa functions are defined in /usr/lib/xorg/modules/libexa.so. On your compiled version it might be under /usr/local. Is that library present?

This is how the table entry looks for me in libexa.so: 0000000000004810 g DF .text 000000000000005f Base exaMoveInPixmap

Coreforge commented 3 years ago

/usr/local/lib/xorg/modules/libexa.so contains exaMoveInPixmap:

00000000000043f8 g DF .text 000000000000005c Base exaMoveInPixmap

LD_PRELOAD worked. I'm now getting an assertion failure in exaMoveInPixmap, so the symbol now resolves.

Coreforge commented 3 years ago

And I'm kinda stuck at this assertion now. exaMoveInPixmap calls dixGetPrivateAddr (via some macros and another inline function), which fails with X: ../include/privates.h:121: dixGetPrivateAddr: Assertion `key->initialized' failed. If I set key->initialized to 1 with GDB, it errors out later, so I can't just ignore the error here. The issue is that I can't find where the key gets initialized. The type is PRIVATE_XSELINUX, and if I put a breakpoint at dixRegisterPrivateKey, no key of that type gets initialized. exaDriverInit also doesn't get called, which seems problematic to me. The backtrace at the failing assertion is:

#0  dixGetPrivateAddr (key=0x7fa3f02310 <exaScreenPrivateKeyRec>, key=0x7fa3f02310 <exaScreenPrivateKeyRec>, privates=0x558e8c9270) at ../include/privates.h:121
#1  dixGetPrivate (key=0x7fa3f02310 <exaScreenPrivateKeyRec>, privates=0x558e8c9270) at ../include/privates.h:136
#2  exaMoveInPixmap (pPixmap=0x558e9955b0) at exa.c:1123
#3  0x0000007fa33633f4 in drmmode_create_bo_pixmap (width=width@entry=1920, height=height@entry=1080, depth=24, bpp=32, pitch=7680, bo=0x558e8ad820, pScrn=<optimised out>, pScrn=<optimised out>) at drmmode_display.c:124
#4  0x0000007fa3364264 in drmmode_crtc_scanout_create (crtc=crtc@entry=0x558e8d08c0, scanout=scanout@entry=0x7fdb3e8008, width=width@entry=1920, height=height@entry=1080) at drmmode_display.c:560
#5  0x0000007fa335d6b8 in RADEONLeaveVT_KMS (pScrn=0x558e8c96a0) at radeon_kms.c:2717
#6  0x0000005583e4b694 in ddxGiveUp (error=error@entry=EXIT_ERR_ABORT) at xf86Init.c:820
#7  0x0000005583f5e0b0 in AbortServer () at log.c:883
#8  0x0000005583f5edb8 in FatalError (f=f@entry=0x5583f65bc0 "Failed to activate virtual core keyboard: %d") at log.c:1024
#9  0x0000005583e08828 in InitCoreDevices () at devices.c:724
#10 0x0000005583e13438 in dix_main (argc=3, argv=0x7fdb3e83c8, envp=<optimised out>) at main.c:245
#11 0x0000007fa3a9fd24 in __libc_start_main () from target:/lib/aarch64-linux-gnu/libc.so.6
#12 0x0000005583dfddb8 in _start () at xf86DGA.c:2038
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

The error is just from keyboard stuff; the server just continues after that.

scarburato commented 3 years ago

I have 2 questions unrelated to Xorg: Does fbdev work? If so, can you see it when you run cat /proc/fb? Did you try to start a tty on the GPU by modifying the fbcon=map option or by writing to /sys/class/vtconsole/vtcon1/bind as described in the kernel docs?

Coreforge commented 3 years ago

fbdev does not work (I disabled fbdev emulation as that caused the pi to hang up completely. I might look into that again if I don't get further with Xorg). I'll try to start a tty.

Coreforge commented 3 years ago

Looks like it locks up at memset_io here: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/radeon/radeon_fb.c#L264 Stepping through with GDB, the last function I get is radeon_bo_size until I get target not halted in OpenOCD. If I pause in GDB afterwards, I get

0xffffffc0100204d8 in __memset_io () at arch/arm64/kernel/io.c:63
63          count--;

so I'll have to look into this memset_io.

Coreforge commented 3 years ago

The issue there was that memset was used instead of memset_io. Here, memset_io is already being used. I'll try limiting it to 63 bytes though to see if it helps.

Coreforge commented 3 years ago

I tried skipping the memset_io as it seems to just fill the framebuffer with zeros, but at line 271 it fails to set info->screen_base, so that just stays at 0x0. rbo->kptr is a valid pointer. If I set screen_base from within gdb, it becomes unstable, and if I just continue, I get a kernel NULL pointer dereference and a Segfault. I'll comment out the memset_io and recompile it to see if that helps with this, but I don't really see why it would.

Message from syslogd@raspberrypi at Jan 23 22:33:35 ...
 kernel:[  419.610919] watchdog: BUG: soft lockup - CPU#0 stuck for 98s! [modprobe:621]
[  419.621725] DEBUG: Passed drm_fb_helper_fill_info 1756
[  419.623343] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000048
[  419.626611] Mem abort info:
[  419.626984]   ESR = 0x96000005
[  419.627387]   EC = 0x25: DABT (current EL), IL = 32 bits
[  419.628079]   SET = 0, FnV = 0
[  419.628480]   EA = 0, S1PTW = 0
[  419.628892] Data abort info:
[  419.629271]   ISV = 0, ISS = 0x00000005
[  419.630962]   CM = 0, WnR = 0
[  419.631357] user pgtable: 4k pages, 39-bit VAs, pgdp=000000004466d000
[  419.632195] [0000000000000048] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[  419.633926] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[  419.634651] Modules linked in: radeon(+) ttm i2c_algo_bit sha256_generic cfg80211 vc4 rfkill cec 8021q garp stp llc drm_kms_helper v3d gpu_sched drm bcm2835_codec(C) bcm2835_v4l2(C) bcm2835_isp(C) v4l2_mem2mem bcm2835_mmal_vchiq(C) videobuf2_vmalloc snd_soc_core videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common drm_panel_orientation_quirks raspberrypi_hwmon snd_compress snd_pcm_dmaengine videodev snd_pcm snd_timer mc vc_sm_cma(C) snd syscopyarea sysfillrect rpivid_mem sysimgblt fb_sys_fops backlight uio_pdrv_genirq uio squashfs i2c_dev ip_tables x_tables ipv6
[  419.641413] CPU: 0 PID: 621 Comm: modprobe Tainted: G         C   L    5.10.2-v8+ #8
[  419.642415] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[  419.643252] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[  419.644075] pc : drm_fb_helper_fill_info+0x60/0x16c [drm_kms_helper]
[  419.644926] lr : drm_fb_helper_fill_info+0x50/0x16c [drm_kms_helper]
[  419.645750] sp : ffffffc011f5b5b0
[  419.646181] x29: ffffffc011f5b5b0 x28: ffffff8049adb030
[  419.646876] x27: ffffff8049adb148 x26: ffffff8049adac78
[  419.647570] x25: ffffff8049adac78 x24: ffffff8042ad1000
[  419.648264] x23: ffffffc011f5b780 x22: ffffffc008e20bd8
[  419.648958] x21: ffffffc008e274c8 x20: ffffff8049adb000
[  419.649652] x19: ffffff8042ad1000 x18: 0000000000000000
[  419.650345] x17: 0000000000000000 x16: 0000000000000000
[  419.651039] x15: 0000000000000000 x14: 0000000000000000
[  419.651732] x13: 0000000000000000 x12: 0000000000000000
[  419.652425] x11: ffffffff014a5840 x10: 00000000000019e0
[  419.653119] x9 : ffffffc0100e5f38 x8 : ffffff80495e1a40
[  419.653812] x7 : 000000005a961000 x6 : 0000000000000000
[  419.654506] x5 : 0000000000000000 x4 : ffffff807fb678c8
[  419.655199] x3 : 0000000000000000 x2 : 00000000000006a4
[  419.655893] x1 : ffffffc008e20e88 x0 : ffffffc008e274c8
[  419.656587] Call trace:
[  419.656935]  drm_fb_helper_fill_info+0x60/0x16c [drm_kms_helper]
[  419.657820]  radeonfb_create+0x2ac/0x460 [radeon]
[  419.658460]  __drm_fb_helper_initial_config_and_unlock+0x4f4/0x74c [drm_kms_helper]
[  419.659478]  drm_fb_helper_initial_config+0x78/0xa0 [drm_kms_helper]
[  419.660375]  radeon_fbdev_init+0x1a0/0x210 [radeon]
[  419.661082]  radeon_modeset_init+0x744/0x920 [radeon]
[  419.661810]  radeon_driver_load_kms+0x84/0x170 [radeon]
[  419.662560]  drm_dev_register+0xe8/0x220 [drm]
[  419.663210]  radeon_pci_probe+0x118/0x188 [radeon]
[  419.663841]  pci_device_probe+0xc0/0x190
[  419.664356]  really_probe+0xf0/0x4d0
[  419.664822]  driver_probe_device+0xfc/0x168
[  419.665368]  device_driver_attach+0x7c/0x88
[  419.665913]  __driver_attach+0xac/0x178
[  419.666413]  bus_for_each_dev+0x78/0xd0
[  419.666914]  driver_attach+0x2c/0x38
[  419.667379]  bus_add_driver+0x14c/0x230
[  419.667880]  driver_register+0x6c/0x128
[  419.668381]  __pci_register_driver+0x4c/0x58
[  419.669008]  radeon_init+0x84/0x1000 [radeon]
[  419.669577]  do_one_initcall+0x4c/0x2d0
[  419.670080]  do_init_module+0x60/0x248
[  419.670571]  load_module+0x2204/0x2970
[  419.671060]  __do_sys_finit_module+0xbc/0x128
[  419.671629]  __arm64_sys_finit_module+0x28/0x38
[  419.672220]  el0_svc_common.constprop.0+0x84/0x1e8
[  419.672844]  do_el0_svc+0x2c/0x98
[  419.673279]  el0_svc+0x20/0x30
[  419.673678]  el0_sync_handler+0xb0/0xb8
[  419.674179]  el0_sync+0x174/0x180
[  419.674616] Code: f9403283 910ac2c1 aa1503e0 5280d482 (f9402464)
[  419.675409] ---[ end trace 18ddac55386015c2 ]---

Message from syslogd@raspberrypi at Jan 23 22:33:35 ...
 kernel:[  419.633926] Internal error: Oops: 96000005 [#1] PREEMPT SMP

Message from syslogd@raspberrypi at Jan 23 22:33:35 ...
 kernel:[  419.674616] Code: f9403283 910ac2c1 aa1503e0 5280d482 (f9402464)
Segmentation fault
pi@raspberrypi:~$

Coreforge commented 3 years ago

Got further if I comment out the memset_io instead of jumping over it, and it hangs up with a noise screen like I had before. I think I found the issue with the memset though. If it has to set more than 8 bytes, it does it in chunks of 8 using str. If there are fewer than 8 bytes left or the alignment doesn't fit, it does it byte by byte, so I replaced the single memset_io with a loop that sets 4 bytes at a time so that it gets done with single byte operations. I'm not sure yet where it locks up as it gets a good bit further. Also, the pattern on the screen changes depending on the value I set the buffer to. (I don't have a capture card so this is the best I can do for now)
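
To make the behaviour described above concrete, here is a user-space model of that strategy, written as a sketch and not as the kernel code itself: bytes until the destination is 8-byte aligned, then 8-byte stores, then trailing bytes. The names model_memset_io, model_memset_io_chunked and struct io_stats are invented for this example, and ordinary memory stands in for the PCIe BAR. The counters only show which store widths each variant would issue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct io_stats {
    size_t byte_writes;   /* 1-byte stores issued */
    size_t qword_writes;  /* 8-byte stores issued */
};

/* Model of the fill strategy: single bytes until dst is 8-byte aligned,
 * then 8-byte stores, then single bytes for the tail. */
void model_memset_io(uint8_t *dst, int c, size_t count, struct io_stats *st)
{
    assert(dst != NULL && st != NULL);
    uint64_t qc = (uint8_t)c;
    qc |= qc << 8;
    qc |= qc << 16;
    qc |= qc << 32;   /* fill byte replicated into all 8 byte lanes */

    while (count && ((uintptr_t)dst & 7u)) {
        *dst++ = (uint8_t)c;
        count--;
        st->byte_writes++;
    }
    while (count >= 8) {
        memcpy(dst, &qc, 8);   /* stands in for one 8-byte store */
        dst += 8;
        count -= 8;
        st->qword_writes++;
    }
    while (count) {
        *dst++ = (uint8_t)c;
        count--;
        st->byte_writes++;
    }
}

/* The workaround: hand the routine at most 4 bytes per call, so the
 * 8-byte path is never taken. */
void model_memset_io_chunked(uint8_t *dst, int c, size_t count,
                             struct io_stats *st)
{
    while (count) {
        size_t n = count < 4 ? count : 4;
        model_memset_io(dst, c, n, st);
        dst += n;
        count -= n;
    }
}
```

For a 20-byte fill starting one byte past an aligned address, the plain model issues 12 byte stores and one 8-byte store, while the chunked variant issues 20 byte stores and no 8-byte stores.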

Coreforge commented 3 years ago

Removing the 8 byte operations from io.c seems to at least help with reading the BIOS (I haven't tried using the read BIOS instead of the image, but the first 64 bytes look ok without that repetition pattern). It crashes earlier than it does if I don't remove the 8 byte operations though, so I'll keep them in. It shouldn't be difficult to just put a loop of single byte reads into the read_bios function.

geerlingguy commented 3 years ago

@Coreforge - Thanks so much for documenting your journey! I'm still on pause doing any more testing with the GPUs on my own Pi so I can get caught up on some other things, but seeing that you actually have some sort of data making it through the pipeline to a screen is about 100x further than I'd gotten so far :D

Coreforge commented 3 years ago

I managed to get a console on the screen now (sort of). It's still only every fourth column, the rest is garbage, so you can only see the outlines where the text should be, but there is a blinking cursor. I tried to write each byte individually instead of 4 at a time, but that didn't make a difference (which makes sense since memset_io does that internally anyways). One issue was in cfb_fillrect as that uses __raw_writeq, so I replaced that with __raw_writel for now (that shouldn't cause these kinds of lines as every second pixel should be missing instead of only every fourth being written. I'll properly replace it though as this was just a quick work around). The other issue was cfb_imageblit, which only uses writel, but it locks up unless I put a short delay in the loop in fast_imageblit32 (I just put a printk. If I put a longer delay like 2ms, I can see it filling the screen if I put it to all white before, or every fourth pixel actually). It's a bit hard to see, but you can see the black missing. I might try 32bit raspberry pi os again to see if that changes anything, but I'm not too sure it will. Another option would be to read back the frame buffer and check if it is what it's supposed to be.
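
The read-back idea mentioned above could look roughly like this. It is a sketch under assumptions: a 32bpp framebuffer, the pitch given in pixels, and the read-back data already copied into normal memory (in the driver the reads would have to go through the I/O accessors). The function name fb_compare_by_phase is made up. Counting mismatches per x mod 4 would show directly whether only every fourth column holds the expected data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compares a read-back 32bpp framebuffer with the expected image.
 * bad[i] receives the number of mismatching pixels whose x coordinate
 * satisfies x % 4 == i. Returns the total number of mismatches. */
size_t fb_compare_by_phase(const uint32_t *expected, const uint32_t *readback,
                           size_t width, size_t height, size_t pitch_px,
                           size_t bad[4])
{
    assert(expected != NULL && readback != NULL && bad != NULL);
    size_t total = 0;

    for (size_t i = 0; i < 4; i++)
        bad[i] = 0;

    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            size_t idx = y * pitch_px + x;
            if (expected[idx] != readback[idx]) {
                bad[x & 3u]++;
                total++;
            }
        }
    }
    return total;
}
```

If only every fourth column were correct in memory as well as on screen, one of the four counters would stay at zero while the other three would be non-zero.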

volkertb commented 3 years ago

Wow! Great progress, man! :smiley: Not only did you stop it from locking up, it's actually displaying something now. That's a huge step towards having it work correctly.

Am I mistaken, or is this the historic moment of the first PCIe graphics card on a Raspberry Pi 4 showing any video output at all?

Thank you (and everyone else here) for your efforts!

Coreforge commented 3 years ago

Since I'm kinda out of ideas why only every fourth pixel gets data (I tried splitting up 64bit writes into two 32bit writes which didn't change anything, I have to put something between those writes though or it just hangs), I tried again to see if I can get Xorg working now. It now uses glamor instead of exa for acceleration, so that's something. It still crashes though with a segmentation fault. The Xorg output is here: https://gist.github.com/Coreforge/0aa78923cebc426c019a6ba174fb9f79

In dmesg I get a couple errors about GART, so I'll have to look into that.

[  191.653925] [drm:drm_ioctl [drm]] comm="Xorg" pid=751, dev=0xe202, auth=1, RADEON_GEM_CREATE
[  191.655181] ------------[ cut here ]------------
[  191.655187] trying to bind memory to uninitialized GART !
[  191.655367] WARNING: CPU: 3 PID: 751 at drivers/gpu/drm/radeon/radeon_gart.c:297 radeon_gart_bind+0xf8/0x108 [radeon]
[  191.655719] Modules linked in: radeon ttm i2c_algo_bit sha256_generic cfg80211 rfkill 8021q garp stp llc vc4 cec v3d drm_kms_helper gpu_sched bcm2835_codec(C) drm bcm2835_isp(C) bcm2835_v4l2(C) bcm2835_mmal_vchiq(C) v4l2_mem2mem videobuf2_vmalloc videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common videodev snd_soc_core dwc2 drm_panel_orientation_quirks raspberrypi_hwmon snd_compress snd_pcm_dmaengine roles mc snd_pcm vc_sm_cma(C) snd_timer snd syscopyarea sysfillrect rpivid_mem sysimgblt fb_sys_fops backlight uio_pdrv_genirq uio squashfs i2c_dev ip_tables x_tables ipv6
[  191.663036] CPU: 3 PID: 751 Comm: Xorg Tainted: G     U  WC        5.10.2-v8+ #36
[  191.663039] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[  191.663044] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[  191.663095] pc : radeon_gart_bind+0xf8/0x108 [radeon]
[  191.663244] lr : radeon_gart_bind+0xf8/0x108 [radeon]
[  191.663358] sp : ffffffc011e0b800
[  191.663360] x29: ffffffc011e0b800 x28: ffffffc008fe5200 
[  191.663366] x27: ffffffc011e0baa8 x26: ffffff804370c6e8 
[  191.663372] x25: ffffff804370c700 x24: ffffff8042360000 
[  191.663379] x23: ffffff804b454078 x22: ffffff804370c6e8 
[  191.663384] x21: ffffffc011e0b990 x20: 0000000000000007 
[  191.663391] x19: ffffff804370c000 x18: 0000000000000001 
[  191.663396] x17: 0000000000000000 x16: 0000000000000000 
[  191.663402] x15: ffffff8044860560 x14: 45475f4e4f454441 
[  191.663408] x13: 52202c313d687475 x12: 61202c3230326578 
[  191.663414] x11: ffffffc01131e060 x10: ffffffc0112c7d60 
[  191.663420] x9 : ffffffc0100e5f38 x8 : 0000000000000ae0 
[  191.663426] x7 : c00000010000b074 x6 : 0000000000000000 
[  191.663432] x5 : 0000000000000001 x4 : ffffff807fbc78d0 
[  191.663438] x3 : 0000000000000000 x2 : 0000000000000001 
[  191.663540] x1 : 0000000000000000 x0 : 0000000000000000 
[  191.663656] Call trace:
[  191.663693]  radeon_gart_bind+0xf8/0x108 [radeon]
[  191.663729]  radeon_ttm_backend_bind+0x78/0x268 [radeon]
[  191.663764]  radeon_ttm_tt_bind+0x1c/0x30 [radeon]
[  191.663871]  ttm_bo_handle_move_mem+0x308/0x318 [ttm]
[  191.663991]  ttm_bo_validate+0x144/0x158 [ttm]
[  191.664001]  ttm_bo_init_reserved+0x27c/0x358 [ttm]
[  191.664011]  ttm_bo_init+0x58/0xe0 [ttm]
[  191.664047]  radeon_bo_create+0x174/0x250 [radeon]
[  191.664083]  radeon_gem_object_create+0xbc/0x1a0 [radeon]
[  191.664214]  radeon_gem_create_ioctl+0x70/0x168 [radeon]
[  191.664385]  drm_ioctl_kernel+0xcc/0x120 [drm]
[  191.664426]  drm_ioctl+0x33c/0x418 [drm]
[  191.664568]  radeon_drm_ioctl+0x58/0xc0 [radeon]
[  191.664689]  __arm64_sys_ioctl+0xb0/0xf8
[  191.664695]  el0_svc_common.constprop.0+0x84/0x1e8
[  191.664699]  do_el0_svc+0x2c/0x98
[  191.664705]  el0_svc+0x20/0x30
[  191.664708]  el0_sync_handler+0xb0/0xb8
[  191.664712]  el0_sync+0x174/0x180
[  191.664716] ---[ end trace d7384440361571b8 ]---
[  191.664926] [drm:radeon_ttm_backend_bind [radeon]] *ERROR* failed to bind 256 pages at 0x00000000
[  191.665528] [drm:radeon_gem_object_create [radeon]] *ERROR* Failed to allocate GEM object (1048576, 2, 4096, -22)

I noticed that while I have 2048MB of VRAM, the GTT size is 1024MB. I don't know if this means anything, just throwing it out there. I also tried reading back the framebuffer and didn't get anything wrong back.

I got the BIOS working though. I just replaced the memcpy_fromio in radeon_read_bios with a loop that reads one byte at a time (still using memcpy_fromio as it would otherwise lock up. Maybe a nop between reads is enough, I'll have to try that.)

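    /* Copy the ROM one byte at a time. A single large memcpy_fromio
     * uses 8-byte reads, which seemed to garble the BIOS data here. */
    int pos;
    for (pos = 0; pos < size; pos++) {
        memcpy_fromio(rdev->bios + pos, bios + pos, 1);
    }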

valpackett commented 3 years ago

I just replaced the memcpy_fromio in radeon_read_bios with a loop that reads one byte at a time

Interesting. Can you try removing the while (count >= 8) section from memcpy_fromio itself instead?
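
What that suggestion amounts to can be modelled in user space as follows. This is only a sketch: model_memcpy_fromio, its allow_qword flag and struct read_stats are invented for the illustration, and ordinary memory stands in for the ROM BAR. With allow_qword set to false, the copy behaves as if the while (count >= 8) section had been removed, so every read is a single byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct read_stats {
    size_t byte_reads;   /* 1-byte reads issued */
    size_t qword_reads;  /* 8-byte reads issued */
};

/* Model of a copy from I/O memory: bytes until the source is 8-byte
 * aligned, then (optionally) 8-byte reads, then bytes for the tail. */
void model_memcpy_fromio(uint8_t *to, const uint8_t *from, size_t count,
                         bool allow_qword, struct read_stats *st)
{
    assert(to != NULL && from != NULL && st != NULL);

    while (count && ((uintptr_t)from & 7u)) {
        *to++ = *from++;
        count--;
        st->byte_reads++;
    }
    while (allow_qword && count >= 8) {
        memcpy(to, from, 8);   /* stands in for one 8-byte read */
        to += 8;
        from += 8;
        count -= 8;
        st->qword_reads++;
    }
    while (count) {
        *to++ = *from++;
        count--;
        st->byte_reads++;
    }
}
```

Compared with the per-byte loop in radeon_read_bios, changing the routine itself would affect every caller, which may or may not be what is wanted while debugging.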
