geerlingguy / raspberry-pi-pcie-devices

Raspberry Pi PCI Express device compatibility database
http://pipci.jeffgeerling.com
GNU General Public License v3.0
1.52k stars 135 forks

Test Pine64 SOQuartz on CM4 boards #336

Open geerlingguy opened 2 years ago

geerlingguy commented 2 years ago

As with the Radxa CM3, I would also like to test Pine64's SOQuartz with some CM4 boards, since it's supposed to be pin-compatible.

DSC04914

DSC04917

@timonsku mentioned that the Wiki (linked above) and this dtb artifact are the two best ways to get started with it. I'd like to write up my experience trying to get the thing to boot, and also see whether it fits and works in a few popular CM4 boards (starting with the official IO Board).

geerlingguy commented 2 years ago

IMG_0107

So I can get this thing to boot... at least somewhat:

  1. I flashed this dtb image to a microSD card using Balena Etcher.
  2. I popped that microSD card and the SOQuartz onto a CM4 IO Board.
  3. I plugged in the board to my Mac via UART.
  4. I plugged my HDMI display into HDMI0.
  5. I plugged in power to the board.

Using screen /dev/tty.usbserial-0001 115200 I got gibberish. Also tried 1500000 (which seemed to be a default for the RK3566?), and 9600, and still got gibberish. But it was getting data, which means something's flowing through.

On the HDMI display, after a few seconds, I see a flashing cursor. So that's impressive, I guess :D

IMG_0106

geerlingguy commented 2 years ago

Also tested UART with an FT232R adapter, but that got nothing at all (not even gibberish) at 1.5 Mbps; see this comment: https://github.com/geerlingguy/raspberry-pi-pcie-devices/issues/327#issuecomment-990317145

S199pWa1k9r commented 2 years ago

If you split the SD card installation image into its parts and extract the DTB file from it, you can convert it to a DTS file:

dtc -I dtb -O dts rk3566-soquartz-cm4.dtb > rk3566-soquartz-cm4.dts

You will then see lines like this in it:

chosen { stdout-path = "serial2: 1500000n8"; };

This means that serial line 2 is used (from the point of view of RADXA CM3) at a speed of 1500000. If you do not see normal output, there are two possibilities.

  1. You connected the cable to a different serial port rather than serial2, and therefore you see garbage. The on-board console is the RPi CM4 console, but you need to use the console from the Radxa CM3's point of view.

  2. Your console cable is picking up noise (too long), or the FT232R chip is not genuine.
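To double-check which port and speed the image expects, the stdout-path value can be split into its parts. This is a minimal sketch, assuming the value has the form alias:baud followed by a parity letter and a data-bit count, as in the line quoted above; `parse_stdout_path` is a hypothetical helper name, not part of any tool.

```python
import re


def parse_stdout_path(value):
    """Split a device-tree stdout-path value into (alias, baud, parity, bits)."""
    # Tolerate whitespace around the colon, as in the line quoted above.
    match = re.fullmatch(r"\s*(\w+)\s*:\s*(\d+)([noe])(\d)\s*", value)
    if match is None:
        raise ValueError(f"unrecognized stdout-path: {value!r}")
    alias, baud, parity, bits = match.groups()
    return alias, int(baud), parity, int(bits)


if __name__ == "__main__":
    print(parse_stdout_path("serial2: 1500000n8"))
```

The terminal program then has to be set to the same baud rate, parity and data bits, on the pins that carry that serial port.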

The original FT232R cable works for me with the Quartz64-A.

I usually use cables based on the CH340B chip, but there are even more fakes of those. You are doing everything right, you just need to find the error.

Good luck

geerlingguy commented 2 years ago

@S199pWa1k9r - Thanks for the pointers! For now, I have to finish up a couple other things before the weekend, so I'll just hold off and hope the CH340 adapter is genuine and works out of the box (might also try with CoolTerm).

S199pWa1k9r commented 2 years ago

Maybe you need to use the GPIO14-TX, GPIO15-RX pins to connect to the UART console?

It is written here https://files.pine64.org/doc/quartz64/SOQuartz%20Connector%20Pin%20Assignments%20ver%201.0.ods

geerlingguy commented 2 years ago

Aha! At @hipboi's suggestion, I installed CoolTerm—it works fine with my existing USB-UART adapter at 1.5 Mbps, and now I can see the SOQuartz's output just fine.

I noticed during boot it paused for about 5 seconds on this prompt, then auto-selected #1 ("Buildroot-recovery"):

FIT: No FIT image
Could not find misc partition
ANDROID: reboot reason: "(none)"
optee api revision: 2.0
TEEC: Waring: Could not find security partition
Not AVB images, AVB skip
No valid android hdr
Android image load failed
Android boot failed, error -1.
switch to partitions #0, OK
mmc1 is current device
Scanning mmc 1:5...
Found /extlinux/extlinux.conf
Retrieving file: /extlinux/extlinux.conf
reading /extlinux/extlinux.conf
812 bytes read in 4 ms (198.2 KiB/s)
Quartz64 Installer
1:.Buildroot-recovery
2:.Debian-Installer
3:.Boot Root SDMMC
4:.Boot Root eMMC
Enter choice: 1:.Buildroot-recovery

So I rebooted, then made sure to press 2, then Enter to get into the Debian installer.

heh... and now the data coming back expects a full terminal with colors and such:

[47m .[31m[!!] Select a language.[30m .(0tqqqqqqqqqqqqqqqqqqqqqqqqk.[6;3Hx.[0m.(B.[30m.[47m                                                                         .(0x.[0m.(B.[1m.[37m.[40m .[7;3H.[0m.(0.[30m.[47mx.[0m.(B.[30m.[47m Choose the languagsed for the installation process. The        .(0x.[0m.(B.[1m.[3740m .[8;3H.[0m.(0.[30m.[47mx.[0m.(B.[30m.[47m selected language lso be the default language for the installed   .(0x.[0m.(B.[1m.7m.[40m .[9;3H.[0m.(0.[30m.[47mx.[0m.(B.[30m.[47m system.                                                                 .(0x.[0m.(B.[1m.[37m.[40m .[10;3H.[0m.(0.[30m.[47mx.[0m.(B.[30m.[47m                                                                  .(0x.[0m.(B.[1m.[37m.[40m .[11;3H.[0m.(0.[30m.[47mx.[0m.(B.[30m.anguage:                                                            .(0x.[0m.(B.[1m.[37m.[40m .[12;3H.[0m.(0.[30m.[47mx.[0m.(B.[30m.[47m                                                                      .(0x.[0m.(B.[1m.[37m.[40m .[13;3H.[0m.(0.[30m.[47mx.[0m.(B.[30m.[47m                               C                                         .(0x.[0m.(B.[1m.[37m.[40m .[14;3H.[0m.(0.[30m.[47mx.[0m.(B.[30m.[47m                               .[7m.[41mEnglish.[30m.[47m                                   .(0x.m.(B.[1m.[37m.[40m .[15;3H.[0m.(0.[30m.[47mx.[0m.(B.[30m.[47m                                                                         .(0x.[0m.(B.[1m.[37m.[40m .[16;3H.[0m.(0.[30m.[47mx.[0m.(B.[30m.[47m     <Go Back>                                                          .(0x.[0m.(B.[1m.[37m.[40m .[17;3H.[0m.(0.[30m.                                         .(0x.[0m.(B.[1m.[37m.[4qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqj.[0m.(B.[1m.                                              
.[5B.[44m<Tab> mo.[14;35H.[H.[0m.[7m.[m.[32m.[40m[           .[37m.[31m (.[37m1*installer.[31m).[37m  2 shell  3 shell  4- log           .[32m][.[34m Jan 01 .[37m 0:00 .[32m].[14;35H.[0m
geerlingguy commented 2 years ago

Working through the installer with this formatting is ... fun. But I seem to have gotten most of the way through so far. This was a funny little tidbit:

Screen Shot 2021-12-15 at 12 33 56 PM

iu

Though I'm hitting an error deeper in:

No kernel modules were found. This probably is due to a mismatch between the kernel used by this version of the installer and the kernel version available in the archive.

Going to put this on hold until I can figure out a way to interact via TTY that doesn't make my brain hurt in serial console... is there a way?

timonsku commented 2 years ago

I use PuTTY for this; it gave me color menus and everything.

miguemely commented 2 years ago

Curious, since this is such a non-standard baud rate... could using the "Arduino as a TTL" trick still work here?

timonsku commented 2 years ago

I went through the installation, by the way. It did not yield a working system: the bootloader is missing, and I need to call the init system manually. I have not figured that out yet.

geerlingguy commented 2 years ago

@miguemely - Yeah, or a Pico or Pi probably, but as it turns out CoolTerm supports 1.5 Mbps with the cheap USB UART adapter I've been using, so at this point it's a matter of CoolTerm settings so my brain doesn't hurt through the interactive installer :)

miguemely commented 2 years ago

Perfect... Time to pull the 3 SOQuartz I have off the shelf now.

miguemely commented 2 years ago

@geerlingguy Are you using GPIO15-16, or 14-15?

Edit: Using GPIO15-16, 15 connected to TX, 16 connected to RX of the Arduino, and reset held:

image

Don't I love some gibberish.

martinx72 commented 2 years ago

@geerlingguy

Here is my test with the image you mentioned above:

image

Using PuTTY on Windows 11 with an FT232B USB-to-serial dongle, GPIO14 as TX and GPIO15 as RX:

IMG_9962

If you select '2:.Debian-Installer', it starts the Debian installer like this:

image

miguemely commented 2 years ago

Alright, Arduino = Bad idea here.

https://www.amazon.com/gp/product/B08TX3KTP1/ worked natively on Windows. Same with https://www.amazon.com/gp/product/B00LZV1G6K/

image

Edit: I tried Manjaro's build but it seems stuck somewhere. I let it idle and came back to a call trace.


Retrieving file: /initramfs-linux.img
reading /initramfs-linux.img
8419052 bytes read in 699 ms (11.5 MiB/s)
Retrieving file: /Image
reading /Image
30962176 bytes read in 2559 ms (11.5 MiB/s)
append: initrd=/initramfs-linux.img root=PARTUUID=09a4ee0c-c5a3-42ca-8084-fa9c7daec648 rw earlycon=uart8250,mmio32,0xfe660000 console=tty1 console=ttyS2,1500000n8 quiet splash plymouth.ignore-serial-consoles
Retrieving file: /dtbs/rockchip/rk3566-quartz64-a.dtb
reading /dtbs/rockchip/rk3566-quartz64-a.dtb
101338 bytes read in 17 ms (5.7 MiB/s)
Fdt Ramdisk skip relocation
## Flattened Device Tree blob at 0x0a100000
   Booting using the fdt blob at 0x0a100000
   Using Device Tree in place at 000000000a100000, end 000000000a11bbd9
can't found rockchip,drm-logo, use rockchip,fb-logo
WARNING: could not set reg FDT_ERR_BADOFFSET.
failed to reserve fb-loader-logo memory
Adding bank: 0x00200000 - 0x08400000 (size: 0x08200000)
Adding bank: 0x09400000 - 0xf0000000 (size: 0xe6c00000)
Adding bank: 0x1f0000000 - 0x200000000 (size: 0x10000000)
Total: 3676.925 ms

Starting kernel ...

[    0.301092] arm-scmi firmware:scmi: Failed. SCMI protocol 22 not active.
[   60.392529] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[   60.393098] rcu:     1-...0: (0 ticks this GP) idle=4c7/1/0x4000000000000000 softirq=17/17 fqs=3001
[   60.393904]  (detected by 3, t=6002 jiffies, g=-1147, q=185)
[ 1928.221803] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 1928.221824] watchdog: BUG: soft lockup - CPU#3 stuck for 1635s! [swapper/3:0]
[ 1928.222385] rcu:     1-...0: (0 ticks this GP) idle=4c7/1/0x4000000000000000 softirq=17/17 fqs=8776
[ 1928.223019] Modules linked in:
[ 1928.223804]  (detected by 2, t=192784 jiffies, g=-1147, q=187)
[ 1928.224088] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.16.0-rc4-5-MANJARO-ARM #1
[ 1928.225264] Hardware name: Pine64 RK3566 Quartz64-A Board (DT)
[ 1928.225789] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 1928.225792]  stack:    0 pid:    1 ppid:     0 flags:0x0000000a
[ 1928.225802] pc : arch_cpu_idle+0x18/0x2c
[ 1928.226943] lr : arch_cpu_idle+0x14/0x2c
[ 1928.227643] sp : ffff800011fbbde0
[ 1928.227943] x29: ffff800011fbbde0 x28: 0000000000000000 x27: 0000000000000000
[ 1928.228595] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[ 1928.229244] x23: 0000000000000000 x22: ffff0001f0181e00 x21: 0000000000000000
[ 1928.229894] x20: ffff0001f0181e00 x19: 0000000000000000 x18: 0000000000000020
[ 1928.230542] x17: 00000000c3c0b9c4 x16: 000000002ef522c7 x15: ffff0001f026a388
[ 1928.231190] x14: 000000000000018b x13: 0000000000000001 x12: 0000000000000000
[ 1928.231841] x11: 0000000000000000 x10: 0000000000000ab0 x9 : ffff800011fbbd80
[ 1928.232491] x8 : ffff0001f0182910 x7 : ffff8000117f2080 x6 : 00000000fbfa2774
[ 1928.233139] x5 : 00000000000000c0 x4 : 000000000000896e x3 : ffff8001edf6f000
[ 1928.233788] x2 : 0000000000011585 x1 : ffff0001ff761f28 x0 : 00000000000000e0
[ 1928.234439] Call trace:
[ 1928.234666]  arch_cpu_idle+0x18/0x2c
[ 1928.234996]  default_idle_call+0x24/0x6c
[ 1928.235359]  cpuidle_idle_call+0x158/0x1ac
[ 1928.235744]  do_idle+0xa4/0xf4
[ 1928.236030]  cpu_startup_entry+0x24/0x80
[ 1928.236394]  secondary_start_kernel+0xe4/0x110
[ 1928.236805]  __secondary_switched+0x94/0x98

JustCommitRandomness commented 2 years ago

If you split the SD card installation image into its parts and extract the DTB file from it, you can convert it to a DTS file:

dtc -I dtb -O dts rk3566-soquartz-cm4.dtb > rk3566-soquartz-cm4.dts

You will then see lines like this in it:

chosen { stdout-path = "serial2: 1500000n8"; };

So, can the .dts be edited, compiled, and written back into the image to set a more conventional speed?

(not that I can plug my terminal into one of these without frying them)

miguemely commented 2 years ago

OK, after a full install and telling it to boot to SDMMC: https://www.toptal.com/developers/hastebin/uyibicurod.yaml

adminy commented 2 years ago

Any updates on getting something like a Debian OS with dropbear preinstalled, so that those of us without a UART-to-USB adapter can do this too?

Also, how is the PCIe support? Has anybody tested it? SSDs?

pgwipeout commented 2 years ago

Rockchip devices default to 1.5M baud because they cannot generate a clean standard baud rate below that, due to the clock source and limitations of the divider.

@miguemely You were running a quartz64-a image on the SoQuartz which will damage the board.

hipboi commented 2 years ago

You were running a quartz64-a image on the SoQuartz which will damage the board.

Really? Can you explain more? @pgwipeout

pgwipeout commented 2 years ago

The Quartz64 and SoQuartz have different PMICs, different regulator layouts, and different power domains. Loading an incorrect DTS is never a good idea, but specifically with the SoQuartz loading the Quartz64-A DTS leads to a situation where hardware support is half broken and the board gets very hot very quickly.

miguemely commented 2 years ago

Gotcha, ok. Yeah, that could be a problem....

That's odd, I swear I saw the Manjaro build mentioned somewhere in the wiki for the SOQuartz, which is why I tried it...

pgwipeout commented 2 years ago

It should have a SoQuartz DTB in the image, it's just set up to load the Quartz64-A because that's what the image is built for.

miguemely commented 2 years ago

Got it, I'll probably need to re-flash the SD card, grab the SoQuartz and give it another go.

pgwipeout commented 2 years ago

Also, the SoQuartz dtb is specifically for a CM4 carrier module, there shouldn't be much difference with other carrier modules, but the CM4 is the only one I have to test against.

miguemely commented 2 years ago

That's the only carrier module I have as well. I just put this on the back-burner since I finally was able to get a hold of two CM4s, but have 3-4 of the SoQuartz that I would love to start tinkering with again.

pgwipeout commented 2 years ago

You can reach out to me on any of the Pine64 hosted services in the Quartz64 room if you have any questions.

miguemely commented 2 years ago

Alright. So, with that information, and some digging since I'm very new at this: when you flash the Manjaro image, you need to modify the extlinux.conf file to use the correct DTB, since it's using the quartz64-a DTB by default. It should read something like:

-snip-
fdt /dtbs/rockchip/rk3566-soquartz-cm4.dtb
-snip-

Once done, take the SD card out, plop it into a CM4 carrier board, boot it up (if you're using PuTTY, make sure you turn off Flow Control), and log in as root. It should drop you into the install. Follow the prompts, and you should have a working Manjaro.
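For anyone who prefers to script the extlinux.conf change, here is a minimal sketch of the edit. It assumes the file has a single-line `fdt <path>` entry as in the snippet above; the function name `set_fdt` and the sample file contents are illustrative, not copied from the real Manjaro image.

```python
import re


def set_fdt(conf_text, dtb_path):
    """Return conf_text with every 'fdt <path>' line pointing at dtb_path."""
    # Match only spaces/tabs around the keyword so blank lines are left alone.
    pattern = re.compile(r"^([ \t]*)(fdt)[ \t]+\S+[ \t]*$",
                         re.IGNORECASE | re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{m.group(2)} {dtb_path}", conf_text)
    if count == 0:
        raise ValueError("no fdt line found in extlinux.conf text")
    return new_text


# Hypothetical sample, shaped like a typical extlinux.conf entry.
SAMPLE = """label Manjaro ARM
    kernel /Image
    fdt /dtbs/rockchip/rk3566-quartz64-a.dtb
    append console=ttyS2,1500000n8
"""

if __name__ == "__main__":
    print(set_fdt(SAMPLE, "/dtbs/rockchip/rk3566-soquartz-cm4.dtb"))
```

The function works on text only, so the file on the SD card's boot partition still has to be read and written back separately, ideally after keeping a backup copy.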

Thanks @pgwipeout !

Edit: It seems the latest Manjaro minimal (Manjaro-ARM-minimal-quartz64-bsp-20220207.img.xz) does have the kernel module for the built-in wireless (AzureWave B5); however, it's erroring out somewhere:

[    8.623505] brcmfmac: F1 signature read @0x18000000=0x15264345
[    8.638136] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac43455-sdio for chip BCM4345/6
[    8.640278] brcmfmac mmc2:0001:1: Direct firmware load for brcm/brcmfmac43455-sdio.pine64,soquartz-cm4.bin failed with error -2
[    8.641498] brcmfmac mmc2:0001:1: Direct firmware load for brcm/brcmfmac43455-sdio.bin failed with error -2
[    8.643057] usbcore: registered new interface driver brcmfmac
[    9.661639] brcmfmac: brcmf_sdio_htclk: HT Avail timeout (1000000): clkctl 0x50

pgwipeout commented 2 years ago

Yes, it doesn't include the firmware for the brcmfmac43455-sdio, you'll also need a .txt configuration.
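The missing files can be pulled straight out of the kernel log. This is a small sketch, assuming the "Direct firmware load for ... failed with error -2" message format shown in the log above (error -2 is ENOENT, file not found); `missing_firmware` is a made-up helper name.

```python
import re


def missing_firmware(dmesg_text):
    """Return firmware file names whose direct load failed with -2 (ENOENT)."""
    pattern = re.compile(r"Direct firmware load for (\S+) failed with error -2")
    return pattern.findall(dmesg_text)


# Two lines copied from the log above.
LOG = """\
[    8.640278] brcmfmac mmc2:0001:1: Direct firmware load for brcm/brcmfmac43455-sdio.pine64,soquartz-cm4.bin failed with error -2
[    8.641498] brcmfmac mmc2:0001:1: Direct firmware load for brcm/brcmfmac43455-sdio.bin failed with error -2
"""

if __name__ == "__main__":
    for name in missing_firmware(LOG):
        # The kernel looks for these paths relative to /lib/firmware.
        print("/lib/firmware/" + name)
```

The driver tries the board-specific file name first and then falls back to the generic one, so providing either should satisfy the request, along with the matching .txt file.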

dferrg commented 2 years ago

Has anyone managed to make PCIe work for the soquartz? I'm testing a board I designed and everything works fine except for the PCIe, which isn't detecting the attached card.

pgwipeout commented 2 years ago

Yes @dferrg. What board are you using?

dferrg commented 2 years ago

It's my own design, based on CM4_NAS board by mebs. I connected RX, TX and CLK_REF differential pairs, with length matching for every pair, and nRESET and CLK_nREQ lines.

As it looks like the problem is with my board, and unrelated to this thread, I'll leave a link to an issue in my repo just in case anyone wants to take a look or give me a hint.

Edit: if you meant which SOQuartz board I'm using, it's a v1.1 4GB model booting from SD (no eMMC).

jcdutton commented 2 years ago

Has anyone found device tree .dts files for this board? There are multiple parts to the device tree.

  1. The Pine64 SOQuartz module itself.
  2. The motherboard the Pine64 SOQuartz is plugged into (this will vary if one uses different boards with the module).
  3. Optional device tree overlays, so one can switch features on/off.

Once one has the correct device tree files, I would expect it to work OK with almost any arm64 Linux kernel. If you are playing with .dtb files, you are fighting a losing battle.

pgwipeout commented 2 years ago

https://gitlab.com/pgwipeout/linux-next/-/blob/main/arch/arm64/boot/dts/rockchip/rk3566-soquartz.dtsi < Module DTSI

https://gitlab.com/pgwipeout/linux-next/-/blob/main/arch/arm64/boot/dts/rockchip/rk3566-soquartz-cm4.dts < CM4-IO Carrier Board DTS

And no, it won't work nicely with any kernel other than mine or ones based on mine ;) It's an adventure getting everything mainlined.

pgwipeout commented 2 years ago

IRT the rk3566 PCIe controller, it seems we have an issue with cache snooping chip wide. Despite having 1GB of addressable space, dGPUs don't like this. With the radeon based cards, I succeeded at getting a generic console, but RING 0 test fails.

Coreforge commented 2 years ago

I don't know it off of the top of my head anymore, but the cache issue should be fairly simple to get around, as the bcm2711 had the same issues (plus some other ones).

bo->flags &= ~(RADEON_GEM_GTT_WC | RADEON_GEM_GTT_UC);
bo->flags |= RADEON_GEM_GTT_UC;

Adding these two lines here might be enough, if I haven't forgotten anything. The first line might not be needed if write combining works on the RK3566.

pgwipeout commented 2 years ago

Sadly it makes no difference, either ring test 0 fails, or we faceplant hard with bo->flags |= RADEON_GEM_GTT_UC;

[    9.056553] radeon 0000:01:00.0: VRAM: 1024M 0x0000000000000000 - 0x000000003FFFFFFF (1024M used)
[    9.060744] radeon 0000:01:00.0: GTT: 1024M 0x0000000040000000 - 0x000000007FFFFFFF
[    9.066184] [drm] Detected VRAM RAM=1024M, BAR=256M
[    9.070998] [drm] RAM width 128bits DDR
[    9.078439] [drm] radeon: 1024M of VRAM memory ready
[    9.081685] [drm] radeon: 1024M of GTT memory ready.
[    9.086887] [drm] Loading TURKS Microcode
[    9.137734] [drm] Internal thermal controller with fan control
[    9.169945] [drm] radeon: dpm initialized
[    9.179102] [drm] GART: num cpu pages 262144, num gpu pages 262144
[    9.191661] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[    9.313054] [drm] PCIE GART of 1024M enabled (table at 0x0000000000162000).
[    9.313454] Unable to handle kernel paging request at virtual address ffffffc00ab13000
[    9.313469] Mem abort info:
[    9.313470]   ESR = 0x96000061
[    9.313474]   EC = 0x25: DABT (current EL), IL = 32 bits
[    9.313478]   SET = 0, FnV = 0
[    9.313480]   EA = 0, S1PTW = 0
[    9.313482]   FSC = 0x21: alignment fault
[    9.313485] Data abort info:
[    9.313486]   ISV = 0, ISS = 0x00000061
[    9.313488]   CM = 0, WnR = 1
[    9.313491] swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000003c5b000
[    9.313494] [ffffffc00ab13000] pgd=10000001fffff003, p4d=10000001fffff003, pud=10000001fffff003, pmd=1000000100aa9003, pte=006800010b6bf70f
[    9.313513] Internal error: Oops: 96000061 [#1] PREEMPT SMP
[    9.313519] Modules linked in: crct10dif_ce hci_uart btrtl btbcm radeon(+) ftdi_sio drm_ttm_helper usbserial ttm fuse ip_tables x_tables ipv6
[    9.313552] CPU: 1 PID: 267 Comm: systemd-udevd Tainted: G        W         5.17.0-rc5-00097-gccb1df4cf6b5-dirty #217
[    9.313559] Hardware name: Pine64 RK3566 Quartz64-A Board (DT)
[    9.313563] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    9.313568] pc : __memset+0x16c/0x188
[    9.313583] lr : radeon_wb_init+0x1b8/0x2a0 [radeon]
[    9.313776] sp : ffffffc00b53b6f0
[    9.313779] x29: ffffffc00b53b700 x28: 0000000000000013 x27: 0000000000000100
[    9.313789] x26: ffffffc00a34a000 x25: 0068000000000f13 x24: 0000000000008c00
[    9.313798] x23: 0000000000008b10 x22: 0000000000008a60 x21: 0000000000028200
[    9.313806] x20: ffffff8108754000 x19: 0000000000000000 x18: 0000000000009a3c
[    9.313814] x17: 0000000000009a38 x16: 0000000000009a34 x15: 0000000000009a30
[    9.313821] x14: ffffff81008a2dd1 x13: ffffff810264cd50 x12: ffffffc00ab27000
[    9.313830] x11: ffffff81fffff2b0 x10: ffffffc04ab13000 x9 : 0000000000000000
[    9.313837] x8 : ffffffc00ab13000 x7 : 0000000000000000 x6 : 000000000000003f
[    9.313844] x5 : 0000000000000040 x4 : 0000000000000000 x3 : 0000000000000004
[    9.313852] x2 : 0000000000000fc0 x1 : 0000000000000000 x0 : ffffffc00ab13000
[    9.313860] Call trace:
[    9.313862]  __memset+0x16c/0x188
[    9.313872]  evergreen_startup.part.0+0xef4/0x2554 [radeon]
[    9.314002]  evergreen_init+0x2a4/0x390 [radeon]
[    9.314130]  radeon_device_init+0x4d0/0xa40 [radeon]
[    9.314256]  radeon_driver_load_kms+0x94/0x1ac [radeon]
[    9.314383]  drm_dev_register+0xec/0x220
[    9.314392]  radeon_pci_probe+0xf8/0x17c [radeon]
[    9.314519]  local_pci_probe+0x4c/0xc0
[    9.314529]  pci_device_probe+0x1b0/0x1f0
[    9.314536]  really_probe.part.0+0xa4/0x310
[    9.314545]  __driver_probe_device+0xa0/0x150
[    9.314551]  driver_probe_device+0x4c/0x160
[    9.314556]  __driver_attach+0x104/0x1a0
[    9.314560]  bus_for_each_dev+0x7c/0xe0
[    9.314565]  driver_attach+0x30/0x40
[    9.314570]  bus_add_driver+0x15c/0x200
[    9.314575]  driver_register+0x84/0x140
[    9.314579]  __pci_register_driver+0x50/0x5c
[    9.314585]  radeon_module_init+0x78/0x1000 [radeon]
[    9.314712]  do_one_initcall+0x50/0x2a0
[    9.314720]  do_init_module+0x54/0x270
[    9.314727]  load_module+0x2050/0x28b0
[    9.314731]  __do_sys_finit_module+0xbc/0x110
[    9.314736]  __arm64_sys_finit_module+0x2c/0x40
[    9.314740]  invoke_syscall.constprop.0+0x58/0xf0
[    9.314747]  do_el0_svc+0x128/0x160
[    9.314752]  el0_svc+0x28/0xe0
[    9.314760]  el0t_64_sync_handler+0x1a8/0x1b0
[    9.314765]  el0t_64_sync+0x1a0/0x1a4
[    9.314774] Code: 91010108 54ffff4a 8b040108 cb050042 (d50b7428)
[    9.314780] ---[ end trace 0000000000000000 ]---

jcdutton commented 2 years ago

@pgwipeout

And no, it won't work nicely with any kernel other than mine or ones based on mine ;) It's an adventure getting everything mainlined.

True, the PCIe source code is not in mainline yet. I can help with getting code mainlined into the Linux kernel if you like. If your current code works at least well enough to support all the PCIe cards that already work with the Pi CM4, that should be good enough to get into the review stage.

pgwipeout commented 2 years ago

Thanks, but I don't need help with it. I just needed the phy driver to get out of review hell. It was accepted for 5.18 today, so we should be good for the next release cycle.

Coreforge commented 2 years ago

Looks like you also need to replace this memset with memset_io. There might be some other changes too that might be needed to get it fully working, but my code is a pretty big mess, so it's hard to know what's actually needed and what isn't. I'd also recommend disabling the uvd using the module parameter, as that just times out after not being able to start, wasting time, and it's not needed, unless you're trying to decode video.

jcdutton commented 2 years ago

@pgwipeout That oops is just an unaligned write. A relatively easy bug to fix.
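The "unaligned write" reading can be checked against the ESR value in the oops above. This is a minimal sketch, assuming the standard AArch64 ESR layout for data aborts (EC in bits 31:26, IL in bit 25, WnR in bit 6, fault status code in bits 5:0); the function name is made up for illustration.

```python
def decode_data_abort_esr(esr):
    """Decode the main fields of an AArch64 ESR value for a data abort."""
    ec = (esr >> 26) & 0x3F    # exception class, bits [31:26]
    il = (esr >> 25) & 0x1     # instruction length, 1 = 32-bit instruction
    iss = esr & 0x1FFFFFF      # instruction-specific syndrome, bits [24:0]
    wnr = (iss >> 6) & 0x1     # 1 = the faulting access was a write
    dfsc = iss & 0x3F          # data fault status code, bits [5:0]
    return {"ec": ec, "il": il, "iss": iss, "wnr": wnr, "dfsc": dfsc}


if __name__ == "__main__":
    fields = decode_data_abort_esr(0x96000061)  # ESR value from the oops
    for name, value in fields.items():
        print(f"{name} = {value:#x}")
```

The decoded fields agree with the kernel's own annotations in the log: EC 0x25 is a data abort at the current exception level, fault status code 0x21 is an alignment fault, and WnR = 1 marks the access as a write.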

pgwipeout commented 2 years ago

We suspect cache snooping is completely broken on rk356x. This is indicated by the ITS requiring some rather unsightly hacks to work. https://gitlab.com/pgwipeout/linux-next/-/commit/1443e24c848de7da5431ce6e62d1d6962b7de18a Rockchip has not provided any insight here unfortunately. DMA to ram doesn't work above 4GB, due to the rk356x having a 32bit interconnect. Here is the standard failure:

[    9.160745] ATOM BIOS: TURKS
[    9.164489] [drm] GPU not posted. posting now...
[    9.172809] radeon 0000:01:00.0: VRAM: 1024M 0x0000000000000000 - 0x000000003FFFFFFF (1024M used)
[    9.175683] radeon 0000:01:00.0: GTT: 1024M 0x0000000040000000 - 0x000000007FFFFFFF
[    9.178484] [drm] Detected VRAM RAM=1024M, BAR=256M
[    9.180912] [drm] RAM width 128bits DDR
[    9.183475] [drm] radeon: 1024M of VRAM memory ready
[    9.185953] [drm] radeon: 1024M of GTT memory ready.
[    9.188530] [drm] Loading TURKS Microcode
[    9.250828] [drm] Internal thermal controller with fan control
[    9.270952] [drm] radeon: dpm initialized
[    9.287221] [drm] GART: num cpu pages 262144, num gpu pages 262144
[    9.292097] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[    9.428160] [drm] PCIE GART of 1024M enabled (table at 0x0000000000162000).
[    9.428557] radeon 0000:01:00.0: WB enabled
[    9.428577] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00
[    9.428587] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c
[    9.435434] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000072118
[    9.439887] radeon 0000:01:00.0: radeon: MSI limited to 32-bit
[    9.440248] radeon 0000:01:00.0: radeon: using MSI.
[    9.440372] [drm] radeon: irq initialized.
[    9.441440] radeon 0000:01:00.0: enabling bus mastering
[    9.794807] [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
[    9.798804] radeon 0000:01:00.0: disabling GPU acceleration
[    9.808046] [drm] Radeon Display Connectors
[    9.810315] [drm] Connector 0:
[    9.812323] [drm]   DP-1
[    9.814265] [drm]   HPD4
[    9.816142] [drm]   DDC: 0x6450 0x6450 0x6454 0x6454 0x6458 0x6458 0x645c 0x645c
[    9.818552] [drm]   Encoders:
[    9.820572] [drm]     DFP1: INTERNAL_UNIPHY2
[    9.822807] [drm] Connector 1:
[    9.824909] [drm]   DVI-I-1
[    9.827083] [drm]   HPD1
[    9.829133] [drm]   DDC: 0x6460 0x6460 0x6464 0x6464 0x6468 0x6468 0x646c 0x646c
[    9.831960] [drm]   Encoders:
[    9.833979] [drm]     DFP2: INTERNAL_UNIPHY
[    9.836065] [drm]     CRT1: INTERNAL_KLDSCP_DAC1
[    9.898000] radeon 0000:01:00.0: [drm] Cannot find any crtc or sizes
[    9.903586] [drm] Initialized radeon 2.50.0 20080528 for 0000:01:00.0 on minor 2
[   10.962008] radeon 0000:01:00.0: [drm] Cannot find any crtc or sizes

pgwipeout commented 2 years ago

Also note that Nvidia cards fail too, with both Nouveau and Nvidia's closed-source driver. I did get a basic framebuffer on an Nvidia card by hacking an x86 rombar emulator into a VM and passing the PCIe controller through with some more fun hacks, but it would crash the second anything real touched it.

Coreforge commented 2 years ago

DMA to RAM over PCIe? As long as PCIe devices can access the RAM, it should work. If they can't it'll definitely be a bit more (or a lot more) complicated.

pgwipeout commented 2 years ago

Ugh, it's the DMA issue. I've forced the ram to be in the DMA range (less than 4G) and forced uncached and ring test succeeds.
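To see whether a given board is affected, one can check which memory banks the bootloader maps above 4 GiB. This is a rough sketch, assuming the "Adding bank" line format from the U-Boot output quoted earlier in this thread (that log came from a different boot, so treat it purely as sample input); `banks_above_4g` is a hypothetical helper name.

```python
import re

FOUR_GIB = 1 << 32  # upper limit of a 32-bit address range


def banks_above_4g(boot_log):
    """Return (start, end) pairs for memory banks that extend past 4 GiB."""
    pattern = re.compile(r"Adding bank: (0x[0-9a-fA-F]+) - (0x[0-9a-fA-F]+)")
    banks = [(int(start, 16), int(end, 16))
             for start, end in pattern.findall(boot_log)]
    # A bank is a problem for 32-bit DMA if any part of it lies above 4 GiB.
    return [(start, end) for start, end in banks if end > FOUR_GIB]


# Bank lines copied from the U-Boot output earlier in the thread.
LOG = """\
Adding bank: 0x00200000 - 0x08400000 (size: 0x08200000)
Adding bank: 0x09400000 - 0xf0000000 (size: 0xe6c00000)
Adding bank: 0x1f0000000 - 0x200000000 (size: 0x10000000)
"""

if __name__ == "__main__":
    for start, end in banks_above_4g(LOG):
        print(f"bank {start:#x} - {end:#x} lies above 4 GiB")
```

Any bank reported here holds memory that a 32-bit DMA master cannot reach, so buffers handed to the GPU must come from the lower banks.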

Coreforge commented 2 years ago

It might be possible to only put part of the ram into the DMA range and make sure GTT buffers are in that range, but that would be a bit more complicated again. Only 1GB or less of overlap should be enough.

pgwipeout commented 2 years ago

It's actually really quite simple. Same way to fix the ITS. I'll probably need to make all kernel allocations be in the DMA range to be safe on this board. It also means the issue is uniquely a pain in the rear on my board with 8G of ram, where 4G and 2G boards will only need to force uncached.

Now to figure out why it's not picking up any crtcs.

diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index 4f0fbf667431..ad8fa1692033 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -490,7 +490,7 @@ int radeon_wb_init(struct radeon_device *rdev)
    }

    /* clear wb memory */
-   memset((char *)rdev->wb.wb, 0, RADEON_GPU_PAGE_SIZE);
+   memset_io((char *)rdev->wb.wb, 0, RADEON_GPU_PAGE_SIZE);
    /* disable event_write fences */
    rdev->wb.use_event = false;
    /* disabled via module param */
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 56ede9d63b12..3153d04bca58 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -220,6 +220,9 @@ int radeon_bo_create(struct radeon_device *rdev,
        bo->flags &= ~RADEON_GEM_GTT_WC;
 #endif

+// bo->flags &= ~(RADEON_GEM_GTT_WC | RADEON_GEM_GTT_UC);
+   bo->flags |= RADEON_GEM_GTT_UC;
+
    radeon_ttm_placement_from_domain(bo, domain);
    /* Kernel allocation are uninterruptible */
    down_read(&rdev->pm.mclk_lock);
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 11b21d605584..e286f92e7d2d 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -507,7 +507,7 @@ static struct ttm_tt *radeon_ttm_tt_create(struct ttm_buffer_object *bo,
 #endif
    rbo = container_of(bo, struct radeon_bo, tbo);

-   gtt = kzalloc(sizeof(struct radeon_ttm_tt), GFP_KERNEL);
+   gtt = kzalloc(sizeof(struct radeon_ttm_tt), GFP_KERNEL | GFP_DMA);
    if (gtt == NULL) {
        return NULL;
    }

Coreforge commented 2 years ago

I assume you have a monitor connected?

pgwipeout commented 2 years ago

Funny, the displayport -> hdmi adaptor died.

Coreforge commented 2 years ago

I guess that explains that issue.