siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev
Mozilla Public License 2.0
5.75k stars 466 forks

Support SBC Pine64 SOQuartz #7112

Open Pythoner6 opened 1 year ago

Pythoner6 commented 1 year ago

Feature Request

Description

I'm really quite interested in trying out Talos for my small k8s cluster, but currently my only available hardware is a few SOQuartz modules (which I'm running on the SOQuartz blade breakout boards): https://wiki.pine64.org/wiki/SOQuartz

frezbo commented 1 year ago

We don't have the hardware to test. If you have the hardware, you could start by looking at the existing pine64 code as a head start. The basic requirement is upstream u-boot support for the SOQuartz.

Pythoner6 commented 1 year ago

Ah, ok. Yeah, I don't think there's upstream u-boot support for the SOQuartz at the moment (or even for the rk3566 chip it uses in general). Later this year I should be getting some Raspberry Pis, so in all likelihood I'll just end up waiting for that.

g5pw commented 1 year ago

Hey all! So, it appears we have support for rk3566 in mainline now, not sure if we have full support for soquartz or not, but I'd like to start porting the needed code to get a soquartz to boot :) Are there any guidelines/porting guides for new hardware?

frezbo commented 1 year ago

If by mainline you mean LTS, then that should be good (Talos uses the LTS kernel).

Basically you'd need u-boot support for the board and upstream kernel to provide dtbs.

pl4nty commented 11 months ago

looks like it's supported in LTS kernel v6.1.36, but patches haven't been merged to u-boot yet

pl4nty commented 11 months ago

I've written a u-boot pkg with SOQuartz CM4 support, but it needs CONFIG_TOOLS_LIBCRYPTO=y to fix build errors. I'm not sure how to design for multiple flavours either (CM4, Blade, Model A) with vendored DDR/ATF binaries. But my Talos port builds, so I'll test it tomorrow after I find a power supply

pl4nty commented 11 months ago

Got lucky with a power supply, and the SOQuartz boots. DHCPv4 fails but v6 seems fine, could be my network but I don't have any other Talos-compatible devices to test

Download image here

g5pw commented 11 months ago

I have three boards in a turingpi2, any tests you want me to run?

pl4nty commented 11 months ago

@g5pw thanks! it'd be great if you can flash a board with that image and try adding it to a cluster (or bootstrapping a new one). mine's in a Turing Pi 2 as well but I haven't tested peripherals, only ethernet and BMC

g5pw commented 11 months ago

I finally got some time to flash the image and try to boot it. The boot completes fine, and I see UART works OK. However, the boot then hangs trying to get ethernet connectivity. It looks like the system is trying to acquire an address via DHCP, but gets no response. I'm unable to get a shell to do further testing, but I could do a dump on the router to see if the packet gets through or not.

pl4nty commented 11 months ago

Sounds similar to the issue I saw, can you try getting a dump from your router? Mine doesn't support much debugging unfortunately. I'll try building an image with additional logging too
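
For anyone without a router that can capture traffic, the check discussed here (does a DHCP DISCOVER from the board ever reach the network?) can be approximated from another machine on the same L2 segment. The following is a minimal sketch, not something used in this thread: `parse_dhcp` and `listen` are hypothetical helper names, and it assumes the board's broadcasts are visible on the listening host and that nothing else is bound to UDP port 67 there.

```python
import socket
import struct

# Fixed BOOTP header is 236 bytes, followed by the 4-byte DHCP magic cookie.
DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"
MESSAGE_TYPES = {
    1: "DISCOVER", 2: "OFFER", 3: "REQUEST", 4: "DECLINE",
    5: "ACK", 6: "NAK", 7: "RELEASE", 8: "INFORM",
}


def parse_dhcp(payload: bytes):
    """Return xid, client MAC and DHCP message type, or None if not DHCP."""
    if len(payload) < 240 or payload[236:240] != DHCP_MAGIC_COOKIE:
        return None
    hlen = min(payload[2], 16)                 # hardware address length
    xid = struct.unpack("!I", payload[4:8])[0]  # transaction id
    mac = payload[28:28 + hlen].hex(":")        # chaddr starts at offset 28
    msg_type = None
    i = 240
    while i < len(payload):
        code = payload[i]
        if code == 255:        # end option
            break
        if code == 0:          # pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break
        length = payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if code == 53 and length == 1:  # option 53: DHCP message type
            msg_type = MESSAGE_TYPES.get(value[0], f"UNKNOWN({value[0]})")
        i += 2 + length
    return {"xid": xid, "mac": mac, "type": msg_type}


def listen(port: int = 67):
    """Print every DHCP message seen on the given UDP port (needs root for 67)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        while True:
            payload, addr = sock.recvfrom(2048)
            info = parse_dhcp(payload)
            if info is not None:
                print(addr, info)
```

Calling `listen()` as root while the board boots would show whether any DISCOVER carrying the board's MAC arrives. Seeing nothing would point at the transmit path (driver, PHY, or switch) rather than at the DHCP server.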

g5pw commented 11 months ago

I fiddled a bit with a capture on my router, however I see no packets going out :/

Here are some relevant logs, apparently a link is detected on the hw interface:

[   31.737348] [talos] phase meta (6/11): 1 tasks(s)
[   31.742831] [talos] error querying ethtool link state {"component": "controller-runtime", "controller": "network.LinkStatusController", "link": "enxa24b0axxxxxx", "error": "netlink receive: device or resource busy"}
[   31.764782] [talos] task reloadMeta (1/1): starting
[   31.771567] rk_gmac-dwmac fe010000.ethernet enxa24b0axxxxxx: Register MEM_TYPE_PAGE_POOL RxQ-0
[   31.784325] rk_gmac-dwmac fe010000.ethernet enxa24b0axxxxxx: PHY [stmmac-0:00] driver [Generic PHY] (irq=POLL)
Safety Features support found
[   31.821415] rk_gmac-dwmac fe010000.ethernet enxa24b0axxxxxx: IEEE 1588-2008 Advanced Timestamp
[   31.832572] rk_gmac-dwmac fe010000.ethernet enxa24b0axxxxxx: registered PTP clock
[   31.836599] [talos] META: loaded 0 keys
[   31.845719] rk_gmac-dwmac fe010000.ethernet enxa24b0axxxxxx: configuring for phy/rgmii link mode
[   31.846823] [talos] task
[   31.862364] [talos] phase meta (6/11): done, 124.993167ms
[   31.863925] rk_gmac-dwmac fe010000.ethernet enxa24b0axxxxxx: LiDRCONF(NETDEV_CHANGE): enxa24b0axxxxxx: link becomes ready

pl4nty commented 11 months ago

The patches were accepted and released in u-boot 2023.10-rc2, so I rebased and built a new image. But I'm still seeing similar logs to you, along with this:

[ 1268.815670] [talos] request/renew failed {"component": "controller-runtime", "controller": "network.OperatorSpecController", "operator": "dhcp4", "error": "unable to receive an offer: got an error while the discovery request: no matching response packet received", "link": "enx1a9c0dafc579"}

pl4nty commented 11 months ago

I rechecked the Quartz64 dev page: patches for the ethernet GMAC and Motorcomm PHY have been submitted but not accepted. I've built a new image with them but still get the same logs. I also tried IPv6 DNS/NTP via kernel params, with no success.
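
When comparing console captures between image builds, it can help to pull the structured part out of the `[talos]` lines quoted in this thread, which end in a JSON object with keys such as `controller`, `operator`, `error`, and `link`. The following is a minimal sketch that assumes the line shape seen in the logs above (kernel timestamp, `[talos]` tag, message, optional trailing JSON); `parse_talos_line` is a hypothetical helper, not part of Talos.

```python
import json
import re

# "[ 1268.815670] [talos] message text {optional json}"
LOG_LINE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+\[talos\]\s+(?P<msg>.*?)"
    r"(?:\s+(?P<fields>\{.*\}))?\s*$"
)


def parse_talos_line(line: str):
    """Split a console line into timestamp, message and JSON fields.

    Returns None for lines that are not tagged [talos] (e.g. driver output).
    """
    match = LOG_LINE.match(line)
    if match is None:
        return None
    message = match.group("msg")
    raw = match.group("fields")
    fields = {}
    if raw:
        try:
            fields = json.loads(raw)
        except json.JSONDecodeError:
            # Interleaved or truncated console output: keep the text as-is.
            message = f"{message} {raw}"
    return {
        "timestamp": float(match.group("ts")),
        "message": message,
        "fields": fields,
    }


def dhcp4_failures(lines):
    """Yield parsed entries whose JSON payload names the dhcp4 operator."""
    for line in lines:
        entry = parse_talos_line(line)
        if entry and entry["fields"].get("operator") == "dhcp4":
            yield entry
```

Feeding a saved UART capture through `dhcp4_failures` would list only the DHCPv4 operator errors with their timestamps, which makes it easier to see whether a rebuilt image changes how often they occur.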

g5pw commented 11 months ago

I'll give it another go, but it would be nice to have some kind of shell to poke around, have you tried booting with a shell as /bin/init or in systems rescue mode?

pl4nty commented 11 months ago

I wouldn't know where to start trying to get a shell, it doesn't ship with one. /dev/sdaX is missing from the BMC too so I'm not sure how to get filesystem access

pl4nty commented 10 months ago

I've spent quite a few hours on this with no progress. It would be great if someone could test with different hardware (i.e. not Turing Pi 2). I don't see a dhcp4 DHCP ACK after the REQUEST log, and v6 isn't reachable, although it briefly displays "this machine is reachable at: [address]".

pl4nty commented 7 months ago

turns out it just needed some kernel defconfigs... here's the new and working image. I'll submit PRs once these dependencies are ready:

Christos822 commented 6 months ago

Hi, is there any image for rock pi cm3?

github-actions[bot] commented 57 minutes ago

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 7 days.