Seeed-Studio / seeed-linux-dtoverlays

Device Tree Overlays for Seeed boards

Routine upgrade to Raspberry Pi OS Bullseye 6.x kernel breaks lan7800 #59

Closed Paraphraser closed 1 year ago

Paraphraser commented 1 year ago

Describe the bug

I have just done a routine:

$ sudo apt update
$ sudo apt upgrade -y

and the result is a broken reRouter CM4 1432.

Kernel change

uname -a output:

lan7800 driver fails after kernel upgrade

$ grep lan7800 /var/log/syslog
Mar 21 09:07:13 sys-rtr systemd-modules-load[152]: Failed to find module 'lan7800'

This has happened several times since I acquired the reRouter CM4 1432, so I just follow the instructions at Ethernet Ports Configuration.
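One way to confirm what systemd-modules-load is complaining about is to look for the module file under the running kernel's module tree. The following is a minimal sketch, not part of the Seeed instructions: the helper name find_module_files is made up for illustration, and it assumes the out-of-tree build installs a file named lan7800.ko (possibly compressed), while the in-kernel driver, if built as a loadable module, would be lan78xx.ko.

```shell
# List kernel-module files matching a name under a modules directory.
# $1: modules root (normally /lib/modules/$(uname -r))
# $2: module name without the .ko suffix
find_module_files() {
    find "$1" -type f \( -name "$2.ko" -o -name "$2.ko.*" \) 2>/dev/null
}

# Typical use on the router (not executed here):
#   find_module_files "/lib/modules/$(uname -r)" lan7800
#   find_module_files "/lib/modules/$(uname -r)" lan78xx
#   grep -rn lan7800 /etc/modules /etc/modules-load.d/ 2>/dev/null
```

An empty result for lan78xx does not by itself mean the driver is missing, because a driver can also be compiled into the kernel image rather than shipped as a .ko file.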

rebuilding lan7800 fails after kernel upgrade

$ git clone https://github.com/Seeed-Studio/seeed-linux-dtoverlays.git
Cloning into 'seeed-linux-dtoverlays'...
remote: Enumerating objects: 2928, done.
remote: Counting objects: 100% (498/498), done.
remote: Compressing objects: 100% (200/200), done.
remote: Total 2928 (delta 368), reused 398 (delta 298), pack-reused 2430
Receiving objects: 100% (2928/2928), 3.43 MiB | 8.27 MiB/s, done.
Resolving deltas: 100% (1568/1568), done.

$ cd seeed-linux-dtoverlays

$ sudo ./scripts/cm4_lan7800.sh
Installed: /usr/src/linux-headers-6.1.19-v8+
make: Entering directory '/usr/src/linux-headers-6.1.19-v8+'
  CC [M]  /home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.o
/home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.c: In function ‘lan78xx_init_mac_address’:
/home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.c:1955:26: warning: passing argument 1 of ‘ether_addr_copy’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
 1955 |  ether_addr_copy(dev->net->dev_addr, addr);
      |                  ~~~~~~~~^~~~~~~~~~
In file included from /home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.c:8:
./include/linux/etherdevice.h:295:40: note: expected ‘u8 *’ {aka ‘unsigned char *’} but argument is of type ‘const unsigned char *’
  295 | static inline void ether_addr_copy(u8 *dst, const u8 *src)
      |                                    ~~~~^~~
/home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.c: In function ‘lan78xx_set_mac_addr’:
/home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.c:2550:24: warning: passing argument 1 of ‘ether_addr_copy’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
 2550 |  ether_addr_copy(netdev->dev_addr, addr->sa_data);
      |                  ~~~~~~^~~~~~~~~~
In file included from /home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.c:8:
./include/linux/etherdevice.h:295:40: note: expected ‘u8 *’ {aka ‘unsigned char *’} but argument is of type ‘const unsigned char *’
  295 | static inline void ether_addr_copy(u8 *dst, const u8 *src)
      |                                    ~~~~^~~
/home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.c: In function ‘lan78xx_probe’:
/home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.c:4312:2: error: implicit declaration of function ‘netif_set_gso_max_size’; did you mean ‘netif_set_tso_max_size’? [-Werror=implicit-function-declaration]
 4312 |  netif_set_gso_max_size(netdev, LAN78XX_TSO_SIZE(dev));
      |  ^~~~~~~~~~~~~~~~~~~~~~
      |  netif_set_tso_max_size
/home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.c:4314:2: error: too many arguments to function ‘netif_napi_add’
 4314 |  netif_napi_add(netdev, &dev->napi, lan78xx_poll, LAN78XX_NAPI_WEIGHT);
      |  ^~~~~~~~~~~~~~
In file included from /home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.c:7:
./include/linux/netdevice.h:2569:1: note: declared here
 2569 | netif_napi_add(struct net_device *dev, struct napi_struct *napi,
      | ^~~~~~~~~~~~~~
/home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.c:4361:9: error: too many arguments to function ‘usb_maxpacket’
 4361 |  maxp = usb_maxpacket(dev->udev, dev->pipe_intr, 0);
      |         ^~~~~~~~~~~~~
In file included from /home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.c:10:
./include/linux/usb.h:1981:19: note: declared here
 1981 | static inline u16 usb_maxpacket(struct usb_device *udev, int pipe)
      |                   ^~~~~~~~~~~~~
/home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.c:4377:19: error: too many arguments to function ‘usb_maxpacket’
 4377 |  dev->maxpacket = usb_maxpacket(dev->udev, dev->pipe_out, 1);
      |                   ^~~~~~~~~~~~~
In file included from /home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.c:10:
./include/linux/usb.h:1981:19: note: declared here
 1981 | static inline u16 usb_maxpacket(struct usb_device *udev, int pipe)
      |                   ^~~~~~~~~~~~~
cc1: some warnings being treated as errors
make[1]: *** [scripts/Makefile.build:250: /home/pi/seeed-linux-dtoverlays/modules/lan7800/lan78xx.o] Error 1
make: *** [Makefile:2012: /home/pi/seeed-linux-dtoverlays/modules/lan7800] Error 2
make: Leaving directory '/usr/src/linux-headers-6.1.19-v8+'
Build failed: lan7800

I went through this loop of finding and fixing a bug in cm4_lan7800.sh once before - #52 - but this one is beyond me. The result is I no longer have a working driver for the lan7800.
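The compiler output above points at kernel API changes rather than a script bug: netif_set_gso_max_size is no longer declared (the compiler suggests netif_set_tso_max_size), and both netif_napi_add and usb_maxpacket now take one argument fewer than the bundled lan78xx.c passes. A rough way to check this against the installed headers is sketched below. The helper name header_mentions is hypothetical, and the header path is taken from the build log.

```shell
# Succeed if header file $1 mentions identifier $2 as a whole word.
header_mentions() {
    grep -qw -- "$2" "$1" 2>/dev/null
}

# Typical use against the headers named in the build log (not executed here):
#   hdr="/usr/src/linux-headers-$(uname -r)/include/linux/netdevice.h"
#   header_mentions "$hdr" netif_set_gso_max_size || echo "old GSO helper not declared"
#   header_mentions "$hdr" netif_set_tso_max_size && echo "TSO helper is declared"
```

If the old identifier is absent and the new one is present, the bundled source needs porting to the newer API before it can build against these headers.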

install_rpi fails

I also tried the instructions at Step 2: Install *.dtbo to see if they fared any better:

$ sudo make install_rpi
cp: cannot stat '2xMCP2517FD-overlay.dtbo': No such file or directory
make[2]: *** [Makefile:46: install] Error 1
make[1]: *** [Makefile:227: install_arch] Error 2
make: *** [Makefile:164: install_rpi] Error 2

If anything, I'd call that worse.

And, just for clarity, I retried sudo make install_rpi on a clean clone of the repo just to make sure the earlier attempt at running sudo ./scripts/cm4_lan7800.sh had not left a mess behind.

Discussion

The "obvious" solution is to downgrade Raspberry Pi OS to the previous kernel where I know lan7800 will still compile. Googling that topic arrives at the (unfortunate) answer that it isn't supported and a re-installation is in your future.

My original intention in purchasing the Seeed reRouter CM4 1432 was to go through the process of building my own router on top of Raspberry Pi OS. The notion of two Ethernet ports seemed like a good idea at the time and I got a fair way down the track, just short of putting it into production.

It is problems of this kind (what should be routine OS maintenance tasks causing breakages) that have given me pause.

If I only ever needed to recompile the driver, I could handle that. After all, the second Ethernet port still "works" without the driver, it just has asymmetric forwarding performance.

But rebuilding? It may just be me but any sentence including the word "rebuild" is never going to be high on my list of desirable features for a home router, particularly when a rebuild needs to begin with taking the router offline and opening the case.

I'm rapidly reaching the conclusion that the Seeed reRouter CM4 1432 was a nice idea in theory, shame about the practice.

Yes, I could switch to OpenWRT but that defeats my original purpose. The product claimed to run Raspberry Pi OS. It does, but only up to a point. The support for the additional features like the second Ethernet port seems to be a bit … lacking.

And, yes, I could just ignore the second Ethernet port and do everything with VLANs but then I'd be asking myself why I didn't just go with a standard Pi4 in the first place?

I realise that the last few paragraphs are editorialising and don't help solve the problem at hand. I've included them for two reasons:

  1. So that anyone currently using the Seeed reRouter CM4 1432 + Raspberry Pi OS is cautious before attempting a routine OS upgrade; and
  2. So that anyone considering the Seeed reRouter CM4 1432 + Raspberry Pi OS, and expecting the second Ethernet port to "just work", goes into the challenge with their eyes open.

To Reproduce

Included above in commands.

Expected behavior

I expected sudo ./scripts/cm4_lan7800.sh to compile and install the driver, not chuck up errors.

Screenshots

Terminal output copy/paste in the above.

Additional context

Included above (e.g. the output of sudo make install_rpi).

bwshockley commented 1 year ago

I've run into this same issue - it would be nice to get the script updated for the newer kernel.

Paraphraser commented 1 year ago

tl;dr

I don't think this is a problem any more:

  1. The lan78xx driver built into Linux is perfectly adequate.
  2. This is true for at least kernel 6.1.21-v8+ #1642.
  3. As far as the Seeed Mini-Router is concerned, I don't believe there is any need to clone this repo or run ./scripts/cm4_lan7800.sh.
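For anyone wanting to confirm which lan78xx driver their own system would use, a small sketch follows. It classifies the path printed by modinfo -n; the function name classify_module_path is made up for illustration, and the "(builtin)" string is what recent kmod versions are assumed to print for drivers compiled into the kernel image.

```shell
# Classify the filename printed by `modinfo -n <module>`.
classify_module_path() {
    case "$1" in
        "")                  echo "not found" ;;
        "(builtin)"*)        echo "built-in" ;;
        */modules/*/kernel/*) echo "in-tree module" ;;
        *)                   echo "out-of-tree or custom" ;;
    esac
}

# Typical use (not executed here):
#   classify_module_path "$(modinfo -n lan78xx 2>/dev/null)"
```

A result of "built-in" or "in-tree module" would be consistent with the kernel-supplied driver being the one in play.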

details

Way back when I first purchased the mini router, I didn't do more than skim-read the docs (yes, my bad) so I didn't notice the instructions about cloning this repo and building the driver etc.

I set up a test network and started running iperf tests. I observed:

After some mucking about:

  1. I finally read the docs carefully (🤦🏼‍♂️ - RTFMS - 🤦🏼‍♂️)
  2. Cloned this repo and ran the cm4_lan7800.sh script
  3. Found and fixed an error - see #52
  4. Built the driver
  5. Re-ran the forwarding tests.

I saw reasonable, symmetric, forwarding performance, on par with eth0. I pronounced everything good and moved on.

It was slightly annoying to have to rebuild the driver each time the kernel updated but not really a big deal.

Until this issue came along.

Another kernel update came my way in the last few days. The inability to compile the driver was still present so I decided to drill further.

The Seeed doco says:

[image: quote from Seeed mini-router documentation]

but that seemed to be at odds with what I was seeing on torvalds/linux/tree/master/drivers/net/usb which had five changes in 2022 and two so far this year.

On a hunch, I replaced lan78xx.h and lan78xx.c from this repo with the files of the same name from the torvalds/linux repo. The driver compiled first time and installed but dmesg | grep lan complained that the sanctity of the kernel was under threat (my words - I didn't make a note of the actual message). I uninstalled the driver I had just compiled.
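If the dmesg complaint was the kernel's usual notice about loading an externally built module (an assumption, since the exact message was not recorded), the same state can be read back from /proc/sys/kernel/tainted, where bit 12 (value 4096) marks an out-of-tree module. A minimal sketch, with a hypothetical helper name:

```shell
# Succeed if taint value $1 has bit 12 set
# (an externally built, "out-of-tree" module has been loaded).
taint_has_oot_module() {
    [ $(( ($1 >> 12) & 1 )) -eq 1 ]
}

# Typical use (not executed here):
#   taint_has_oot_module "$(cat /proc/sys/kernel/tainted)" \
#       && echo "an out-of-tree module has been loaded since boot"
```

The taint flag stays set until the next reboot, so uninstalling the module does not clear it.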

I was considering doing a compare/contrast of the Seeed and torvalds/linux versions with a view to trying to figure out the origin of the claim that "the left-side port will provide … a much reduced speed" but it occurred to me to first verify that statement by re-running my iperf tests:

[image: test network]

Forwarding performance was measured with iperf3. The blue and green flows indicate the transfer speeds I observed. I see nothing to complain about.
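For anyone repeating the measurement, here is a rough sketch of how the two directions might be compared. The server address 192.0.2.10 is a placeholder from the documentation range, the helper name symmetry_pct is made up, and the figures in the usage comment are illustrative, not results from this thread.

```shell
# Print the slower direction as a percentage of the faster one.
# $1, $2: throughput figures in the same unit (for example Mbit/s).
symmetry_pct() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        lo = (a < b) ? a : b
        hi = (a < b) ? b : a
        if (hi == 0) { print 0; exit }
        printf "%.0f\n", 100 * lo / hi
    }'
}

# Typical use (not executed here):
#   on a host behind one port:   iperf3 -s
#   on a host behind the other:  iperf3 -c 192.0.2.10 -t 10      # forward
#                                iperf3 -c 192.0.2.10 -t 10 -R   # reverse
#   symmetry_pct 940 470   # illustrative figures only
```

A value near 100 means the two directions are roughly symmetric; a much lower value would match the asymmetric forwarding described earlier in this issue.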

My guess is the earlier problem was fixed by:

That change is dated March 3, 2023.

It would be really useful if someone else could run their own tests and validate my findings. If it turns out that the kernel-supplied driver works, generally, then we can probably set about updating the Seeed doco.