Closed: carlosedp closed this issue 6 years ago
Confirming that the RockPro64 shares the same device ID as the other boards:
$ ls /sys/class/net/eth0/device/ -l
total 0
lrwxrwxrwx 1 root root 0 Jul 26 09:23 driver -> ../../../bus/platform/drivers/rk_gmac-dwmac
-rw-r--r-- 1 root root 4096 Jul 26 09:23 driver_override
drwxr-xr-x 3 root root 0 Jul 26 09:23 mdio_bus
-r--r--r-- 1 root root 4096 Jul 26 09:23 modalias
drwxr-xr-x 3 root root 0 Jul 26 09:23 net
lrwxrwxrwx 1 root root 0 Jul 26 09:23 of_node -> ../../../firmware/devicetree/base/ethernet@fe300000
drwxr-xr-x 2 root root 0 Jul 26 09:23 power
lrwxrwxrwx 1 root root 0 Jul 26 09:23 subsystem -> ../../../bus/platform
-rw-r--r-- 1 root root 4096 Jul 26 09:23 uevent
It's just a matter of creating a script /etc/network/if-up.d/disable_offload with:
#!/bin/sh
ETHTOOL=/sbin/ethtool
test -x $ETHTOOL || exit 0
[ -d "/sys/devices/platform/fe300000.ethernet/net/$IFACE" ] || [ -d "/sys/devices/platform/ff540000.eth/net/$IFACE" ] || [ -d "/sys/devices/platform/ff540000.ethernet/net/$IFACE" ] || exit 0
$ETHTOOL -K "$IFACE" rx off tx off
And chmod +x /etc/network/if-up.d/disable_offload
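As a possible alternative to listing every platform path by hand, the hook could key off the driver symlink shown in the listing above. This is only a sketch: the function name `is_rk_gmac` and the optional sysfs-root argument are made up for illustration, and it assumes every affected board binds its onboard Ethernet to `rk_gmac-dwmac`, which this thread only confirms for the RockPro64.

```sh
#!/bin/sh
# Hypothetical helper (not part of any existing image): succeeds when the
# interface's device is bound to the rk_gmac-dwmac platform driver.
# The second argument is an assumed override of the sysfs root so the check
# can be exercised against a fake directory tree.
is_rk_gmac() {
    iface=$1
    sysfs_root=${2:-/sys}
    link="$sysfs_root/class/net/$iface/device/driver"
    # No driver symlink means no match (e.g. virtual interfaces).
    [ -L "$link" ] || return 1
    [ "$(basename "$(readlink "$link")")" = "rk_gmac-dwmac" ]
}
```

Inside the if-up.d hook this would replace the chain of `[ -d ... ]` tests, for example `is_rk_gmac "$IFACE" || exit 0` before the ethtool call.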
@carlosedp
Great work!
Ok, I updated our Rock64 image based on ayufan 0.7.9 yesterday. Can confirm the entry already exists: https://dietpi.com/downloads/images/DietPi_Rock64-ARMv8-Stretch.7z
root@DietPi:~# cat /etc/network/if-up.d/rock64-offload
#!/bin/sh
ETHTOOL=/sbin/ethtool
test -x $ETHTOOL || exit 0
[ -d "/sys/devices/platform/ff540000.eth/net/$IFACE" ] || [ -d "/sys/devices/platform/ff540000.ethernet/net/$IFACE" ] || exit 0
$ETHTOOL -K "$IFACE" rx off tx off
In terms of adding this to patch during update, we need to be careful not to duplicate the entry, or conflict with a .deb
update that Ayufan may apply in the future.
For the sake of stability due to the above, I believe we should leave the installed setting alone (if it exists), to avoid potential conflict?
In terms of applying this for the RockPro64, I believe I have a failed board and am waiting for a reply from Pine. Until then, I'm unable to create the image. Regardless, we will use the Ayufan images as well. https://github.com/Fourdee/DietPi/issues/1812#issuecomment-412069528
This applies not only to the RockPro64 but to all RK3399 boards I tested.
I think the correct approach would be to create a second file (not the "rock64-offload" that ayufan uses) that applies the configuration to all boards that need it. The Rock64 setting could stay in place; at worst, the ethtool command would run twice.
@carlosedp
This applies not only to the RockPro64 but to all RK3399 boards I tested.
Good point!
Ok, let's see if we can patch.
The Rock64 setting could stay in place; at worst, the ethtool command would run twice.
Not efficient. I believe it would be better to check for a pre-existing setting and only add ours if none exists, with something like:
cat /etc/network/if-up.d/* | grep -m1 'rx off tx off'
# result = $ETHTOOL -K "$IFACE" rx off tx off
If you check like this, it will match the existing line for the Rock64 board, for example, and the command will then not be applied for the RK3399. I don't think Ayufan's file contains the ID for the RK3399.
I don't see a big problem with running the command twice in the worst case.
BTW, I commented on a similar issue on Ayufan's repo to point this out. https://github.com/ayufan-rock64/linux-build/issues/263
@carlosedp
I don't see a big problem with running the command twice in the worst case.
Yep, agreed, although it's best to avoid it if possible.
Ok, we could probably check for and remove all existing entries found with cat /etc/network/if-up.d/* | grep -m1 'rx off tx off', then recreate our own.
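The "find the existing entries first" step could look roughly like the sketch below. The function name `list_offload_hooks` and its directory argument are illustrative assumptions, and it only lists candidate files; whether to remove them is left to the caller, given the concern about conflicting with a future .deb update.

```sh
#!/bin/sh
# Hypothetical helper: print every hook file in the directory that already
# contains an 'rx off tx off' ethtool line. grep -l prints matching file
# names only; errors for subdirectories or an empty glob are suppressed.
list_offload_hooks() {
    hook_dir=${1:-/etc/network/if-up.d}
    grep -l -- 'rx off tx off' "$hook_dir"/* 2>/dev/null
}
```

Unlike `grep -m1` on concatenated output, this reports which file carries the setting, so a Rock64-only hook can be told apart from one that also covers RK3399.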
I'll send a commit in a few.
Just to add, this behavior doesn't apply only to the RockPro64 images that Ayufan builds. It also happens on the Firefly RK3399 and the FriendlyARM RK3399 SBCs. I believe it's a generic "problem" with Rockchip processors and 1 Gbps Ethernet.
Testing Rock64:
@carlosedp Is this correct?
Previously (0.7.9 with ayufan's conf installed):
root@DietPi:~# ethtool --show-offload eth0 | grep offload
tcp-segmentation-offload: off
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on [fixed]
tx-vlan-offload: off [fixed]
l2-fwd-offload: off [fixed]
After applying your patch:
root@DietPi:~# ethtool --show-offload eth0 | grep offload
tcp-segmentation-offload: off
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on [fixed]
tx-vlan-offload: off [fixed]
l2-fwd-offload: off [fixed]
Check the first parameters (checksumming ones):
$ sudo ethtool --show-offload eth0
Features for eth0:
rx-checksumming: off
tx-checksumming: off
tx-checksum-ipv4: off
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: off
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
tx-tcp-segmentation: off [fixed]
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp6-segmentation: off [fixed]
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]
@carlosedp
Appears to be working.
root@DietPi:~# ethtool --show-offload eth0 | grep check
rx-checksumming: off
tx-checksumming: off
tx-checksum-ipv4: off
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: off
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
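Since grepping for "offload" hides the checksumming lines, a check that looks at exactly those two features may be handy. The following is a sketch only; the function name `checksum_offload_disabled` is made up, and it assumes the `name: value` layout shown in the ethtool output above.

```sh
#!/bin/sh
# Hypothetical helper: reads `ethtool --show-offload <iface>` output on stdin
# and succeeds only if both rx-checksumming and tx-checksumming report "off".
checksum_offload_disabled() {
    awk -F': *' '
        $1 == "rx-checksumming" { rx = $2 }
        $1 == "tx-checksumming" { tx = $2 }
        END { exit !(rx ~ /^off/ && tx ~ /^off/) }
    '
}
```

Usage would be something like `ethtool --show-offload eth0 | checksum_offload_disabled && echo "workaround active"`.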
You can test the patch on any device:
dietpi-backup 1
G_DEV_1
dietpi-backup -1
Looks like I'm having this issue with large transfers on a NanoPi M4. Made a post about it here: https://unix.stackexchange.com/questions/494290/apache2-ssl-large-download-hangs.
@TCB13 Jep, NanoPi M4 also has RK3399 chip, so should be affected as well. However we shipped a fix some DietPi versions ago: https://github.com/Fourdee/DietPi/blob/master/rootfs/etc/network/if-up.d/dietpi-disable_offload
So if you are on the current version (6.19.7), your issue should have a different cause. EDIT: Ah, I just saw that your post is some years old and you use(d) an Armbian image, right?
@carlosedp @Fourdee Are you aware of any downside to disabling offloading, which is enabled by default? Perhaps we should check from time to time whether a fix has been applied, e.g. at firmware level, or apply the fix on RK3399 devices only. Currently it is theoretically applied to all devices, although not all of them may have sysfs paths matching the ones that are checked.
@MichaIng my post is a few days old, not years, and yes, I'm using the latest Armbian for the device. I'll test the proposed fix above and see if it also applies. From my experience disabling offloading usually means more CPU load; however, I'm not sure how that plays out on an ARM device.
Update: running ethtool -K eth0 rx off tx off on my NanoPi M4 (Armbian) fixed the issue as well. My device shows as /sys/devices/platform/fe300000.ethernet/net/eth0/. I didn't notice any CPU hit.
@TCB13
my post is a few days old, not years
Ah sorry, I mixed up day and year, which is great since the M4 didn't exist in 2013. I should not work that late at night...
Thanks for verifying. So this is covered by our script as well:
/sys/devices/platform/fe300000.ethernet/net/$IFACE
I think only tx-checksum-ipv6 off makes the difference, but that may be limited to my tests.
Doing rx off tx off produces a bigger diff:
$ diff -rup before.txt after.txt
--- before.txt 2019-03-17 05:49:40.999604753 +0000
+++ after.txt 2019-03-17 05:49:32.222559158 +0000
@@ -1,9 +1,9 @@
Features for rockpro64eth1g:
-rx-checksumming: on
-tx-checksumming: on
- tx-checksum-ipv4: on
+rx-checksumming: off
+tx-checksumming: off
+ tx-checksum-ipv4: off
tx-checksum-ip-generic: off [fixed]
- tx-checksum-ipv6: on
+ tx-checksum-ipv6: off
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
There are no issues with v4 checksums as far as I can tell.
@bobrik
Sadly ethtool does not have any option to limit rx/tx to IPv6 only: https://manpages.debian.org/stretch/ethtool/ethtool.8.en.html
However this would need to be verified on all RK3399 boards and we would need to be very sure, since it's network related and thus mission critical. If you find a way to disable and test only IPv6 offloading, that would be great.
I just reread the other topic and the issue does not seem to be limited to IPv6: https://github.com/ayufan-rock64/linux-build/issues/263#issuecomment-421141579
@carlosedp If you find time, could you verify that this is required for IPv4 as well?
I can definitely disable just tx-checksum-ipv6 on Debian Stretch and Linux 4.19.26:
ivan@rockpro64:~$ sudo ethtool -K rockpro64eth1g tx-checksum-ipv6 on
Cannot get device udp-fragmentation-offload settings: Operation not supported
Cannot get device udp-fragmentation-offload settings: Operation not supported
ivan@rockpro64:~$ sudo ethtool -k rockpro64eth1g > before.txt
Cannot get device udp-fragmentation-offload settings: Operation not supported
ivan@rockpro64:~$ sudo ethtool -K rockpro64eth1g tx-checksum-ipv6 off
Cannot get device udp-fragmentation-offload settings: Operation not supported
Cannot get device udp-fragmentation-offload settings: Operation not supported
ivan@rockpro64:~$ sudo ethtool -k rockpro64eth1g > after.txt
Cannot get device udp-fragmentation-offload settings: Operation not supported
ivan@rockpro64:~$ diff -rup before.txt after.txt
--- before.txt 2019-03-17 18:01:43.734325572 +0000
+++ after.txt 2019-03-17 18:01:51.229361706 +0000
@@ -3,7 +3,7 @@ rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: on
tx-checksum-ip-generic: off [fixed]
- tx-checksum-ipv6: on
+ tx-checksum-ipv6: off
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
This solves the issue when running ssh rk3399 sudo dmesg -T:
tx-checksum-ipv6: on has hiccups mid-output
tx-checksum-ipv6: off responds immediately
There is no such issue over IPv4.
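For anyone repeating this kind of before/after comparison on other boards, the changed features can be pulled out of two `ethtool -k` snapshots with a small helper. This is a sketch under assumptions: the function name `changed_features` is hypothetical, and it relies on both snapshots using the `name: value` layout shown above.

```sh
#!/bin/sh
# Hypothetical helper: given two saved `ethtool -k <iface>` snapshots, print
# each feature whose value differs, as "name: before -> after".
changed_features() {
    awk -F': *' '
        # First file: remember every feature value, keyed by the raw name.
        NR == FNR { before[$1] = $2; next }
        # Second file: report features whose value changed.
        ($1 in before) && before[$1] != $2 {
            name = $1
            sub(/^[ \t]+/, "", name)   # drop the indentation of sub-features
            print name ": " before[$1] " -> " $2
        }
    ' "$1" "$2"
}
```

Called as `changed_features before.txt after.txt`, it gives a compact list that is easier to compare across boards than a unified diff.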
@bobrik
Okay, ethtool -K $IFACE rx-checksum-ipv6 off tx-checksum-ipv6 off does the job, if required. Thanks for this. Still, other users report(ed) similar issues with IPv4, so this should ideally be verified by several users, especially since the setting otherwise does not hurt (?).
Hey @MichaIng, I always disable IPv6 for my boards and had problems with offloading with them, that's why I proposed disabling IPv4 offloading like Ayufan does in his repository (I believe Armbian guys do this as well).
I'd keep it disabled for both IPv4 and IPv6.
Hi @TCB13, I had problems like this while pushing some images to DockerHub (which goes over HTTPS), and it seems related to the offloading too.
Maybe we could try to get in touch with Rockchip, Pine64 or FriendlyArm to make this clear.
There is no such issue over IPv4.
I had the issues above with IPv4. IPv6 is disabled on the network where my device is...
@carlosedp do you know someone there? Usually FriendlyArm guys are very friendly.
Hello, can anyone try this patch? With it there is no need to disable checksum offloading, so the hardware checksum functionality is kept.
@DavidOn-356 Thanks for providing this. Does this apply to RK3399 boards or Rock64 or both?
@DavidOn-356 Thanks for providing this. Does this apply to RK3399 boards or Rock64 or both?
Both, I think, but I have only verified it on the Rock64.
But before we start efforts to hack some fix inside, is this actually still an issue from kernel 5.X on or has it been fixed in Linux upstream?
We aim to create Buster images with Linux 5.X kernel for RK3399 and all Pine64 boards.
@MichaIng I can't speak for everyone, but in Armbian with a 5.X kernel this issue still happens on NanoPi M4 and NanoPi M4v2 boards with RK3399, so I believe it should also happen in DietPi. Disabling offload fixes the issue as usual.
Maybe this can be useful for you as well? https://github.com/armbian/build/pull/1736. Apparently it looks like the issue is related to setting the MTU to 1500.
@TCB13 On DietPi, disabling offloading on these models is done automatically: https://github.com/MichaIng/DietPi/blob/master/rootfs/etc/network/if-up.d/dietpi-disable_offload Great to see that it has finally been solved kernel-wise, hence we should be able to remove the workaround on the next Armbian kernel package release, at least for images with Armbian kernel packages applied.
The Armbian kernel patch only applies to RK3399, hence the Rock64 requires testing whether this now works without disabling offloading: https://github.com/MichaIng/DietPi/issues/3051#issuecomment-577741682
@carlosedp Can you verify whether disabling offload is not required anymore with current kernel/our current testing images for Rock64+RK3399 devices: https://dietpi.com/downloads/images/testing/
Hey @MichaIng .. sorry about the delay. I need to grab one of my boards to test this (they are all turned off now). I'll let you know.
BTW, are you building the images for these boards using the Kernel built from Armbian? The DT fix is not in mainline for the boards but applied as a patch by Armbian on build.
For reference, I've submitted a patch to the upstream DTs fixing this so patching on each distro won't be needed if accepted. https://patchwork.kernel.org/patch/11389879/
@carlosedp
BTW, are you building the images for these boards using the Kernel built from Armbian?
Jep, all current/beta RK3399 images, except Odroid N1, are built with the Armbian kernel.
For reference, I've submitted a patch to the upstream DTs fixing this so patching on each distro won't be needed if accepted.
Great step, overdue. Can't believe that such a major (IMO) issue has never been addressed by RockChip, Pine64 or any other manufacturer who uses those SoCs across their SBC lineup, especially the very famous and widely used RK3399.
@rugalash Good to know that the Armbian image does not boot as well, hence it is not something that we broke: https://github.com/MichaIng/DietPi/issues/2399#issuecomment-597918840
The current NanoPC T4 image seems to boot on M4, although the one we created was based on one kernel version earlier.
The Ethernet fix has really nothing to do with the boot issue, that is what I can say for sure since we have many fine booting images with this fix in place.
I also think it's nothing to do with the Ethernet fix. The Kernel patch is already in the for-next branch... will go in 5.7.
https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip.git/log/?h=for-next
Well, we are in the ethernet checksumming issue thread and the patch is exactly to fix that. That's why I mentioned it.
@carlosedp
The Kernel patch is already in the for-next branch... will go in 5.7.
Good to know.
@rugalash Yes that slow boot at least is what we also observed. To fix slow Ethernet, please try:
rm /etc/network/if-up.d/dietpi-disable_offload
ethtool -K eth0 rx on tx on # or reboot instead
This is the workaround we had in place for the issue we are writing in here, which is not required anymore with Armbian kernels. I haven't repackaged all images yet to have this removed.
Over a usb3 to ethernet adapter I'm able to get solid throughput consistently. I've used this method to take out the possibilities of it being any supply issue or anything else really. So it can be 100% related to ethernet adapter and CPU related offloading for the specific adapter.
I've said this countless times. The issue only happens with the integrated Ethernet.
@rugalash
How exactly did rsync
connect? To a running rsync daemon on the target machine, via SFTP or SCP?
Dropbear does not include any file transfer protocol, but it automatically invokes OpenSSH SFTP and SCP binaries, if present. AFAIK after the authentication has been done, the actual file transfer should then work exactly the same as with OpenSSH server, as long as the same protocol and same binaries are used.
E.g. to enable OpenSSH SFTP+SCP via Dropbear:
apt install openssh-client # Provides the `scp` command
apt install openssh-sftp-server # Provides the `sftp-server` command
I've enabled the offload and the network seems about the same, so his patch is enabled in this image already?
Yeah, I guess you might only see a slightly reduced CPU usage on onboard Ethernet load. And yes the patch is part of the Armbian 5.X kernel which we ship with current RK3399+Rock64 beta images.
@rugalash Ah lol I guess it was too late yesterday. rsync of course calls rsync command on the remote host and not any other file transfer system.
I'll see if I can replicate the higher CPU load > limited transfer rate on Dropbear compared to OpenSSH.
Hi all, the patch has been merged into Linus' tree and will be in 5.7.
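With the fix landing in 5.7, a hook could in principle skip the workaround on new enough kernels. The sketch below is only a heuristic and rests on assumptions: the function name `kernel_at_least` is made up, and a plain version comparison misses kernels that carry the patch earlier (such as the Armbian 5.X builds mentioned above) as well as boards whose device tree was not covered by the fix.

```sh
#!/bin/sh
# Hypothetical helper: succeeds if the kernel release is at least
# <major>.<minor>. The third argument defaults to `uname -r` and exists so
# the comparison can be tried against arbitrary release strings.
kernel_at_least() {
    want_major=$1
    want_minor=$2
    release=${3:-$(uname -r)}
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of the remainder
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}
```

In the hook, `kernel_at_least 5 7 && exit 0` placed before the ethtool call would leave offloading enabled on such kernels; verifying with a real transfer test remains the safer route.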
Hi @carlosedp,
This also affects RK3288 based ARM boards. I've applied a similar patch to arch/arm/boot/dts/rk3288.dtsi and it seems to work like a charm (no more retransmissions and reset errors until now)!
I know I'm late to the game but may I ask if you have/had a reproducible way to test the fix you proposed? I'd be thrilled to submit a patch to the kernel but would like to be 100% sure it solves the problem.
Slightly unrelated, how did you find the <0x4> value? You describe it in the commit as "a safe number tested ok" but I can't really find any information about this (and what other values might be acceptable).
Cheers!
After a lot (I mean, a lot) of investigation on TCP resets, Ack errors and Dups, I found the problem that I describe while pushing a Docker image to DockerHub here: https://github.com/moby/moby/issues/37642
Boards like the RK3399 ones need TCP/UDP offloading disabled to avoid the retransmissions and reset errors. This was already implemented by Ayufan in the Rock64 and RockPro64 rootfs, and DietPi needs this too.
The configuration file below should go to /etc/network/if-up.d to disable offloading on these boards.
I got this file from Ayufan's repo, but the device "ff540000" is for the Rock64 SBC. For RK3399 boards (I checked Firefly NanoPC-T4 and FriendlyArm RK3399) the identifier is /sys/devices/platform/fe300000.ethernet/net/$IFACE. I believe both should be there.
I will check later whether the RockPro64 ID also changed.