Open bb-qq opened 1 year ago
In my case the problem occurs only when transferring large files from the NAS to the PC. Conversely, I can navigate the NAS directory structure for hours without the problem occurring.
One side note: after installation I immediately changed the MTU size to 9000 in the NAS LAN configuration. After I found the problem, I tried to reset MTU to 1500 (default), also unchecking the manual setting, but after saving this setting it still remains enabled with the value of 9000, at least as shown in the GUI. No way to reset. But it may be only a GUI issue.
Anyway, because of the evidence that only large transfers cause the problem, it might have something to do with the MTU?
MTU may have something to do with this stability issue, but there are reports of problems occurring even with MTU values of 1500.
It might possibly relate to the hardware-assisted functions of the transmission on the NIC side.
I would like to know if disabling those features by the following command will make a difference in stability.
ethtool -K eth2 tso off
ethtool -K eth2 gso off
ethtool -K eth2 sg off
Thanks, going to try.
How can I check the current values before issuing these commands?
And are these new values reversed upon NAS restart or are they persistent?
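For the question about checking the current values: `ethtool -k` (lowercase k) lists an interface's offload features. Below is a minimal sketch that filters that listing down to the three features being toggled in this thread. The function name `show_offloads` is made up for illustration, and `eth2` is only the interface name used above (it may be `eth1` or something else on your unit).

```shell
#!/bin/sh
# Filter `ethtool -k` output down to the three features discussed above.
# Reads the feature listing on stdin, so it also works on saved output.
# show_offloads is a hypothetical helper name, not part of ethtool.
show_offloads() {
    grep -E '^(tcp-segmentation-offload|generic-segmentation-offload|scatter-gather):'
}

# On the NAS (eth2 is an assumption; use the name of your USB adapter):
#   sudo ethtool -k eth2 | show_offloads
```

Running this before and after the `ethtool -K` commands shows whether each setting actually changed.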
I tested the connection with the suggested changes. It's a pity, but nothing has changed. DS218 & rtl8156 2.5
I also tried IPv6 access via https://[fe80::XXXX:XXXX:XXXX:XXXX]:5001/ and downloading the file that way; it didn't help either.
DS920+ 2.16.3-3 DSM7.x (reuploaded) lan rtd1296 (ks-is ks-714) https://ks-is.com/usb-3-1-ethernet-adapter-ks-is-ks-714?tag=2.5G
There are no problems with data transfer. Especially for a couple of hours I drove chia 100gb plots at a speed of 2.5. But there is another problem! When you pull out and put back the adapter, the driver turns off. The same goes for rebooting. Must be manually enabled in the web interface.
rtd1296 is the cpu for entry level synology, but not for DS920+.
ethtool -K eth2 tso off
ethtool -K eth2 gso off
ethtool -K eth2 sg off
Using a DS418 with a TRENDnet TUC-ET2G and the r8152 driver. This got me 2.5 Gbps speeds briefly (and for longer than it would previously hold a connection at all), but the connection ultimately shut down. It does seem like I was getting 2500 Mbps upload and only about 1000 Mbps download.
Have there been any updates for the DS418 with the TRENDnet 2.5G USB-C to RJ-45 adapter? I got this before I looked on here, and my expectation was that it was going to work. Yet I am seeing the driver issues described above. I ran the SSH command after it failed and then saw the connection under Network: connected, yet with a 169. address. I am also using a TRENDnet 5-port unmanaged 2.5G PoE+ switch with its own AC adapter. After a reboot the connection is completely gone. I had to run the RT app after rebooting, as it did not restart automatically.
After assigning a static IP, I am now showing: 2500mbps Full Duplex 1500 MTU
I will put it to test with a few file transfers small and large tomorrow when I get up.
Thanks all, it looks like disabling GSO and TSO didn't make much difference in stability.
These settings will revert after reboot. If they have any effect, please register them in the task scheduler or something so that they are configured at startup.
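Since the settings do not survive a restart, one way to re-apply them is a small script registered as a boot-time task, as suggested above. The following is only a sketch under assumptions: `disable_offloads` is a made-up helper name, `eth2` is a placeholder interface name, and the `ETHTOOL` variable exists only so the script can be dry-run with a stub.

```shell
#!/bin/sh
# Re-apply the offload settings at boot (they revert after a restart).
# ETHTOOL can be overridden for a dry run; by default the real tool is used.
ETHTOOL="${ETHTOOL:-ethtool}"

# disable_offloads is a hypothetical helper, not part of the driver package.
disable_offloads() {
    iface="$1"
    for feature in tso gso sg; do
        "$ETHTOOL" -K "$iface" "$feature" off || return 1
    done
}

# Uncomment on the NAS and run as root; eth2 is an assumption.
# disable_offloads eth2
```

Note that later comments in this thread report that switching `sg` off also switches `tso` and `gso` off, so the loop may be more than is strictly needed.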
Same here with my 218play. I tried two different adapters with 8152 chipset (none of the recommended adapters yet). After some research, I found on the internet that the error is very common. It can be seen well in the /var/log/kern.log. Unfortunately, I could not find a solution to the problem. The error occurs with large amounts of data. The connection is interrupted for about 45 seconds, the NAS is then also not accessible via ping.
This is what the kern.log looks like:
2023-01-06T20:03:46+01:00 diskstation kernel: [470548.442279] r8152 3-1:1.0 eth1: Tx timeout
2023-01-06T20:03:46+01:00 diskstation kernel: [470548.448949] r8152 3-1:1.0 eth1: Tx status -2
2023-01-06T20:03:46+01:00 diskstation kernel: [470548.453431] r8152 3-1:1.0 eth1: Tx status -2
2023-01-06T20:03:46+01:00 diskstation kernel: [470548.457911] r8152 3-1:1.0 eth1: Tx status -2
2023-01-06T20:03:46+01:00 diskstation kernel: [470548.462397] r8152 3-1:1.0 eth1: Tx status -2
2023-01-06T20:03:48+01:00 diskstation kernel: [470550.434430] r8152 3-1:1.0 eth1: get_registers -108
2023-01-06T20:03:48+01:00 diskstation kernel: [470550.439501] r8152 3-1:1.0 eth1: get_registers -71
2023-01-06T20:03:48+01:00 diskstation kernel: [470550.444479] r8152 3-1:1.0 eth1: get_registers -71
2023-01-06T20:03:48+01:00 diskstation kernel: [470550.449441] r8152 3-1:1.0 eth1: get_registers -71
2023-01-06T20:03:48+01:00 diskstation kernel: [470550.454439] r8152 3-1:1.0 eth1: get_registers -71
2023-01-06T20:03:48+01:00 diskstation kernel: [470550.459401] r8152 3-1:1.0 eth1: get_registers -71
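To check whether a unit shows the same signature, the log can be searched for the `Tx timeout` line that starts each failure. This is a minimal sketch; `count_tx_timeouts` is a made-up helper name, and the default path is simply the `/var/log/kern.log` location mentioned above.

```shell
#!/bin/sh
# Count r8152 "Tx timeout" events in a kernel log.
# count_tx_timeouts is a hypothetical helper; the log path defaults to the
# location mentioned in this thread and can be passed as the first argument.
count_tx_timeouts() {
    log="${1:-/var/log/kern.log}"
    # grep -c prints the number of matching lines (0 if none match).
    grep -c 'r8152 .*: Tx timeout' "$log"
}

# Example on the NAS:
#   count_tx_timeouts
#   grep 'r8152 .*: Tx timeout' /var/log/kern.log | tail -n 5   # most recent events
```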
I operate it at 1 Gbps, not at 2.5 Gbps. So your workaround ("May operate stably when linked at 1 Gbps") has no effect here.
Next week I will get the club 3D USB adapter with 8156 chipset. I will test it and report if the error also occurs.
As written before, there are some articles and forums about this topic, here are some of them, don't know if it could help in our environment:
https://portal.cloudunboxed.net/knowledgebase/55/How-to-fix-Realtek-USB-NIC-TX-timeout-issues.html
https://forum.odroid.com/viewtopic.php?f=212&t=45857
https://bugzilla.kernel.org/show_bug.cgi?id=198931
And by the way: I tested another adapter with 8169 chipset (together with Your 8152 driver). It worked, but with a poor performance (about 30 MB/s).
Coming back to test my DS418 with my TRENDnet 2.5GbE setup, I saw that the connection still showed as connected, but I had no ping and the port was unresponsive. I saw the MAC address but no IP address, even after several reboots, an uninstall, a reinstall, etc. I read in some other places that this has occurred, so I had to dust off my old Linux hat and found a short remedy for this. I also noticed that, regardless of my setting the MTU to 9000 in the GUI, it still shows up as MTU 1500.
sudo /etc/rc.network restart
My connection is back up, with the IP now showing and pingable.
eth2      Link encap:Ethernet  HWaddr 3C:8C:F8:60:0A:94
          inet addr:192.168.1.201  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:59735 errors:0 dropped:0 overruns:0 frame:0
          TX packets:39 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3237395 (3.0 MiB)  TX bytes:9563 (9.3 KiB)
Now I will go forward with my testing.
Yes, this is ONE way. For me it works to stop and restart the installed driver in the package center ... But this isn't a workaround as long as I won't be able to download any file from the NAS.
Just tested with Club 3D USB adapter. Test failed. Upload speed is a disaster (worst of all devices).
And downloads don't start at all.
2023-01-10T17:21:49+01:00 diskstation kernel: [806423.856275] r8152 3-1:1.0 eth1: Tx timeout
2023-01-10T17:21:49+01:00 diskstation kernel: [806423.862963] r8152 3-1:1.0 eth1: Tx status -2
2023-01-10T17:21:51+01:00 diskstation kernel: [806425.820037] r8152 3-1:1.0 eth1: get_registers -108
2023-01-10T17:21:51+01:00 diskstation kernel: [806425.825106] r8152 3-1:1.0 eth1: get_registers -71
2023-01-10T17:21:51+01:00 diskstation kernel: [806425.870979] xhci-hcd xhci-hcd.2.auto: URB transfer length is wrong, xHC issue? req. len = 4, act. len = 4294967292
So: None of my 3 different adapters do the trick.
For now I use two connections: LAN for downloads from the NAS, USB for uploads. I observe quite good upload performance, about 80-120 MB/s (of course this is not 2.5 Gbps speed, but it is more than before and might be limited by my HDDs), with this adapter: https://www.digitec.ch/de/s1/product/digitus-usb-type-c-gigabit-ethernet-adapter-25g-usb-c-usb-a-usb3130-usb-c-usb-31-netzwerkadapter-16185124 As written before: as soon as I try any download via USB, the USB connection crashes and I have to stop and restart the driver in the Package Center.
Driver version 2.15.0-10 was tested last night on a 418j, and I ran into the same issue when downloading a large file (over 500 MB) from the NAS. The download failed and the NAS did not respond to ping. In my case, reconnecting the LAN cable between the NAS and the router fixes the unresponsive state, but the download issue is repeatable. I tried linking the NAS and the PC directly with one cable and ran into the same issue. With the cable plugged into the onboard 1G port of the NAS, everything is OK.
I tested both adapters from CableMatters and ASUS, on DS220j and got the same behavior, uploads work as expected but downloads make the adapter hang. Tested with iperf3.
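For anyone who wants to reproduce the upload/download difference without involving SMB or the disks, iperf3 can test each direction separately. This is a sketch assuming an `iperf3 -s` server is running on the NAS and the script is run on the PC. `run_direction` is a made-up helper name, the 30-second duration is arbitrary, and the `IPERF3` variable exists only to allow a dry run.

```shell
#!/bin/sh
# Test one direction at a time against an `iperf3 -s` server on the NAS.
# upload   = PC sends to NAS (iperf3's default direction)
# download = NAS sends to PC (-R, reverse mode), the direction reported to hang
IPERF3="${IPERF3:-iperf3}"

# run_direction is a hypothetical helper name.
run_direction() {
    direction="$1"
    nas="$2"
    case "$direction" in
        upload)   "$IPERF3" -c "$nas" -t 30 ;;
        download) "$IPERF3" -c "$nas" -t 30 -R ;;
        *) echo "usage: run_direction upload|download <nas-address>" >&2
           return 2 ;;
    esac
}

# Example (the address is a placeholder):
#   run_direction upload 192.168.1.201
#   run_direction download 192.168.1.201
```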
Same here with DS220j. Uploading goes fine even with large files. Downloading it breaks instantaneously. Testing adapter with chipset RTL8156B.
The same problem on the DS220j, like many who have already written here. It is unstable, especially when the interface is under load. If it helps in any way, I could help with testing and even provide access to my device, for example through a mesh network.
Did someone follow these instructions named "How to fix Realtek USB NIC TX timeout issues"? https://portal.cloudunboxed.net/knowledgebase/55/How-to-fix-Realtek-USB-NIC-TX-timeout-issues.html
I am not THAT big linux specialist and hesitate to try this workaround ...
The kernel is already disabling AutoSuspend USB Power Mode:
$ cat /sys/module/usbcore/parameters/autosuspend
-1
I can give it a whirl this afternoon. I am pretty savvy with Linux. I’ll post feedback.
I looked at this. Synology OS does not use Grub. I couldn’t find any articles on how to pass kernel parameters.
I hadn’t had a chance to look at it. Grub is not supported. Is that the only way the article gives?
Yes.
I do have a question. Are you guys directing specific traffic to the USB 3 port or did you bind the port?
I have not configured anything else. In order to be able to use the higher speed at all, I use the second IP address (USB3) for uploading files (films, pictures) to my NAS from the network, and for any download, i.e. the normal retrieval of files, I use the normal LAN port.
Potentially Stupid question: Has anybody tried running the 2.5Ge adapter from a powered USB hub? I have this running perfectly on my DS720+ this way. I am wondering if this is a power issue somehow. I heard reports that the DS220j has underpowered USB ports. I would like to hear if somebody has tried it, if not, I might order the same one I am running on my DS720+ and just try it.
I just tried using a powered Dell USB dock, with the same result: uploads are fine, but downloads break. The weird thing is that internally you can still ping the 2.5GbE interface, but not from outside. After running ifconfig eth1 down and then up, it returns to normal.
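The down/up recovery described here could be scripted as a stopgap. The sketch below pings a peer through the interface and bounces the interface if there is no answer. It is only an illustration: `bounce_if_unreachable` is a made-up name, the interface and address are placeholders, the `ping` options assume an iputils-style ping, and whether an outbound ping actually fails in the hung state has not been confirmed in this thread.

```shell
#!/bin/sh
# Bring the interface down and up again if a ping sent through it fails.
# Returns 0 when the peer answered, 1 when the interface had to be bounced.
# bounce_if_unreachable is a hypothetical helper name.
bounce_if_unreachable() {
    iface="$1"
    target="$2"
    # -c 3: three probes, -W 2: wait 2 s per reply, -I: send via this interface
    if ping -c 3 -W 2 -I "$iface" "$target" >/dev/null 2>&1; then
        return 0
    fi
    ifconfig "$iface" down
    ifconfig "$iface" up
    return 1
}

# Example (run as root; eth1 and 192.168.1.1 are placeholders):
#   bounce_if_unreachable eth1 192.168.1.1 || echo "eth1 was bounced"
```

This does not address the underlying failure; it only automates the manual recovery.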
Confirmed - I too tried a powered USB dock with the same result.
I have seen people reporting that the DS218j works fine with an ASUS dongle, both for uploads and downloads. Is the DS220j so different from it? I know the DS218j is Armada38x, but still, it is kind of weird that it works so badly on our DS220j with regular RTL8156B USB adapters.
Yes. The Armada chipsets are reported to be working fine. The DS220J uses the Realtek RTD1296 chipset. Completely different from some of the other "J" models.
Affected Models using this chipset:
DS420j
DS220j
RS819
DS418
DS218
DS218play
DS118
Reference: https://kb.synology.com/en-global/DSM/tutorial/What_kind_of_CPU_does_my_NAS_have
@bb-qq How can I "Help" with this? I have Linux experience, and two other Synology models: DS720+ and a DS214Play.
I've got an ioSafe 218 (essentially a more rugged DS218). It's got 2GB of RAM - the same as many of the units which are stable, but have Intel CPUs. The driver crashes when trying to download (even fairly small ... 520MB), this is with it connected to the rear ports where the link comes up at 2.5GbE. However, if I plug it into the front port the link comes up at 1GbE and appears to be stable ... although of course that's no faster than the built-in NIC. I was able to download a 11GB file - but speed was poor (circa 35MB/s). Downloading the same file over the internal NIC (also at 1GbE) does over 100MB/s. An upload over the 2.5GbE (connected to the front port) topped out at just under 40MB/s, with it in the one of the rear ports I saw a peak of 95MB/s. So it would appear to be CPU/driver related, rather than RAM ... ?
ethtool -K eth2 tso off
ethtool -K eth2 gso off
ethtool -K eth2 sg off
actually this did make it stable for me, except the speed was dropped below 1Mbps. (for me, it is eth1 with sudo of course).
It does show as a 2.5G link, dhcp doesn't work still. (DS218 on rear usb 3.0). Perhaps it may be useful to try these settings one by one?
Hm... i take it back. switching scatter-gather off seems to also switch tso and gso off automatically. With scatter gather off, the speed drops, which seems to improve stability, but if it is run long enough, it eventually still gets stuck.
With scatter-gather on, speed is high, and switching tso or gso off does not change anything; speed stays high, but the crash is much easier to reproduce. (I have almost convinced myself to just drop $300 on a new DS220+ shell and move on).
Played a little bit more with this on the DS218. No parameters in the usbcore module made any difference (except for changing autosuspend, which causes the interface to go defunct right away if set to anything other than -1). Changing the ring settings made no difference either.
Same symptoms in the logs as in this thread: everything starts with a Tx timeout. The driver tries to send a USB reset call in response to that, and everything goes downhill from there: errors in the bulk Tx callback, and eventually not being able to read the registers. The TX timeout value is set to 5*HZ, and I wonder if that is materially different in this chipset; maybe it makes sense to bump it up for bulk frames.
Also noticed that the driver file has a slightly different line count from the 2.16.3 i had downloaded from the realtek site. I assume all changes from the source are benign, or I am mistaken and there are no changes.
I don't know whether this helps...?
Thank you all for the information you have provided. Unfortunately, I have not yet found a way to improve stability. The problem seems to be occurring at the lower layers and I have no idea where to start looking.
However, I noticed that the recently released DSM7.2beta has a new Linux kernel version.
$ head -6 ds.rtd1296-7.1/usr/local/sysroot/usr/include/linux/syno_autoconf.h
/*
*
* Automatically generated file; DO NOT EDIT.
* Linux/arm64 4.4.180 Kernel Configuration
*
*/
$ head -6 ds.rtd1296-7.2/usr/local/sysroot/usr/include/linux/syno_autoconf.h
/*
*
* Automatically generated file; DO NOT EDIT.
* Linux/arm64 4.4.302 Kernel Configuration
*
*/
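The same comparison can be done without reading the headers by eye. This sketch pulls the version number out of the "Kernel Configuration" comment line; `kernel_version` is a made-up helper name, and the two paths are the ones shown in the `head` commands above.

```shell
#!/bin/sh
# Print the kernel version recorded in a syno_autoconf.h header comment,
# i.e. the " * Linux/arm64 4.4.180 Kernel Configuration" line.
# kernel_version is a hypothetical helper name.
kernel_version() {
    sed -n 's|^ *\* *Linux/[^ ]* \([0-9][0-9.]*\) Kernel Configuration.*$|\1|p' "$1"
}

# Example, using the toolchain paths from the commands above:
#   kernel_version ds.rtd1296-7.1/usr/local/sysroot/usr/include/linux/syno_autoconf.h
#   kernel_version ds.rtd1296-7.2/usr/local/sysroot/usr/include/linux/syno_autoconf.h
```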
The kernel update is unlikely to improve anything, but if anyone has tried it, please let me know. Packages compatible with DSM 7.2 are available here. https://github.com/bb-qq/r8152/releases/tag/2.16.3-4
As mentioned in Comment #96 of the bugzilla link I also posted 5 days ago, a similar issue has been resolved when the comment author updated to Kernel 5.16 in Debian. It may be completely unrelated, but it is evidence that kernel updates sometimes may indeed solve issues like this. Unfortunately the DSM is still a long way from kernel 5.xx.
I tried with DSM 7.2 RC and the new 7.2 release of the driver. Unfortunately I must report that it was stable for copying about 3 GB before it failed in the same manner as before. As before, it is a double failure, as the USB reset command does not recover the driver state.
To be specific, I was running DSM 7.2-64551.
So I had to uninstall and reinstall the driver and now it says it failed to start. I have the latest update 7.2 which I am suspecting may have some changes that interferes with the install now. Can someone confirm? Now it keeps asking for me to repair it.
Same here. I just installed the final version of DSM 7.2, had to uninstall the old driver, and tried to install [2.16.3-4]; now it asks me to repair the driver. Do we have to execute the SSH command once more to get the new driver working?
Update: The driver works now, after applying this again:
sudo install -m 4755 -o root -D /var/packages/r8152/target/r8152/spk_su /opt/sbin/spk_su
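Before re-running the install command after a DSM update, it may be worth checking whether the helper is still in place. The sketch below only tests that the file exists with the setuid and executable bits that `-m 4755` sets; it does not check ownership, and `check_spk_su` is a made-up name.

```shell
#!/bin/sh
# Check that the spk_su helper installed by the command above is still
# present with the setuid bit (the 4 in mode 4755) and is executable.
# check_spk_su is a hypothetical helper name; ownership is not checked.
check_spk_su() {
    helper="${1:-/opt/sbin/spk_su}"
    if [ -u "$helper" ] && [ -x "$helper" ]; then
        echo "ok: $helper is setuid and executable"
    else
        echo "missing or not setuid: $helper"
        return 1
    fi
}

# Example on the NAS:
#   check_spk_su || echo "re-run the sudo install command, then repair the package"
```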
I tried the command once it failed, as the instructions called for. It appears something changed in the new release. We need an updated script. Hopefully the dev sees our conversation.
As added in my post above, it works for me now. I installed the driver with an error, executed the sudo command and tried to install the driver again successfully. Connection is up now, I will test it later on. By the way: My DSM 7.2 is the final version, not a beta or RC version.
On 7.2-64561, unfortunately, it is still broken the same way. I did download the record 5+Gb before it crashed though, but alas.
The Realtek driver on which this package is based has been updated to 2.17.1. It does not seem to contain any changes that might improve the stability with the rtd1296, but you can try it if you like. https://github.com/bb-qq/r8152/releases/tag/2.17.1
Synology DS218 DSM 7.2-64570 Update 1
r8152-rtd1296-2.17.1-1_7.2.spk
Asus USB-C2500 USB Type-A 2.5G Base-T Ethernet Adapter https://www.asus.com/networking-iot-servers/wired-networking/wired-adapters/usb-c2500/
Installation as suggested
NAS rebooted after installation
Fixed IP assigned to the USB dongle
1500 MTU
Internal 1 GbE Ethernet also connected, using a different static IP.
Tests (Samba)
UPLOAD (virtual machines)
1) 15 GB VM (1 × 15 GB file + small files)
2) 15 GB VM (1 × 15 GB file + small files)
3) 87 GB VM (1 × 55 GB file + 1 × 27 GB file + small files)
All OK, upload speed around 160-170 MB/s.
DOWNLOAD
I tested downloading many files of various sizes. Files up to 22 MB downloaded successfully, but larger files caused the network interface to lock up with no further file access, which could be resolved by stopping and restarting the driver in the Package Center.
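To narrow down the size at which downloads start to fail, a set of test files of known sizes can be generated on the NAS and downloaded one by one. This is a sketch only: `make_test_file` is a made-up helper name, and the share path and the list of sizes in the example are placeholders.

```shell
#!/bin/sh
# Create a file of exactly <size_mb> MiB filled with random data.
# make_test_file is a hypothetical helper name.
make_test_file() {
    path="$1"
    size_mb="$2"
    head -c "$((size_mb * 1048576))" /dev/urandom > "$path"
}

# Example on the NAS (path and sizes are placeholders):
#   for size in 16 22 32 64 128 512; do
#       make_test_file "/volume1/share/test_${size}MB.bin" "$size"
#   done
```

Downloading the files in increasing order shows roughly where the lock-up begins, although other reports in this thread suggest the threshold varies between units.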
I'm not sure if it could be the cause, but recently a new update of the SMB service was released.
Version: 4.15.13-0871 (2023-09-26) Fixed Issues
Fixed an issue where certain clients could cause continuous increase in SMB memory usage.
Perhaps this could have indirectly influenced it.
Hi, any update? I want to add 2.5 Gbit to my DS218play, but it seems it is still not working?
Does anyone know if this also affects the RTD1619B? If I can't get it working on my current DS220j, might I have better luck with a newer DS223j?
This issue summarizes the topic of the driver not working on the rtd1296 platform.
There are many reports of unstable operation in products using rtd1296. The typical symptoms reported are as follows
ethtool -s ethX speed 1000 duplex full
There are also no reports of stable operation.
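After forcing the link with the `ethtool -s ethX speed 1000 duplex full` command quoted above, the negotiated speed can be confirmed from the plain `ethtool ethX` output. A minimal sketch follows; `link_speed` is a made-up helper name and `eth2` is a placeholder for the adapter's interface.

```shell
#!/bin/sh
# Extract the negotiated speed from `ethtool <iface>` output on stdin,
# which contains a line such as "Speed: 2500Mb/s".
# link_speed is a hypothetical helper name.
link_speed() {
    sed -n 's/^[[:space:]]*Speed: \(.*\)$/\1/p'
}

# Example on the NAS (eth2 is an assumption):
#   sudo ethtool -s eth2 speed 1000 duplex full
#   sudo ethtool eth2 | link_speed
```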
When disconnected, there seems to be something wrong at the USB level. This may indicate that the rtd1296 SoC may have some software or hardware issues with the xHCI host controller.
I am looking for a workaround for this problem, but so far have not found it. (I am also considering providing a standard usb-cdc driver separately.) I will report here if any progress is made in the investigation.
Affected Products