clearlinux / distribution

Placeholder repository to allow filing of general bugs/issues/etc against the Clear Linux OS for Intel Architecture linux distribution

Issue with Realtek rtl8168h Ethernet Interface since last auto update #3018

Open LaurenceGough opened 10 months ago

LaurenceGough commented 10 months ago

Hello,

I have spent many hours today investigating this issue. I have a ClearLinux server on a mini PC. This PC has a Realtek rtl8168h Ethernet Interface.

Ever since what I assume was the last automatic system update, I cannot run anything that puts a medium to high load on the network interface. CPU and local processing are all fine. No other changes have been made apart from automatic ones. I can do very minor network tasks, but the second you put load on it, such as downloading a bundle or running a speed test, the connection drops.

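Testing with an Ubuntu live USB does not have the issue. Using a USB-C Ethernet adapter (same cable and port) does not have the issue. Using the live ClearLinux server bootable USB has the exact same issue. There are no related logs that I could find in the journal.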

version: 6.6.9-1394.native
firmware-version: rtl8168h-2_0.0.2 02/26/15
expansion-rom-version:
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

lsmod | grep r8169
r8169                 135168  0
mdio_devres            12288  1 r8169
libphy                225280  3 r8169,mdio_devres,realtek

01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)

Is the info on it.

Running a speedtest, or running any container that does moderate to high network traffic, causes the vast majority of pings to and from the device to drop (sometimes it's 4-5 seconds per ping) for up to 5 minutes (depending on how long the attempt is). It is 100% repeatable every time. I have confirmed everything else on the network is fine. Pings to the gateway are just fine at the same time. SSH goes down of course, so I am having to use the console.
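To put a number on the drops described above, the loss figure can be pulled out of the summary line that ping prints at the end of a run. This is only a sketch: the helper name `ping_loss_percent` is made up for illustration, it assumes the iputils-style summary ("N% packet loss"), and the address in the usage comment is a placeholder.

```shell
# Hypothetical helper: read ping output on stdin and print the packet-loss
# percentage from the summary line, for example
# "10 packets transmitted, 3 received, 70% packet loss, time 9012ms".
ping_loss_percent() {
    sed -n 's/.*[ ,]\([0-9.]*\)% packet loss.*/\1/p'
}

# Usage (placeholder address; run from another machine while the load test runs):
#   ping -c 20 192.0.2.1 | ping_loss_percent
```

Running it once while idle and once during a speed test gives two figures that can be compared directly.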

Perhaps an issue with the latest Stable 6.6.9-1394 Linux Kernel? Doing research on this issue finds nothing at all.

For reference the Ubuntu live USB is running: 6.2.0-26-generic
Driver r8169
Version 6.2.0-26-generic
Firmware version rtl8168h-2_0.0.2 02/26/15 (same)

lsmod | grep r8169
r8169 114688 0

I have followed the instructions here and various other repairs but no luck. https://github.com/clearlinux/clear-linux-documentation/blob/master/source/guides/maintenance/fix-broken-install.rst

Any help would be much appreciated.

Thanks,

Laurence

fenrus75 commented 10 months ago

once you boot and basic packets will flow, I doubt it's anything repair wise that would solve it

the one thing we recently changed as well is some network performance tuning

can you try these two things (together) and see if that makes it go away?

echo 0 > /proc/sys/net/core/busy_poll
echo 0 > /proc/sys/net/core/busy_read
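The two writes suggested above could be wrapped in a small helper that also prints each value before and after, which makes it easier to confirm the change took effect. This is only a sketch: the function name `disable_busy_polling` and its optional directory argument are made up for illustration, and writing under /proc/sys requires root.

```shell
# Hypothetical helper: set busy_poll and busy_read to 0 under a given directory.
# The directory defaults to /proc/sys/net/core; the argument exists so the
# function can be tried against a scratch directory first.
disable_busy_polling() {
    dir="${1:-/proc/sys/net/core}"
    for knob in busy_poll busy_read; do
        printf 'before: %s=%s\n' "$knob" "$(cat "$dir/$knob")"
        echo 0 > "$dir/$knob" || return 1
        printf 'after:  %s=%s\n' "$knob" "$(cat "$dir/$knob")"
    done
}

# Usage (as root):
#   disable_busy_polling
```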

LaurenceGough commented 10 months ago

Where were you 10 hours ago? ;) ;) Hehe....

Problem solved now, I've noticed these changes are not permanent and go after a reboot, so I will look at editing the sysctl.conf to add them.

Ping times are now a rock-solid <1 ms, as they should be. I must admit... I am not too keen on this network tuning... It has caused me to reset everything; unfortunately I've lost all of my BIOS settings and system tweaks, but life goes on! I believe I am on the stable build? If there is a more stable one, such as an LTS, please let me know!

Many thanks,

Laurence

(I haven't closed in case you would like me to test something else etc, please feel free to close).
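One way the sysctl approach mentioned above might look is a drop-in file under /etc/sysctl.d rather than an edit to sysctl.conf itself. This is a sketch under assumptions: the helper name `write_busy_poll_dropin` and the file name `99-busy-poll.conf` are made up for illustration. Later comments in this thread report the values still reverting after a reboot, so a drop-in on its own may not be enough on an affected release.

```shell
# Hypothetical helper: write a sysctl drop-in that pins both knobs to 0.
# The default path and file name are assumptions, not taken from the thread.
write_busy_poll_dropin() {
    conf="${1:-/etc/sysctl.d/99-busy-poll.conf}"
    mkdir -p "$(dirname "$conf")" || return 1
    cat > "$conf" <<'EOF'
net.core.busy_poll = 0
net.core.busy_read = 0
EOF
}

# Usage (as root):
#   write_busy_poll_dropin
#   sysctl --system   # reload all sysctl configuration files
```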

fenrus75 commented 10 months ago

the tuning is turning on a feature called "NAPI" .... which is supposed to help network performance under high load (well not just supposed, it does in our measurements)

however this needs device driver code and it appears the 8169 driver is buggy here

I'll patch our 8169 driver to not turn on NAPI in our next release so that this is permanent...

LaurenceGough commented 10 months ago

Many thanks for that. I am having real trouble getting these changes to stick; they keep reverting after a reboot. Would you have any tips? The system becomes unusable without these changes (when any containers are running), so I need them to stick.

Thanks again

fenrus75 commented 10 months ago

it's done by the clr-power-tweaks systemd service (could be with _ .. on my cell phone so can't easily check)

if you systemctl disable that service it'll stick
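Since the exact spelling of the unit name was uncertain above, a small filter over the `systemctl list-unit-files` output could find it before disabling. This is a sketch: the helper name `find_power_tweaks_unit` is made up for illustration, and it simply accepts either "-" or "_" as the separator. Disabling the service presumably also turns off whatever other tweaks it applies, not just the network ones.

```shell
# Hypothetical helper: read `systemctl list-unit-files` output on stdin and
# print the first unit named clr-power-tweaks.service or clr_power_tweaks.service.
find_power_tweaks_unit() {
    awk '$1 ~ /^clr[-_]power[-_]tweaks\.service$/ { print $1; exit }'
}

# Usage (as root):
#   unit=$(systemctl list-unit-files --type=service --no-legend | find_power_tweaks_unit)
#   [ -n "$unit" ] && systemctl disable "$unit"
```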

lhilden commented 10 months ago

@fenrus75 can you confirm this is fixed in 40610 (https://cdn.download.clearlinux.org/releases/40610/clear/RELEASENOTES)?

Changes in package linux (from 6.6.9-1394 to 6.6.9-1395):
     Arjan van de Ven - version bump from 6.6.9-1394 to 6.6.9-1395
     Arjan van de Ven - disable napi on realtek based on issue 3018

fenrus75 commented 10 months ago

it'll be fixed in the next -- 610 has only the first one disabled (which might be enough but might not be)

K1ngfish3r commented 10 months ago

IMG20240105132722

Decided to check out 40610; it turns out it's completely borked for me, and I can't run any sudo commands. I decided to do a fresh install and am ending up with the same result.

IMG20240105154617 IMG20240105155227

jcm-01 commented 10 months ago

Hello,

I have a mini-PC with a Realtek RTL8168 NIC and have also been struggling for a few days, and changing "busy_poll" and "busy_read" to 0 resolves the issue until the next reboot.

Yesterday evening, I tried installing 40610, and the installation environment failed to load, stopping at this point:

J5Qqu6X.png

For comparison's sake, here is 40600's installation environment, which loads successfully, albeit still with the network issue:

J5QqTGt.png

I also want to mention that I proceeded with the 40600 installation using the fix, but when I tried to implement the fix again on the first boot, the system became unusable after running "sudo systemctl restart NetworkManager". By unusable, I mean it seemed like the command made the system hang without any error messages, and using CTRL+C or CTRL+ALT+DEL did nothing.

I hope this helps.

fenrus75 commented 10 months ago

hmm this is a bit sad in that it means we can't enable NAPI by default because it breaks on the realtek nics ... even though it gives a nice perf boost for other nics :(

I'm undoing all the tuning at this point -- we are not really able to only apply this for !realtek in how we do our tuning

jcm-01 commented 10 months ago

Firstly, I want to express my gratitude for your prompt response. Although I only use Clear Linux for my home server, I truly appreciate all the time and effort invested in its development.

Now, on to the reason for my update. I noticed 40620 was released earlier, which I immediately downloaded and tested. The good news is that the installation loaded fine and completed without any hiccups. However, after the first boot, the system unfortunately stops responding quickly (in less than a minute) and, if left long enough, shows messages similar to what K1ngfish3r experienced:

J72t0Ij.png

Note that in the photo, I did try to check the network information, but the same messages are eventually shown even if I do not sign in.

fenrus75 commented 10 months ago

the next release is pending; you can go to it with

swupd update --format staging

jcm-01 commented 10 months ago

Hi,

The system hangs before swupd has time to do anything, so I attempted the update procedure you suggested from the installation environment by mounting the volume and running "swupd update --format staging --path=/mnt --statedir=/mnt/var/lib/swupd". This seemed to have finished successfully, after which I rebooted, but unfortunately, I still had the same issue.

I also noticed that I'm still on 40620, so I'm not sure if this is correct or if the update procedure failed. My assumption was that the update procedure would install 40620's successor.

Just to confirm, I downloaded and used "clear-40620-live-server.iso" to perform a clean install yesterday evening.

Thanks

K1ngfish3r commented 10 months ago

Should still be pending, https://cdn.download.clearlinux.org/releases/ doesn't seem to have any updates yet

clrlinux@clr-live~ $ sudo cryptsetup open /dev/nvme0n1p2 root
Enter passphrase for /dev/nvme0n1p2: 
clrlinux@clr-live~ $ sudo mount /dev/mapper/root /mnt
clrlinux@clr-live~ $ swupd update --format=staging --path=/mnt
Error: This program must be run as root..aborting

clrlinux@clr-live~ $ sudo !!
sudo swupd update --format=staging --path=/mnt
Update started
Version on server (40620) is not newer than system version (40620)
Update complete - System already up-to-date at version 40620

lhilden commented 10 months ago

@fenrus75 40620 is working for me--thank you! I tested it for a day and didn't run into the NIC hang issue under heavy load.

To work around this issue I initially downgraded to 40480 (with kernel 6.6.7) using sudo swupd repair -m 40480 --force and that worked great for several days while a fix was pending. Then I tested your fix by upgrading from 40480 to 40620 as follows:

$ sudo swupd update
Update started
Preparing to update from 40480 to 40620
Downloading packs for:
 - webkitgtk
 - audio-pipewire
 - dav1d-lib
 - not-ffmpeg-lib
 - desktop-gnomelibs
 - gnome-base-libs
 - LibRaw-lib
 - NetworkManager
 - aspell
 - binutils
 - bison
 - btrfs-progs
 - c-basic
 - cloud-api
 - cloud-control
 - containers-basic
 - curl
 - dev-utils
 - dnf
 - docker-compose
 - dpdk
 - editors
 - emacs
 - fontconfig
 - gnupg
 - gstreamer
 - harfbuzz-lib
 - inotify-tools
 - iptables
 - kernel-native
 - kvm-host
 - lib-opengl
 - lib-poppler
 - libX11client
 - libglib
 - libssh-lib
 - libstdcpp
 - linux-firmware
 - linux-firmware-extras
 - linux-firmware-wifi
 - linux-tools
 - llvm
 - lsof
 - mail-utils
 - minicom
 - network-basic
 - nfs-utils
 - notmuch
 - openblas
 - openssh-client
 - openssh-server
 - openssl
 - os-core
 - os-core-plus
 - os-core-update
 - package-utils
 - parallel
 - perl-basic
 - polkit
 - pypi-cython
 - pypi-numpy
 - pypi-pynacl
 - python3-basic
 - qt-basic
 - shells
 - storage-utils
 - stress-ng
 - sysadmin-basic
 - tzdata
 - vim
 - vte-lib
 [100%]

Finishing packs extraction...

Statistics for going from version 40480 to version 40620:

    changed bundles   : 65
    new bundles       : 4
    deleted bundles   : 0

    changed files     : 4264
    new files         : 16891
    deleted files     : 2657

Validate downloaded files
 [100%]

Starting download of remaining update content. This may take a while (9909 files)...
 [100%]

Installing files...
 [100%]

Update was applied
Calling post-update helper scripts
External command: none
External command: pacdiscovery.service: restarted (the binary was updated)
External command: tallow.service: restarted (the binary was updated)
External command: pacrunner.service: restarted (the binary was updated)
External command: systemd-journald.service: restarted (the binary was updated)
External command: systemd-resolved.service: restarted (the binary was updated)
External command: (Took 6 seconds)
External command: systemd-timesyncd.service: restarted (the binary was updated)
Update took 162.8 seconds, 994 MB transferred
9782 files were not in a pack
Update successful - System updated from version 40480 to version 40620

K1ngfish3r commented 10 months ago

Replying to https://github.com/clearlinux/distribution/issues/3018#issuecomment-1879461144

As per @lhilden I tested out fresh installing 40620, and it works. You should currently be on kernel 6.6.10 instead of 6.6.9 like your photo indicates

i@clr~ $ uname -r
6.6.10-1398.native
i@clr~ $ swupd info
Distribution:      Clear Linux OS
Installed version: 40620
Version URL:       https://cdn.download.clearlinux.org/update
Content URL:       https://cdn.download.clearlinux.org/update

jcm-01 commented 10 months ago

Replying to #3018 (comment)

Hello,

Thanks for pointing that out; I'm not sure why that was the case.

Following the comment by lhilden, I installed an older version (40580) and then proceeded to upgrade to 40620, which resolved all my issues.

To confirm, for some reason, a clean installation of 40620 didn't work, and neither did a clean installation of 40600 with an upgrade to 40620.