intel / Intel-Linux-Processor-Microcode-Data-Files

intel-ucode 20210608 release triggers iwlwifi Microcode SW error on 06-9e-0a #56

Open alexmurray opened 3 years ago

alexmurray commented 3 years ago

After updating to the latest release, 20210608, in Ubuntu, we received a report that it causes iwlwifi to restart constantly in a loop, which makes WiFi unusable. Please see https://bugs.launchpad.net/ubuntu/+source/intel-microcode/+bug/1931540 for the full details.

esyr-rh commented 3 years ago

The ProcCpuinfoMinimal.txt file in the report says "microcode : 0xde", which doesn't seem to be correct, as revision 0xea is expected to be used for 06-9e-0a if microcode-20210608 is used.

esyr-rh commented 3 years ago

Ah, I see "To report this bug, I have downgraded to 3.20210216.0ubuntu0.21.04.1" now, disregard the previous comment.
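For anyone comparing their own system against the numbers above: the processor signature in the issue title (06-9e-0a) is family-model-stepping written in hexadecimal, and the revision currently loaded is the "microcode" field of /proc/cpuinfo. A minimal sketch that prints both follows; the helper name cpu_signature is made up for illustration and is not an existing tool.

```sh
# Print "family-model-stepping revision" from /proc/cpuinfo-style text on stdin.
# cpu_signature is an illustrative helper name, not part of any package.
cpu_signature() {
    awk -F':' '
        {
            key = $1; val = $2
            gsub(/[ \t]+$/, "", key)   # strip padding after the field name
            gsub(/^[ \t]+/, "", val)   # strip padding before the value
        }
        key == "cpu family" { fam = val }
        key == "model"      { mod = val }
        key == "stepping"   { step = val }
        key == "microcode"  { rev = val }
        /^$/ && rev != ""   { exit }   # the first CPU block is enough
        END { printf "%02x-%02x-%02x %s\n", fam, mod, step, rev }
    '
}

# Only meaningful on x86 Linux; skipped when the file is not readable.
if [ -r /proc/cpuinfo ]; then
    cpu_signature < /proc/cpuinfo
fi
```

Keep in mind that the value reflects what is loaded right now, so it changes after a package downgrade and reboot, which is what caused the confusion above.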

whpenner commented 3 years ago

Is the microcode the only thing being upgraded/downgraded when the issue appears and disappears? Also, is there more detail on the "multiple users" affected?

BachoSeven commented 3 years ago

@whpenner I can confirm that I got microcode errors regarding iwlwifi just after the latest update to intel-ucode (version 0xea, Arch Linux, i7-7500U). This is the relevant journal log entry:

giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: regular scan timed out
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: Microcode SW error detected.  Restarting 0x2000000.
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: Start IWL Error Log Dump:
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: Status: 0x00000040, count: 6
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: Loaded firmware version: 36.ad812ee0.0 8000C-36.ucode
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000084 | NMI_INTERRUPT_UNKNOWN
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x000002F0 | trm_hw_status0
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | trm_hw_status1
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x0002438C | branchlink2
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00039C06 | interruptlink1
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000126 | interruptlink2
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | data1
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000080 | data2
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x07830000 | data3
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x56406C31 | beacon time
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x2294452A | tsf low
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000001 | tsf hi
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | time gp1
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x2294A08A | time gp2
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000001 | uCode revision type
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000024 | uCode version major
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0xAD812EE0 | uCode version minor
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000201 | hw version
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00489004 | board version
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x808BFE01 | hcmd
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00022000 | isr0
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00800000 | isr1
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x08005802 | isr2
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00400080 | isr3
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | isr4
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x808AFB03 | last cmd Id
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | wait_event
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00008E4C | l2p_control
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | l2p_duration
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | l2p_mhvalid
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | l2p_addr_match
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x0000008F | lmpm_pmg_sel
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x14100651 | timestamp
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00340010 | flow_handler
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: Start IWL Error Log Dump:
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: Status: 0x00000040, count: 7
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000070 | NMI_INTERRUPT_LMAC_FATAL
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | umac branchlink1
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0xC0086B3C | umac branchlink2
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0xC008D930 | umac interruptlink1
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0xC0083D08 | umac interruptlink2
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000800 | umac data1
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0xC0083D08 | umac data2
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0xDEADBEEF | umac data3
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000024 | umac major
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0xAD812EE0 | umac minor
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0xC088628C | frame pointer
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0xC088628C | stack pointer
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00CC010D | last host cmd
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | isr status reg
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: IML/ROM dump:
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | IML/ROM error/state
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000003 | IML/ROM data1
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: Fseq Registers:
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x2CA4494C | FSEQ_ERROR_CODE
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0xA8528020 | FSEQ_TOP_INIT_VERSION
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0xEA146062 | FSEQ_CNVIO_INIT_VERSION
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x0000A056 | FSEQ_OTP_VERSION
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0xE3B1BE88 | FSEQ_TOP_CONTENT_VERSION
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0xE3639711 | FSEQ_ALIVE_TOKEN
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0xE4F156E1 | FSEQ_CNVI_ID
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0xAD4584B4 | FSEQ_CNVR_ID
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x03000000 | CNVI_AUX_MISC_CHIP
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x0BADCAFE | CNVR_AUX_MISC_CHIP
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x0BADCAFE | CNVR_SCU_SD_REGS_SD_REG_DIG_DCDC_VTRIM
giu 13 15:33:41 hyperversum kernel: iwlwifi 0000:02:00.0: 0x0BADCAFE | CNVR_SCU_SD_REGS_SD_REG_ACTIVE_VDIG_MIRROR
hmh commented 3 years ago

Do we have any more data on this?

What I did notice is that it looks like neither reporter is using the latest iwlwifi firmware available at linux-firmware git ATM, but given the lack of changelogs, there's no way to know whether that would be relevant without actually trying the newest iwlwifi firmware to see if the problem goes away...

BachoSeven commented 3 years ago

@hmh I'm using the latest version of linux-firmware in my distribution's repos, namely 20210511.7685cf4, which I installed on the 15th of May, so I would assume that the error was triggered by the later ucode update.

Will try switching to linux-firmware-git and see what happens.

esyr-rh commented 3 years ago

it looks like neither reporter is using the latest iwlwifi firmware available at linux-firmware git ATM

Why do you say so? It seems that the iwlwifi ucode API version is 36 in both cases, and API version 36 is the latest (at least in [0]) for both 8000C[1] and 8265[2].

[0] https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/?id=0f66b74b6267fce66395316308d88b0535aa3df2 [1] https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/issues/56#issuecomment-860930517 [2] https://launchpadlibrarian.net/543132832/journal-wifi-restart

hmh commented 3 years ago

I was looking at the reported git hash (not API version) of the firmware in the logs submitted with the reports, against the one in a fresh checkout of linux-firmware, and noticed they were different.

Evidently, I may have been mistaken about it. I am looking at it again now, if I got it wrong I will hide that comment to avoid confusing things (and edit this one)...

Edit1:

$ git remote -v
origin  git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git (fetch)
origin  git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git (push)
$ git pull
Already up to date.
$ git log -- iwlwifi-8000C-36.ucode iwlwifi-8265-36.ucode
commit 56115b259807e0417f30ef84bc6d2093572e6901
Author: Luca Coelho <luciano.coelho@intel.com>
Date:   Wed Mar 10 12:18:19 2021 +0200

    iwlwifi: update 8000 family firmwares

    Build number: N/A
    Revision: ca7b901d (8000C, 8265)

    Change-Id: I3cbc45672cb501fb52a32c396463bf3c8e55eef8
    Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
...

Since firmware revision ca7b901d is different from the firmware revision in the reported logs, I assumed the two reports were using out-of-date firmware. This assumes linux-firmware in git actually has the newest available iwlwifi dump at the moment, of course.

I don't have hardware here to run that firmware and check if it reports anything different from what is in the commit log "revision", though.
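The comparison described here can be done mechanically: the driver logs a "Loaded firmware version" line, and for the 8000C logs in this thread its middle field is the revision named in the linux-firmware commit message. A rough sketch, assuming the log format shown in this thread; both function names are invented for illustration.

```sh
# Print each distinct "version file" pair that iwlwifi reported loading.
# Reads kernel log text on stdin; loaded_iwlwifi_fw is an illustrative name.
loaded_iwlwifi_fw() {
    sed -n 's/.*iwlwifi .*Loaded firmware version: //p' | sort -u
}

# Succeed if any loaded firmware version contains the given revision,
# e.g. ca7b901d. Assumes the "API.revision.build" layout seen in the
# 8000C logs above; other device families may format it differently.
fw_has_revision() {
    loaded_iwlwifi_fw | grep -q -i -- "\.$1\."
}
```

On a systemd machine this could be used as `journalctl -k | fw_has_revision ca7b901d && echo "newer firmware in use"`; with plain syslog, feed it the kernel log file instead.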

esyr-rh commented 3 years ago

Oh, you're right, revision ad812ee0 comes from the previous commit:

commit 346057dbe7c7595748e61b8a5b962e5c7316924b
Author:     Luca Coelho <luciano.coelho@intel.com>
AuthorDate: Fri Aug 23 07:35:14 2019 +0300
Commit:     Luca Coelho <luciano.coelho@intel.com>
CommitDate: Wed Oct 14 16:07:24 2020 +0300

    iwlwifi: update 3168, 7265D, 8000C and 8265 firmwares

    Build number: N/A
    Revision: 0bd893f3 (7265D, 3168)
              ad812ee0 (8000C, 8265)

    Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

Not sure why there is such an author/commit date discrepancy, though; it looks like a revert (cf. the author date of commit 2ae99744):

commit 5ee1c7d65c26b90b796362ac1b5715435e2a1384
Author:     Luca Coelho <luciano.coelho@intel.com>
AuthorDate: Mon May 18 16:32:00 2020 +0300
Commit:     Luca Coelho <luciano.coelho@intel.com>
CommitDate: Tue May 19 09:36:28 2020 +0300

    iwlwifi: update and add new FWs from core50-70 and core52-81 releases

    Build numbers: Core_build_core50-70
                   Core_build_core52-81
    Revision: 79ff3ccf (8000C, 8265)
              8902351f (9000, 9260)
              c31ac674 and d9698065 (cc, Qu)

    Change-Id: I0acc730509d5627b90cea3601158228bc4e94c40
    Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit 2ae99744efc14e5329a551251e22b33213224f11
Author:     Luca Coelho <luciano.coelho@intel.com>
AuthorDate: Fri Aug 23 07:35:14 2019 +0300
Commit:     Luca Coelho <luciano.coelho@intel.com>
CommitDate: Mon May 18 15:50:38 2020 +0300

    iwlwifi: update FWs to core47-142 release

    Build number: Core_build_core47-142
    Revision: 09bd31e1 (7265D, 3168)
              952d9faa (8000C, 8265)
              ceaaecdc (9000, 9260)
              3e391d3e (cc, Qu)

    Change-Id: I81f0730763759f809808f9c9ef275b71100579a0
    Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit 40e4162adfc91390f6fbbd8269f9439832af1dde
Author:     Luca Coelho <luciano.coelho@intel.com>
AuthorDate: Fri Aug 23 07:35:14 2019 +0300
Commit:     Luca Coelho <luciano.coelho@intel.com>
CommitDate: Fri Aug 23 07:35:14 2019 +0300

    iwlwifi: update FWs to core45-152 release

    Build number: Core_build_core45-152
    Revision: 77d01142 (8000, 8265)
              6bf1df06 (9000, 9260)
              4fa0041f (cc, Qu)

    Signed-off-by: Luca Coelho <luciano.coelho@intel.com>

commit b5f09bb4f816abace0227d0f4e749859364cef6b
Author:     Luca Coelho <luciano.coelho@intel.com>
AuthorDate: Sat Jul 20 10:36:53 2019 +0300
Commit:     Luca Coelho <luciano.coelho@intel.com>
CommitDate: Sat Jul 20 10:57:41 2019 +0300

    iwlwifi: update FWs for 3168, 7265D, 9000, 9260, 8000, 8265 and cc

    Build number: Core_build_core43-159
    Revision: 62a39462 (3168, 7265D)
              77d01142 (8000, 8265)
              177b3e46 (9000, 9260, cc)

    Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
hmh commented 3 years ago

Well, no idea about the commit flux, maybe some regression was detected and fixed in the latest one... if there is some sort of errata, specification updates or changelog for iwlwifi, I have no idea where it could be.

The latest iwlwifi release is a relevant security update: it fixes some of FragAttack (the full fix also requires updated kernel drivers and kernel wifi stack AFAIK). Whatever the reason (and I have my own opinion on this, off-topic here), several distros have not picked up on it yet. Debian is now aware, so we should issue updates for that firmware shortly. I believe Ubuntu has been alerted as well if they were not working on it yet.

@BachoSeven : you might want to alert your distro to update wifi firmware from the latest in linux-firmware git as a security fix for FragAttack. That said, all you need to do to use the newer iwlwifi firmware is to copy the newer file(s) from here: https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/ over the older one(s) in /lib/firmware (or /lib/firmware/ depending on your distro). Keep a copy of the old file or reinstall your distribution's firmware package to restore back to the distro firmware if it doesn't work.
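The copy-and-keep-a-backup step can be sketched as a small shell function. This is only an illustration under stated assumptions: replace_firmware and the .orig suffix are made-up names, the firmware directory has to be passed explicitly, and some distributions ship compressed firmware files, which this sketch does not handle.

```sh
# replace_firmware NEW_FILE FIRMWARE_DIR
# Copy NEW_FILE over the file of the same name in FIRMWARE_DIR, keeping the
# distribution's copy as NAME.orig. Illustrative helper, not an existing tool.
replace_firmware() {
    new=$1
    dir=$2
    name=$(basename "$new")
    # Back up only once, so a second run does not overwrite the distro copy.
    if [ -e "$dir/$name" ] && [ ! -e "$dir/$name.orig" ]; then
        cp -p "$dir/$name" "$dir/$name.orig" || return 1
    fi
    cp "$new" "$dir/$name"
}
```

Something like `replace_firmware ./iwlwifi-8000C-36.ucode /lib/firmware` would need root, and the new file is only picked up the next time the driver loads its firmware, for example after a reboot.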

Please do report back if the newer wifi firmware fixes the bad interaction with the new CPU microcode...

hmh commented 3 years ago

Any news on this? Does the newest iwlwifi firmware fix things? Or maybe disabling the Intel ME / AMT network connection over WiFi in BIOS/UEFI fixes it?

BachoSeven commented 3 years ago

@hmh I have been running the git version of linux-firmware for the last week and the error has not happened again. Even before that, though, I could not reproduce it consistently; it happened pretty much at random, so I can't say it's fixed.

As for your second suggestion, I have no idea how to try it (can a user even disable IME?) or whether it is possible in my BIOS; I didn't see any related settings there.

hmh commented 3 years ago

@BachoSeven thanks for the update!

Please drop a note here if you reproduce the issue again (which would mean the problem can still happen even with the latest iwlwifi firmware).
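Since the failure is intermittent, one low-effort way to notice a recurrence is to count the restart messages in the kernel log. A minimal sketch, assuming the message text shown in the logs above; count_sw_errors is an invented name.

```sh
# Count iwlwifi "Microcode SW error detected" lines in kernel log text on stdin.
# count_sw_errors is an illustrative helper name.
count_sw_errors() {
    grep -c 'iwlwifi .*Microcode SW error detected'
}
```

For example, `journalctl -k -b | count_sw_errors` would cover the current boot on a systemd machine. Note that grep exits with a non-zero status when the count is 0.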

BachoSeven commented 3 years ago

@hmh It happened again yesterday while using the linux-firmware-git package (available in the Arch User Repository, builds from source):

giu 29 20:52:24 hyperversum kernel: iwlwifi 0000:02:00.0: regular scan timed out
giu 29 20:52:24 hyperversum kernel: iwlwifi 0000:02:00.0: Microcode SW error detected.  Restarting 0x2000000.

giu 29 20:52:24 hyperversum kernel: iwlwifi 0000:02:00.0: Status: 0x00000040, count: 6
giu 29 20:52:24 hyperversum kernel: iwlwifi 0000:02:00.0: Loaded firmware version: 36.ca7b901d.0 8000C-36.ucode
giu 29 20:52:24 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000084 | NMI_INTERRUPT_UNKNOWN
giu 29 20:52:24 hyperversum kernel: iwlwifi 0000:02:00.0: 0x000002F0 | trm_hw_status0
giu 29 20:52:24 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | trm_hw_status1
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x0002438C | branchlink2
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00039C22 | interruptlink1
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00018438 | interruptlink2
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | data1
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000080 | data2
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x07830000 | data3
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x9EC0162E | beacon time
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x791B121E | tsf low
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000001 | tsf hi
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | time gp1
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x791C5ABA | time gp2
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000001 | uCode revision type
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000024 | uCode version major
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0xCA7B901D | uCode version minor
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000201 | hw version
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00489004 | board version
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x80A9FE01 | hcmd
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00022080 | isr0
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00800000 | isr1
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x08001802 | isr2
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00400080 | isr3
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | isr4
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x80A8FB03 | last cmd Id
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | wait_event
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00008E4C | l2p_control
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | l2p_duration
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | l2p_mhvalid
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | l2p_addr_match
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x0000008F | lmpm_pmg_sel
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x10032207 | timestamp
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00340818 | flow_handler
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: Start IWL Error Log Dump:
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: Status: 0x00000040, count: 7
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000070 | NMI_INTERRUPT_LMAC_FATAL
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | umac branchlink1
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0xC0086B3C | umac branchlink2
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0xC0083D08 | umac interruptlink1
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0xC0083D08 | umac interruptlink2
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000800 | umac data1
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0xC0083D08 | umac data2
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0xDEADBEEF | umac data3
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000024 | umac major
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0xCA7B901D | umac minor
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0xC088628C | frame pointer
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0xC088628C | stack pointer
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x0062019C | last host cmd
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | isr status reg
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: IML/ROM dump:
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000000 | IML/ROM error/state
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x00000003 | IML/ROM data1
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: Fseq Registers:
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x2CA44104 | FSEQ_ERROR_CODE
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0xA8528020 | FSEQ_TOP_INIT_VERSION
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0xE214C062 | FSEQ_CNVIO_INIT_VERSION
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x0000A056 | FSEQ_OTP_VERSION
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0xE3B1BC89 | FSEQ_TOP_CONTENT_VERSION
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0xA3629711 | FSEQ_ALIVE_TOKEN
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0xF4F156E1 | FSEQ_CNVI_ID
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0xAD4587B0 | FSEQ_CNVR_ID
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x03000000 | CNVI_AUX_MISC_CHIP
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x0BADCAFE | CNVR_AUX_MISC_CHIP
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x0BADCAFE | CNVR_SCU_SD_REGS_SD_REG_DIG_DCDC_VTRIM
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: 0x0BADCAFE | CNVR_SCU_SD_REGS_SD_REG_ACTIVE_VDIG_MIRROR
giu 29 20:52:25 hyperversum kernel: iwlwifi 0000:02:00.0: FW error in SYNC CMD STATISTICS_CMD
hmh commented 3 years ago

@BachoSeven : thanks for the update!

It is a pity that the newest iwlwifi firmware available ATM in linux-firmware did not avoid the microcode+iwlwifi regression: it would have been straightforward to work around the issue by recommending that everyone also update the iwlwifi firmware...

stephand commented 3 years ago

I have the same issue on kernel 5.12.15 with an Intel Corporation Centrino Advanced-N 6205. I tried both firmwares:

18.168.6.1 6000g2a-6.ucode (SHA1 sum: 1936ad5fe2551ac9d6551be0d85984c1f5cc5cf7) repeatedly restarts with the error below, and the older 17.168.5.3 build 42301 6000g2a-5.ucode (SHA1 sum: 7cf41d55e6e7185d58e33e2d0828a3e075d7329e) simply dies.

iwconfig:

wlp3s0    IEEE 802.11  ESSID:"snip"
          Mode:Managed  Frequency:5.22 GHz  Access Point: 48:D3:snip
          Bit Rate=54 Mb/s   Tx-Power=15 dBm
          Retry short limit:7   RTS thr:off   Fragment thr:off
          Encryption key:off
          Power Management:on
          Link Quality=47/70  Signal level=-63 dBm
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:1  Invalid misc:85   Missed beacon:0

Full error message:

Jul 11 13:56:37 d-allen kernel: [ 2691.368132] iwlwifi 0000:03:00.0: Microcode SW error detected. Restarting 0x2000000.
Jul 11 13:56:37 d-allen kernel: [ 2691.368141] iwlwifi 0000:03:00.0: Loaded firmware version: 18.168.6.1 6000g2a-6.ucode
Jul 11 13:56:37 d-allen kernel: [ 2691.368262] iwlwifi 0000:03:00.0: Start IWL Error Log Dump:
Jul 11 13:56:37 d-allen kernel: [ 2691.368264] iwlwifi 0000:03:00.0: Status: 0x0000004C, count: 6
Jul 11 13:56:37 d-allen kernel: [ 2691.368266] iwlwifi 0000:03:00.0: 0x00000034 | NMI_INTERRUPT_WDG
Jul 11 13:56:37 d-allen kernel: [ 2691.368267] iwlwifi 0000:03:00.0: 0x00010050 | uPc
Jul 11 13:56:37 d-allen kernel: [ 2691.368269] iwlwifi 0000:03:00.0: 0x0001001E | branchlink1
Jul 11 13:56:37 d-allen kernel: [ 2691.368270] iwlwifi 0000:03:00.0: 0x00010126 | branchlink2
Jul 11 13:56:37 d-allen kernel: [ 2691.368272] iwlwifi 0000:03:00.0: 0x0000D6BE | interruptlink1
Jul 11 13:56:37 d-allen kernel: [ 2691.368273] iwlwifi 0000:03:00.0: 0x00024B6C | interruptlink2
Jul 11 13:56:37 d-allen kernel: [ 2691.368275] iwlwifi 0000:03:00.0: 0x00000002 | data1
Jul 11 13:56:37 d-allen kernel: [ 2691.368276] iwlwifi 0000:03:00.0: 0x07030000 | data2
Jul 11 13:56:37 d-allen kernel: [ 2691.368277] iwlwifi 0000:03:00.0: 0x0000E774 | line
Jul 11 13:56:37 d-allen kernel: [ 2691.368279] iwlwifi 0000:03:00.0: 0xA30012B6 | beacon time
Jul 11 13:56:37 d-allen kernel: [ 2691.368280] iwlwifi 0000:03:00.0: 0x5153ED49 | tsf low
Jul 11 13:56:37 d-allen kernel: [ 2691.368282] iwlwifi 0000:03:00.0: 0x00000005 | tsf hi
Jul 11 13:56:37 d-allen kernel: [ 2691.368283] iwlwifi 0000:03:00.0: 0x00000000 | time gp1
Jul 11 13:56:37 d-allen kernel: [ 2691.368284] iwlwifi 0000:03:00.0: 0x04616560 | time gp2
Jul 11 13:56:37 d-allen kernel: [ 2691.368286] iwlwifi 0000:03:00.0: 0x00000000 | time gp3
Jul 11 13:56:37 d-allen kernel: [ 2691.368287] iwlwifi 0000:03:00.0: 0x754312A8 | uCode version
Jul 11 13:56:37 d-allen kernel: [ 2691.368288] iwlwifi 0000:03:00.0: 0x000000B0 | hw version
Jul 11 13:56:37 d-allen kernel: [ 2691.368290] iwlwifi 0000:03:00.0: 0x00488700 | board version
Jul 11 13:56:37 d-allen kernel: [ 2691.368291] iwlwifi 0000:03:00.0: 0x0BC0001C | hcmd
Jul 11 13:56:37 d-allen kernel: [ 2691.368293] iwlwifi 0000:03:00.0: 0xA7E63002 | isr0
Jul 11 13:56:37 d-allen kernel: [ 2691.368294] iwlwifi 0000:03:00.0: 0x1141E000 | isr1
Jul 11 13:56:37 d-allen kernel: [ 2691.368295] iwlwifi 0000:03:00.0: 0x00000F1F | isr2
Jul 11 13:56:37 d-allen kernel: [ 2691.368297] iwlwifi 0000:03:00.0: 0x8143FCC0 | isr3
Jul 11 13:56:37 d-allen kernel: [ 2691.368298] iwlwifi 0000:03:00.0: 0x00000000 | isr4
Jul 11 13:56:37 d-allen kernel: [ 2691.368299] iwlwifi 0000:03:00.0: 0x10804112 | isr_pref
Jul 11 13:56:37 d-allen kernel: [ 2691.368301] iwlwifi 0000:03:00.0: 0x0000E774 | wait_event
Jul 11 13:56:37 d-allen kernel: [ 2691.368302] iwlwifi 0000:03:00.0: 0x000000B4 | l2p_control
Jul 11 13:56:37 d-allen kernel: [ 2691.368304] iwlwifi 0000:03:00.0: 0x000000AC | l2p_duration
Jul 11 13:56:37 d-allen kernel: [ 2691.368305] iwlwifi 0000:03:00.0: 0x0000000F | l2p_mhvalid
Jul 11 13:56:37 d-allen kernel: [ 2691.368306] iwlwifi 0000:03:00.0: 0x001050C6 | l2p_addr_match
Jul 11 13:56:37 d-allen kernel: [ 2691.368308] iwlwifi 0000:03:00.0: 0x00000005 | lmpm_pmg_sel
Jul 11 13:56:37 d-allen kernel: [ 2691.368309] iwlwifi 0000:03:00.0: 0x06061222 | timestamp
Jul 11 13:56:37 d-allen kernel: [ 2691.368310] iwlwifi 0000:03:00.0: 0x00000010 | flow_handler
Jul 11 13:56:37 d-allen kernel: [ 2691.368387] iwlwifi 0000:03:00.0: Start IWL Event Log Dump: display last 1 entries
Jul 11 13:56:37 d-allen kernel: [ 2691.368405] iwlwifi 0000:03:00.0: EVT_LOGT:0073491800:0x0000010c:0123
Jul 11 13:56:37 d-allen kernel: [ 2691.392653] iwlwifi 0000:03:00.0: Radio type=0x1-0x2-0x0
Jul 11 13:56:37 d-allen kernel: [ 2691.693183] iwlwifi 0000:03:00.0: Radio type=0x1-0x2-0x0

stephand commented 3 years ago

Downgrading my kernel back to Ubuntu's standard 5.8.0-59 makes this disappear on the new firmware for me, so my piece does not seem to be a firmware issue; I'll bisect kernel versions and open a report on kernel.org

stephand commented 3 years ago

Spoke too soon, seems like the reboot was what helped and a suspend-to-ram cycle made the problem appear again on 5.8.0. I will see if any of the tricks (iwlmvm power_scheme=1, lower ucode) help prevent the problem after a reboot and subsequent s2ram cycles.

esyr-rh commented 3 years ago

I wonder if it is possible to narrow the issue down somehow. May I ask those who are experiencing the issue to provide their DMI information? (It may be that the issue is related to a specific vendor's system firmware PM implementation, or something like that.) It can be obtained with the following command:

grep '.*' /sys/devices/virtual/dmi/id/*_* | column -t -s:

Thank you.
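For those who would rather not post serial numbers or the product UUID, a variant of the request above that skips those fields could look like the sketch below. The function name dmi_summary is made up, and the optional directory argument exists only so the function can be tried against a copy of the files.

```sh
# dmi_summary [DIR]
# Print "field<TAB>value" for each DMI id file, skipping serials and the UUID.
# DIR defaults to the sysfs location used in the command above.
# dmi_summary is an illustrative helper name.
dmi_summary() {
    dir=${1:-/sys/devices/virtual/dmi/id}
    for f in "$dir"/*_*; do
        # Skip anything that is not a readable regular file
        # (some fields are readable by root only).
        [ -f "$f" ] && [ -r "$f" ] || continue
        case $f in
            *_serial|*_uuid) continue ;;
        esac
        printf '%s\t%s\n' "${f##*/}" "$(cat "$f")"
    done
}
```

Running `dmi_summary` without arguments as a regular user should already be enough for the vendor, product and BIOS version fields.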

stephand commented 2 years ago

My DMI info:

$ sudo grep '.*' /sys/devices/virtual/dmi/id/*_* | column -t -s:
[sudo] password for syon: 
/sys/devices/virtual/dmi/id/bios_date            09/24/2019
/sys/devices/virtual/dmi/id/bios_release         2.77
/sys/devices/virtual/dmi/id/bios_vendor          LENOVO
/sys/devices/virtual/dmi/id/bios_version         G2ETB7WW (2.77 )
/sys/devices/virtual/dmi/id/board_asset_tag      Not Available
/sys/devices/virtual/dmi/id/board_name           2324FV6
/sys/devices/virtual/dmi/id/board_serial         <snip>
/sys/devices/virtual/dmi/id/board_vendor         LENOVO
/sys/devices/virtual/dmi/id/board_version        Not Defined
/sys/devices/virtual/dmi/id/chassis_asset_tag    No Asset Information
/sys/devices/virtual/dmi/id/chassis_serial       R9WG2M0
/sys/devices/virtual/dmi/id/chassis_type         10
/sys/devices/virtual/dmi/id/chassis_vendor       LENOVO
/sys/devices/virtual/dmi/id/chassis_version      Not Available
/sys/devices/virtual/dmi/id/ec_firmware_release  1.10
/sys/devices/virtual/dmi/id/product_family       ThinkPad X230
/sys/devices/virtual/dmi/id/product_name         2324FV6
/sys/devices/virtual/dmi/id/product_serial       <snip>
/sys/devices/virtual/dmi/id/product_sku          LENOVO_MT_2324
/sys/devices/virtual/dmi/id/product_uuid         <snip>
/sys/devices/virtual/dmi/id/product_version      ThinkPad X230
/sys/devices/virtual/dmi/id/sys_vendor           LENOVO
stephand commented 2 years ago

I have found that the issue disappears for me when I do:

$ cat /etc/modprobe.d/iwlwifi.conf
#....
options iwlwifi 11n_disable=1

via a comment in https://askubuntu.com/questions/675352/wireless-disconnects-intermittently-with-intel-corporation-centrino-advanced-n-6

BachoSeven commented 2 years ago

grep '.*' /sys/devices/virtual/dmi/id/*_* | column -t -s:

Here you go:

/sys/devices/virtual/dmi/id/bios_date          04/18/2019
/sys/devices/virtual/dmi/id/bios_release       5.12
/sys/devices/virtual/dmi/id/bios_vendor        American Megatrends Inc.
/sys/devices/virtual/dmi/id/bios_version       UX310UQK.311
/sys/devices/virtual/dmi/id/board_asset_tag    ATN12345678901234567
/sys/devices/virtual/dmi/id/board_name         UX310UQK
/sys/devices/virtual/dmi/id/board_serial       N0CV1715MB0024881
/sys/devices/virtual/dmi/id/board_vendor       ASUSTeK COMPUTER INC.
/sys/devices/virtual/dmi/id/board_version      1.0       
/sys/devices/virtual/dmi/id/chassis_asset_tag  No Asset Tag
/sys/devices/virtual/dmi/id/chassis_serial     H4N0CV048885155
/sys/devices/virtual/dmi/id/chassis_type       10
/sys/devices/virtual/dmi/id/chassis_vendor     ASUSTeK COMPUTER INC.
/sys/devices/virtual/dmi/id/chassis_version    1.0       
/sys/devices/virtual/dmi/id/product_family     ZenBook
/sys/devices/virtual/dmi/id/product_name       UX310UQK
/sys/devices/virtual/dmi/id/product_serial     H4N0CV048885155
/sys/devices/virtual/dmi/id/product_sku        
/sys/devices/virtual/dmi/id/product_uuid       5a8b4aa0-db09-9248-8c8a-c15f8bf1fc30
/sys/devices/virtual/dmi/id/product_version    1.0       
/sys/devices/virtual/dmi/id/sys_vendor         ASUSTeK COMPUTER INC.
BachoSeven commented 2 years ago

I have found that the issue disappears for me when I do:

$ cat /etc/modprobe.d/iwlwifi.conf
#....
options iwlwifi 11n_disable=1

via a comment in https://askubuntu.com/questions/675352/wireless-disconnects-intermittently-with-intel-corporation-centrino-advanced-n-6

Interesting. Personally, I have options iwlwifi 11n_disable=8; I remember deciding on this value after reading about the various options in the Arch Wiki, but I no longer remember why.
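On the difference between 11n_disable=1 and 11n_disable=8: as far as I understand the parameter description that modinfo prints for iwlwifi, the value is a bitmap (1: full, 2: disable agg TX, 4: disable agg RX, 8: enable agg TX). The sketch below decodes a value under that assumption; decode_11n_disable is an invented name, and the meanings are worth checking against `modinfo -p iwlwifi` on your own kernel.

```sh
# decode_11n_disable VALUE
# Print one line per bit set in an iwlwifi 11n_disable value.
# Bit meanings are an assumption taken from the module parameter description;
# decode_11n_disable is an illustrative helper name.
decode_11n_disable() {
    v=$1
    if [ $(( v & 1 )) -ne 0 ]; then echo "full disable"; fi
    if [ $(( v & 2 )) -ne 0 ]; then echo "disable agg TX"; fi
    if [ $(( v & 4 )) -ne 0 ]; then echo "disable agg RX"; fi
    if [ $(( v & 8 )) -ne 0 ]; then echo "enable agg TX"; fi
    return 0
}
```

Under that reading, the two settings mentioned above are quite different: 1 turns 11n off, while 8 leaves it on and turns TX aggregation on.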

hmh commented 2 years ago

@tu-maurice, could you please hide your last three comments since it is a different issue? Just so someone doesn't get confused if they read it fast and don't notice your last comment...

tu-maurice commented 2 years ago

@hmh I wasn't entirely sure whether maybe the others could've confirmed my observations, but okay. I was never here.

hmh commented 2 years ago

@tu-maurice: if you can link your issue to a microcode update (hint: boot Linux with the dis_ucode_ldr parameter in the kernel command line/grub -- if the issue disappears, the chances are very high that the microcode update is the culprit), you are very welcome to open a new bug with the correct processor signature and full details...
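When running that test, it helps to confirm that the parameter actually reached the kernel before drawing conclusions. A small sketch that checks a kernel command line for a bare parameter; has_kernel_param is a made-up helper name.

```sh
# has_kernel_param NAME
# Succeed if NAME appears as a whole word in the kernel command line on stdin.
# has_kernel_param is an illustrative helper name.
has_kernel_param() {
    tr -s ' ' '\n' | grep -q -x -F -- "$1"
}

# Report on the running kernel; skipped where /proc/cmdline is not readable.
if [ -r /proc/cmdline ]; then
    if has_kernel_param dis_ucode_ldr < /proc/cmdline; then
        echo "kernel microcode loader disabled for this boot"
    else
        echo "dis_ucode_ldr not present on the kernel command line"
    fi
fi
```

With the loader disabled, the CPU keeps whatever microcode the system firmware applied, so the revision shown in /proc/cpuinfo is worth noting in the report as well.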

whpenner commented 2 years ago

Please look at the following site - there is some data collection that can help us debug this issue. https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/debugging

admnd commented 2 years ago

For what it worth, I had this exact crash but with an on-board AX 210 WiFi NIC (iwlwifi loads the microcode ty-a0-gf-a0-71.ucode).

In my case it seems to trace back to a power management issue, as setting iwlmvm.power_scheme=1 (and only this, nothing disabled or overridden in iwlwifi) makes the issue disappear. My WiFi connection has now been stable for more than 8 hours without a glitch of any kind.

AFAIK, setting something for 11n_disable will make the NIC ignore the 5 GHz band and every speed benefit it brings. Some people have also reported more stability after disabling the 40 MHz channel width (leaving only 20 MHz channels).

This is on kernel 5.18.9 with Gentoo patches applied, but that should change nothing; a vanilla kernel should exhibit the very same behavior.
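To make the setting persistent, a line such as `options iwlmvm power_scheme=1` in a file under /etc/modprobe.d/ should be equivalent to passing iwlmvm.power_scheme=1 on the kernel command line. As far as I know, the values are 1 (active), 2 (balanced, the default) and 3 (low power), but check `modinfo -p iwlmvm` to be sure. A small sketch for confirming what the loaded module actually uses, assuming the parameter is readable through sysfs; the helper name and the optional sysfs root argument are made up for illustration:

```shell
# Print the current value of a kernel module parameter as seen in sysfs.
# The third argument overrides the sysfs root and defaults to /sys/module.
show_module_param() {
    module=$1
    param=$2
    sysfs_root=${3:-/sys/module}
    param_file="$sysfs_root/$module/parameters/$param"
    if [ -r "$param_file" ]; then
        printf '%s.%s=%s\n' "$module" "$param" "$(cat "$param_file")"
    else
        # Module not loaded, or the parameter is not exported to sysfs
        printf '%s.%s is not exposed\n' "$module" "$param"
    fi
}

show_module_param iwlmvm power_scheme
show_module_param iwlwifi 11n_disable
```

If iwlmvm is reported as not exposed, the card is probably driven by a different op mode module (iwldvm on older hardware), in which case power_scheme does not apply.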

marius-nicolae commented 1 year ago

What is iwlmvm? On my Ubuntu 22.04 I have an iwldvm kernel module, which depends on iwlwifi, but it doesn't have any power_scheme param:

$ modinfo iwldvm
filename:       /lib/modules/5.15.0-57-generic/kernel/drivers/net/wireless/intel/iwlwifi/dvm/iwldvm.ko
license:        GPL
author:         Intel Corporation <linuxwifi@intel.com>
description:    Intel(R) Wireless WiFi Link AGN driver for Linux
srcversion:     664A3011326E0E0017CBCC4
depends:        mac80211,iwlwifi,cfg80211
retpoline:      Y
intree:         Y
name:           iwldvm
vermagic:       5.15.0-57-generic SMP mod_unload modversions 
sig_id:         PKCS#7
signer:         Build time autogenerated kernel key
sig_key:        36:8C:0C:1D:21:E1:5C:15:CA:0E:49:32:76:39:CA:DA:7D:EC:2A:58
sig_hashalgo:   sha512
signature:      09:4E:E8:2A:52:79:12:10:72:76:7A:1C:41:8C:D8:BA:E2:0C:C2:B5:
                15:0B:CC:39:9E:78:BD:C6:BE:10:50:0B:88:8E:F9:E5:B5:D3:48:19:
                92:89:53:6F:54:1B:0D:E7:B5:6C:74:5F:B4:62:EC:6F:9E:A2:6C:E7:
                DA:01:49:58:15:10:C6:42:D0:19:DE:E3:10:86:89:5C:F1:B3:16:2C:
                85:09:31:49:E3:C9:16:39:68:0A:78:35:50:71:D7:FD:A9:A3:22:63:
                C6:96:AC:D7:AE:B7:D5:25:D3:96:64:1C:04:61:75:CF:DD:65:38:61:
                8B:BC:00:B5:1D:44:11:B9:00:91:F5:78:37:9D:66:5A:1D:32:52:08:
                84:5F:50:9C:51:41:BD:96:6C:0B:37:17:5C:D2:71:70:C0:09:12:9C:
                17:40:4D:ED:22:4D:07:B8:5F:6B:5E:77:81:71:6B:3F:2A:0F:FB:F3:
                93:5D:7F:BD:57:EA:EE:C6:7B:B1:2A:FD:33:BF:C1:34:2B:D2:E3:B4:
                C5:CE:FE:92:DE:F4:CB:30:C4:CA:6B:2E:1A:59:60:2E:44:CA:73:AD:
                95:AF:15:8D:57:9A:F5:5C:9F:5E:A9:8D:91:77:D3:AE:1A:7C:C5:AB:
                11:E3:DF:C1:33:49:00:E9:55:DB:26:6A:12:53:AC:BC:08:B0:47:99:
                CB:40:3F:81:24:6B:91:A5:C9:70:34:FD:B2:3B:B3:58:B2:3F:1E:85:
                12:DE:31:C7:B0:2C:7A:88:0C:66:37:5D:1D:7D:DC:57:D6:40:D0:1C:
                EA:A8:3B:10:08:3C:93:18:D2:EC:FD:FF:2E:FB:6B:0F:DF:E1:D4:BF:
                2A:BC:D7:66:1A:80:49:88:56:F7:70:A5:0F:C1:46:31:CC:D9:65:37:
                C4:DF:9D:7F:ED:CD:E9:32:66:0E:55:93:7D:80:54:4C:D7:03:8E:D8:
                74:B6:7B:51:61:96:8D:A4:A3:42:EE:7D:1A:02:F7:F1:C2:29:AF:27:
                EC:D0:77:30:4E:99:97:12:E7:AE:2B:52:0A:5F:C0:E5:06:08:EB:81:
                C1:79:AA:82:3F:D4:E4:1A:25:5A:F0:37:8F:36:07:21:C8:27:B4:03:
                9B:2B:BB:57:EC:01:2A:BA:E0:28:2A:73:4E:E0:3A:E1:C9:0B:6F:3C:
                E8:77:97:CD:34:DC:21:DC:55:A5:AE:DF:5F:76:0C:DC:A2:03:5C:3F:
                9B:0D:69:10:62:DC:51:7E:06:45:10:F1:4D:19:72:55:85:36:FB:66:
                A0:BD:0A:5D:DA:3D:50:C0:A3:CB:34:6A:D3:28:2A:E0:C8:D6:EA:D4:
                55:F3:2E:9B:2F:52:69:19:A3:1C:CB:66
parm:           force_cam:force continuously aware mode (no power saving at all) (bool)

marius-nicolae commented 1 year ago

Ah, I see now: there is also an iwlmvm kernel module, which does indeed have a power_scheme param. Please ignore my previous comment!

marius-nicolae commented 6 months ago

I just wanted to share that WiFi has been rock solid for two days now with the previous firmware version on my Ubuntu 22.04:

iwlwifi 0000:04:00.0: loaded firmware version 17.168.5.3 build 42301 6000g2a-5.ucode op_mode iwldvm

I removed the /lib/firmware/iwlwifi-6000g2a-6.ucode file and used dpkg-divert to prevent it from being installed again on future updates. The driver is now forced to use the previous firmware file, /lib/firmware/iwlwifi-6000g2a-5.ucode.
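In case it helps someone reproduce this, here is one way the diversion could look. This is a sketch rather than the exact command I ran: the `.disabled` suffix and the helper name are arbitrary choices, and the helper only prints the command so it can be reviewed before running it as root.

```shell
# Build (but do not run) a dpkg-divert command that moves a firmware file
# aside and keeps future package upgrades from putting it back in place.
print_divert_command() {
    fw_file=$1
    printf 'dpkg-divert --local --rename --divert %s.disabled --add %s\n' \
        "$fw_file" "$fw_file"
}

print_divert_command /lib/firmware/iwlwifi-6000g2a-6.ucode
```

With --rename, dpkg-divert moves the existing file to the diverted name itself, so deleting it by hand beforehand should not be necessary. The driver has to be reloaded (or the machine rebooted) before it falls back to the older firmware file, and `dpkg-divert --list` shows the diversions currently in place.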