morrownr / USB-WiFi

USB WiFi Adapter Information for Linux

Firmware load failed for Bluetooth on M.2 Mediatek Wifi card #361

Closed silverfs closed 5 months ago

silverfs commented 5 months ago

I own an AMD Framework 13 laptop. Bluetooth stopped working about 1.5 to 2 weeks ago, and this weekend was the first chance I had to look at it. I run EndeavourOS with the 6.6.9-arch1-1 (64-bit) kernel. The laptop has a built-in "AMD RZ616 Wi-Fi 6E" card, as described on Framework's website. I've heard that this is a good place to ask about issues like this.

The very start of this issue can be found here: https://community.frame.work/t/direct-firmware-load-failed-for-bluetooth-on-default-m-2-mediatek-wifi-card/43123/1

You can check out that link as it has all the commands and steps that I have already tried, so I'll keep it short here.

At boot, my system displays the message:

bluetooth hci0: Direct firmware load for mediatek/BT_RAM_CODE_MT7922_1_1_hdr.bin failed with error -2
kernel: Bluetooth: hci0: Failed to load firmware file (-2)
kernel: Bluetooth: hci0: Failed to set up firmware (-2)
kernel: Bluetooth: hci0: HCI Enhanced Setup Synchronous Connection command is advertised, but not supported.

Bluetooth doesn't appear to see any drivers, which may make sense because it couldn't load the firmware. As seen in the link above, I've also tried downloading some .bin files to /lib/firmware/mediatek, but to no avail.

I found a similar message here, but I don't think it's necessarily related.

I don't really know what to do next. Does anyone have any suggestions? Any help is greatly appreciated. (If you think I should add more information or you have any comments about my issue, do tell me).
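For readers following along: error -2 in the log above is ENOENT, meaning the kernel did not find the file where it looked. A minimal sketch for checking which variants of a firmware name exist on disk is below. The helper name fw_candidates is hypothetical, and the assumption that the kernel also accepts .zst and .xz variants depends on how the running kernel was built.

```shell
# List which on-disk variants exist for a firmware name.
# Assumption: the kernel may accept NAME, NAME.zst or NAME.xz, depending on
# its build options. fw_candidates is a hypothetical helper, not a real tool.
fw_candidates() {
    fw_dir=$1    # firmware root, e.g. /lib/firmware
    fw_name=$2   # name from the log, e.g. mediatek/BT_RAM_CODE_MT7922_1_1_hdr.bin
    for suffix in "" .zst .xz; do
        if [ -e "$fw_dir/$fw_name$suffix" ]; then
            printf '%s\n' "$fw_name$suffix"
        fi
    done
}
```

For this issue the call would be `fw_candidates /lib/firmware mediatek/BT_RAM_CODE_MT7922_1_1_hdr.bin`. Empty output means no variant is present in that directory. The check only looks at the directory it is given; it says nothing about what an initramfs image contains.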

bjlockie commented 5 months ago

If you can boot from USB try a different distro and see if that works.

morrownr commented 5 months ago

@bjlockie

I think I have now seen this, or almost this, 3-4 times in the last few weeks. If memory serves me correctly, all of the reports came from users running Arch. It appears the Arch devs have decided to compress the firmware files. I realize that support for compression of drivers is available, but I was not aware that this compression extended to firmware files. In theory it should work, but theories do not always pass the test. Testing with a bootable USB drive would help confirm whether the problem is specific to the reporter's system.

@silverfs

I noticed the following comment from you:

Currently, I have both BT_RAM_CODE_MT7922_1_1_hdr.bin and BT_RAM_CODE_MT7922_1_1_hdr.bin.zst as files in that directory.

That is interesting. I have not tested this but it feels wrong. Since the kernel can read compressed and uncompressed files, having both is like having 2 identical files as far as the kernel is concerned or so my theory goes.

Recommend you delete both files and go to my firmware guide and install the latest for the mt7922:

https://github.com/morrownr/USB-WiFi/blob/main/home/How_to_Install_Firmware_for_Mediatek_based_USB_WiFi_adapters.md

You are looking for section 2. MT7922

Let us know what happens.
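To spot the situation described above, where a firmware file exists both as a plain file and as a .zst file, something like the following sketch could be used. The helper name fw_duplicates is hypothetical, and whether such duplicates actually cause a problem is untested, as noted above.

```shell
# Print the base name of every firmware file that exists both as NAME and
# as NAME.zst in one directory. fw_duplicates is a hypothetical helper.
fw_duplicates() {
    fw_dir=$1    # e.g. /lib/firmware/mediatek
    for compressed in "$fw_dir"/*.zst; do
        [ -e "$compressed" ] || continue    # the glob matched nothing
        plain=${compressed%.zst}
        if [ -e "$plain" ]; then
            printf '%s\n' "${plain##*/}"
        fi
    done
}
```

Running `fw_duplicates /lib/firmware/mediatek` prints nothing when no file is present in both forms.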

silverfs commented 5 months ago

Thank you for the quick and kind response :)

I have removed previously downloaded firmware files from the 7961 variant and now correctly used the MT7922 firmware from section 2. I first tried it with just that, but it did not work after a reboot. I then tried to remove .zst variants of the same files from the previously downloaded firmware (MT7961), as well as the .zst's from the MT7922 ones from section two as well (I do not know why they were there. Could be that I should not have removed them, but it's too late now). After another system reboot, it did not work out either.

I find it quite weird: after running journalctl again, I can see that I now have the exact file in /lib/firmware/mediatek that the kernel is asking for, but it still does not pick that file up.

> journalctl --dmesg --boot=-0 --grep blue
jan 08 16:43:48 lychee kernel: Bluetooth: Core ver 2.22
jan 08 16:43:48 lychee kernel: NET: Registered PF_BLUETOOTH protocol family
jan 08 16:43:48 lychee kernel: Bluetooth: HCI device and connection manager initialized
jan 08 16:43:48 lychee kernel: Bluetooth: HCI socket layer initialized
jan 08 16:43:48 lychee kernel: Bluetooth: L2CAP socket layer initialized
jan 08 16:43:48 lychee kernel: Bluetooth: SCO socket layer initialized
jan 08 16:43:48 lychee kernel: bluetooth hci0: Direct firmware load for mediatek/BT_RAM_CODE_MT7922_1_1_hdr.bin failed with error -2
jan 08 16:43:48 lychee kernel: Bluetooth: hci0: Failed to load firmware file (-2)
jan 08 16:43:48 lychee kernel: Bluetooth: hci0: Failed to set up firmware (-2)
jan 08 16:43:48 lychee kernel: Bluetooth: hci0: HCI Enhanced Setup Synchronous Connection command is advertised, but not supported.
jan 08 16:43:48 lychee kernel: Bluetooth: BNEP (Ethernet Emulation) ver 1.3
jan 08 16:43:48 lychee kernel: Bluetooth: BNEP filters: protocol multicast
jan 08 16:43:48 lychee kernel: Bluetooth: BNEP socket layer initialized

Very weird. Please let me know if there's anything more I can do on my side.
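When a log like the one above gets long, the file names the kernel failed to find can be pulled out with a small filter. This is only a sketch: fw_missing_from_log is a hypothetical helper, and it assumes the message wording shown in the log above.

```shell
# Read kernel log lines on stdin and print each firmware name that failed
# to load with error -2 (file not found), one per line, without repeats.
# Assumes the "Direct firmware load for NAME failed with error -2" wording.
fw_missing_from_log() {
    sed -n 's/.*Direct firmware load for \(.*\) failed with error -2$/\1/p' | sort -u
}
```

A possible invocation is `journalctl --dmesg --boot=-0 | fw_missing_from_log`. Each name printed is relative to the firmware directory, so it can be compared directly with the contents of /lib/firmware.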

morrownr commented 5 months ago

@silverfs

I'm on my main dev box right now. It happens to have a mt7922 based PCIe card. It uses the same driver and firmware that your little card does. I just tested with kernels 6.1 and 6.5 as that is what I have installed currently. I have an internet FM radio station sending music to my dev box as it has better speakers. Bluetooth is working well. My distro is Debian 12.

Suggestions to help find the problem:

@morrownr

silverfs commented 5 months ago

@morrownr

I don't think I have a useful way of going back to the 6.5 kernel. Unfortunately, I have no backups or snapshots. I did do some other stuff:

Is there maybe something I can do from the live-usb? And maybe doing something while I'm still on this kernel, or is it alright to shift back? Edit: I'll shift back for now.

morrownr commented 5 months ago

I booted endeavouros 2023/11 from a live-usb and bluetooth seemed to correctly load from there.

Kinda makes you want to scratch your head. You might want to post this issue at the forum of your distro as someone could have some knowledge that we are not aware of. I'll keep an eye out for a solution. Like I said, I'm running the same mt7922 chipset with the same firmware you downloaded so the only difference is that you are on kernel 6.6 and I am on kernels 6.1 and 6.5. Sometimes things happen but it is usually fixed within a reasonable amount of time.

On Debian and Ubuntu based systems, it is easy to reboot to a different kernel. I'm not familiar with your distro but you might investigate to see if there is a better way than you are aware of. I don't know that the kernel is the problem but it happens at times and I can't think of anything else right now.

@morrownr

fhteagle commented 5 months ago

@silverfs

Fellow EndeavourOS / Arch on Framework user (D.H on their community) here.

Have you tried installing an arch provided linux-lts kernel?

The firmware files come from the linux-firmware package IIRC. I might be able to dig up an archived version of that if you really wanted to try it.

Check permissions for files and folder in /lib/firmware/mediatek ? Should be available to root, but I'd check getfacl for any weird permissions also.

silverfs commented 5 months ago

@fhteagle

Thanks for your message. I have not tried that, and it would be very nice if you could help me with that, although I hope I won't lose a lot of stuff while doing that. I have checked the permissions, and the directory is available to root as you said:

> getfacl mediatek/
# file: mediatek/
# owner: root
# group: root
user::rwx
group::r-x
other::r-x

Following suggestions from the Framework forum, I'm currently trying out some config combinations with bluez and dbus-daemon-units vs dbus-broker-units, but I haven't noticed any differences yet. I'll keep you updated.

fhteagle commented 5 months ago

@silverfs -

Never fails: the linux-lts kernel package in the Arch core repositories was just brought up to 6.6.10, which is the same as the main kernel ("linux" package) version. So it's not worth installing linux-lts after all, as there's no real difference.

I highly doubt the dbus stuff is going to make a difference if the firmware is not getting loaded / activated onto the card correctly. Does the BT side of the card even show up in lsusb -tv ?

What are the dates on your firmware files in /lib/firmware/mediatek ? Here's what I have with linux-firmware package 20231211.f2e52a1c-1

> ls -al *7961*
-rw-r--r-- 1 root root 379496 Dec 11 07:23 BT_RAM_CODE_MT7961_1_2_hdr.bin.zst
-rw-r--r-- 1 root root  47903 Dec 11 07:23 WIFI_MT7961_patch_mcu_1_2_hdr.bin.zst
-rw-r--r-- 1 root root 436064 Dec 11 07:23 WIFI_RAM_CODE_MT7961_1.bin.zst

Try sudo pacman -S linux-firmware if your dates do not match the above.

silverfs commented 5 months ago

@fhteagle

Ah I see.

> lsusb -tv
/:  Bus 001.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/5p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
    |__ Port 001: Dev 002, If 0, Class=Billboard, Driver=[none], 12M
        ID 32ac:0002
    |__ Port 001: Dev 002, If 1, Class=Human Interface Device, Driver=usbhid, 12M
        ID 32ac:0002
    |__ Port 004: Dev 003, If 0, Class=Vendor Specific Class, Driver=[none], 12M
        ID 27c6:609c Shenzhen Goodix Technology Co.,Ltd.
    |__ Port 005: Dev 004, If 0, Class=Wireless, Driver=btusb, 480M
        ID 0e8d:e616 MediaTek Inc.
    |__ Port 005: Dev 004, If 1, Class=Wireless, Driver=btusb, 480M
        ID 0e8d:e616 MediaTek Inc.
    |__ Port 005: Dev 004, If 2, Class=Wireless, Driver=[none], 480M
        ID 0e8d:e616 MediaTek Inc.
/:  Bus 002.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/2p, 10000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
/:  Bus 003.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/1p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
/:  Bus 004.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/1p, 10000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
/:  Bus 005.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/1p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
/:  Bus 006.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/1p, 10000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
/:  Bus 007.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/1p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
/:  Bus 008.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/1p, 10000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub

Not sure, but I can count two instances of btusb here (3 wireless in total)? They have the same ID and name, so not sure about that one. As you can see down below, they do have the most recent version.

> ls -al *7961*
-rw-r--r-- 1 root root 379496 11 dec 15:23 BT_RAM_CODE_MT7961_1_2_hdr.bin.zst
-rw-r--r-- 1 root root  47903 11 dec 15:23 WIFI_MT7961_patch_mcu_1_2_hdr.bin.zst
-rw-r--r-- 1 root root 436064 11 dec 15:23 WIFI_RAM_CODE_MT7961_1.bin.zst

Currently back on 6.6.10 kernel btw. How did you get your specific linux-firmware package version?

I guess I can try updating linux-firmware regardless. What do you think?

fhteagle commented 5 months ago

My working wifi + BT card shows up in triplicate in my lsusb -tv as well, with the same goofball driver=none entry at the end. Though keep in mind I'm on a mt7921k, not a mt7922, which uses a different driver, so all analogies are to be taken with a grain of salt.
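The interface count discussed above can be read out of `lsusb -tv` mechanically. The sketch below assumes the two-line layout shown in the output earlier in this thread (a `Driver=` line followed by an `ID` line); count_btusb_ifaces is a hypothetical helper name.

```shell
# Count USB interfaces that are bound to btusb and belong to a given vendor.
# Reads `lsusb -tv` output on stdin; assumes each "Driver=" line is followed
# by an "ID vendor:product" line, as in the output posted above.
count_btusb_ifaces() {
    vendor=$1    # USB vendor ID, e.g. 0e8d for MediaTek
    awk -v vendor="$vendor" '
        # Remember whether the current interface line is bound to btusb.
        /Driver=/ { bound = ($0 ~ /Driver=btusb,/) }
        # Count the ID line that follows if the vendor matches.
        $1 == "ID" && index($2, vendor ":") == 1 { if (bound) n++ }
        END { print n + 0 }
    '
}
```

Usage would be `lsusb -tv | count_btusb_ifaces 0e8d`. With the output posted above this prints 2, since the third MediaTek interface has no driver bound.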

How did you get your specific linux-firmware package version?

sudo pacman -Qs linux-firmware

What's your output of

  1. uname -a
  2. lsmod | grep -i bt
  3. lsmod | grep -i usb
  4. lspci | grep -i media

Use pastebin if they're long

Last dumb question, but was this a DIY build or a Framework did all the installs for you build? Have you tried removing and re-seating the card?

silverfs commented 5 months ago

Thanks. Glad at least that's the same. I have the same linux-firmware package, so all is well over there.

I did not know you had another card, did you choose that one yourself or do you have an older model framework?

Anyway, here are the outputs of the following commands.

> uname -a
Linux lychee 6.6.10-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 05 Jan 2024 16:20:41 +0000 x86_64 GNU/Linux
> lsmod | grep -i bt
btusb                  86016  0
btrtl                  32768  1 btusb
btintel                57344  1 btusb
btbcm                  24576  1 btusb
btmtk                  12288  1 btusb
bluetooth            1114112  15 btrtl,btmtk,btintel,btbcm,bnep,btusb
> lsmod | grep -i usb
btusb                  86016  0
btrtl                  32768  1 btusb
btintel                57344  1 btusb
btbcm                  24576  1 btusb
btmtk                  12288  1 btusb
bluetooth            1114112  15 btrtl,btmtk,btintel,btbcm,bnep,btusb
usbhid                 77824  0
> lspci | grep -i media
01:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI Express Wireless Network Adapter
c1:00.5 Multimedia controller: Advanced Micro Devices, Inc. [AMD] ACP/ACP3X/ACP6x Audio Coprocessor (rev 63)

Yes, it was a DIY 😄. However, the wireless card was not installed by me but already in the designated slot. I could try to re-seat the card but to be honest, I doubt that would do anything. WiFi works just fine, and Bluetooth worked up to 2 weeks ago.

fhteagle commented 5 months ago

I have two DIY 11th gen i5 Framework mainboards in play. Originally had AX210 installed, but switched to the mt7921k for better ability to make APs.

Kernel and module list matches, so yeah that's all the obvious stuff that I can think of.
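For anyone who wants to repeat the module comparison without eyeballing two lists, a rough sketch follows. The module names checked (btusb, btmtk, bluetooth) are taken from the lsmod output posted above; missing_bt_modules is a hypothetical helper, and a different card may need a different list.

```shell
# Read `lsmod` output on stdin and print every expected Bluetooth module
# that is not loaded. The expected list mirrors the lsmod output in this
# thread and is an assumption for MediaTek USB Bluetooth, not a general rule.
missing_bt_modules() {
    loaded=$(awk '{ print $1 }')    # first column holds the module name
    for mod in btusb btmtk bluetooth; do
        printf '%s\n' "$loaded" | grep -qx "$mod" || printf '%s\n' "$mod"
    done
}
```

Usage would be `lsmod | missing_bt_modules`. No output means all three modules are loaded, which was the case for both systems compared here.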

If you are trying to troubleshoot more in the future, I would not keep the .bin and the corresponding .zst compressed firmware file in /lib/firmware/mediatek directory at the same time. I vaguely remember something going wonky when I tried that once, but that's been a year back so I cannot be more specific than that.

silverfs commented 5 months ago

Nice! And thanks for the tip. I removed the .bin files 2-3 days ago. I only have the .zst's left that I copied from the live-usb instance where it did work. Thanks for all the help so far!

silverfs commented 5 months ago

Got an update, this time a positive one!

Chatting some more in the Framework community forum, I tried some more stuff. You could read it there, but I'll paste my last message here to let you guys know what I did.

Alright. I reinstalled the kernels and rebooted. Then I ran the dracut command you mentioned, which succeeded, and rebooted again, but to no avail. Next, I installed the linux-zen kernel and rebooted, but it still did not work. I then rebooted into zen once again, ran the dracut regeneration command there, rebooted back into the linux-zen kernel, and it seemed to be fixed! It could also depend on whether bluetooth.service was enabled before a reboot or not… I could not find that out.

I rebooted a few more times, and the fix seemed to be persistent. Next, I rebooted back into the default kernel, and the issue is gone there as well. I’m quite baffled, but I am glad it works now. I’ll monitor it for a few more days to see if any updates let it crash again.

As I mentioned, I'll keep you guys here updated as well in the upcoming days to see if anything borks up. Thank you all for your time so far.

silverfs commented 5 months ago

There are no further issues after a couple of days.

To anyone in the future who has the same problem: try reading both the Framework community post I linked above and this issue. Try installing the linux-zen kernel (and maybe bluez/bluez-utils as well; I don't know if that had any effect). If it is still not fixed after a reboot, I highly suggest running dracut --regenerate-all --force after reinstall-kernels. Then, before a reboot, enable bluetooth.service to be sure, because sometimes the Bluetooth state is read on boot to see if it needs to be on or off. That is all.

I consider this issue solved for me. If anyone has a similar problem, I’d be happy to troubleshoot with you. Thank you 😊
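The commands named in the summary above can be collected into one small script. This is a sketch rather than a tested fix: the use of sudo and the DRY_RUN switch are assumptions added here, bt_fix_steps and step are hypothetical names, and installing linux-zen is left out. By default the commands are only printed.

```shell
# Print (default) or run the recovery steps named in this thread.
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
step() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

bt_fix_steps() {
    step sudo reinstall-kernels &&
    step sudo dracut --regenerate-all --force &&
    step sudo systemctl enable bluetooth.service
}
```

Calling `bt_fix_steps` shows the three commands in order. Calling `DRY_RUN=0 bt_fix_steps` would run them, stopping at the first failure, after which a reboot is still needed.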

iotac01 commented 4 months ago

For people who find this thread but for whom nothing above works: what worked for me in the end was rmmod btusb and then modprobe btusb (I have to do this at every boot, though).