nakato / nixos-sbc

Nix Flake to make managing Single Board Computers easy and repeatable.
MIT License

bpi-r3: frequent wifi drops (probably due to VLAN) #14

Open steveej opened 3 months ago

steveej commented 3 months ago

i've been using the bpi-r3 with VLAN enabled for a while and have been seeing the AP disappear for several seconds alongside the following message:

Aug 04 11:22:02 router0-dmz0 kernel: mt798x-wmac 18000000.wifi: Message 000026ed (seq 14) timeout

relevant links:

as mentioned in one of the issues, a firmware update has been provided that probably fixes this: https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=5d0d24b3b4207292dd2b1e927348899e10e06427

i'm not sure how to best apply this "feed" to the mtk repo that we use here. any ideas for this?

for now i'm using the workaround/patch described here which has been alleviating the problem for me. however i understand it's not a general solution long-term and might point to an underlying firmware issue.

nakato commented 3 months ago

> i've been using the bpi-r3 with VLAN enabled for a while and have been seeing the AP disappear for several seconds alongside the following message:

> Aug 04 11:22:02 router0-dmz0 kernel: mt798x-wmac 18000000.wifi: Message 000026ed (seq 14) timeout

> relevant links:

> as mentioned in one of the issues, a firmware update has been provided that probably fixes this: https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=5d0d24b3b4207292dd2b1e927348899e10e06427

The firmware updated in that commit appears to be for the MT7916, which I don't think is used on this hardware, as it's an MT7986 paired with MT7975 wireless chips. Since that commit is in a released linux-firmware, the updated files are already available in the linux-firmware package in nixpkgs.

There are 3 firmware files: the EEPROM, WA and WM. I'm somewhat surprised by the existence of the EEPROM file. Originally the upstream DT was going to include mediatek,eeprom-data, but that was dropped, and the suggested workaround of loading it from a file was also rejected, because users were seen copying these files between hardware in an inappropriate manner and that needed to be avoided. Comparing mt7986_eeprom_mt7975_dual.bin from linux-firmware to what we have in DT, the upstream file has a lot of data zeroed out. Maybe the zeroed data is the "training data" that is specific to this hardware. Either way, it looks like the EEPROM data can stay the way it is during any testing.

mt7986_wm_mt7975.bin and mt7986_wa.bin haven't been updated, so there's nothing to apply from linux-firmware upstream.

> i'm not sure how to best apply this "feed" to the mtk repo that we use here. any ideas for this?

By "feed" do you mean https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/ ?

If you want to try the mt7986_wm_mt7975.bin and mt7986_wa.bin from there, add the following to your system's modules section. I've booted with this one, and I made sure the output has the MTK binaries instead of the upstream ones.

({lib, pkgs, ...}: { 
  # mkBefore is very important here, otherwise it won't be used over linux-firmware files.
  hardware.firmware = lib.mkBefore [(pkgs.stdenvNoCC.mkDerivation { 
    name = "mtk-firmware";
    src = pkgs.fetchgit { 
      url = "https://git01.mediatek.com/openwrt/feeds/mtk-openwrt-feeds";
      rev = "0fdbc0e6d84bbc0216da2842a494bdf01f745c6c";
      hash = "sha256-IuIw6Hp9yqVVebqKwHayGBkCqf3e8+eTXVFmFSqgXho=";
      sparseCheckout = [ 
        "autobuild/autobuild_5.4_mac80211_release/package/kernel/mt76/src/firmware"
      ];
    };
    dontPatch = true;
    dontConfigure = true;
    dontBuild = true;
    dontFixup = true;
    installPhase = ''
      mkdir -p $out/lib/firmware/mediatek
      mv autobuild/autobuild_5.4_mac80211_release/package/kernel/mt76/src/firmware/* $out/lib/firmware/mediatek/
    '';
  })];
})
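To illustrate why the mkBefore matters, here is a minimal sketch of how the module system orders list definitions. The option name `demo` and the two strings are placeholders, not real NixOS options or packages, and `<nixpkgs/lib>` is assumed to be resolvable on your NIX_PATH. My understanding is that NixOS then merges hardware.firmware into a single directory in which the earliest entry wins when two packages provide the same file, which is why the MTK derivation has to sort ahead of linux-firmware.

```nix
# demo.nix: evaluate with `nix-instantiate --eval --strict demo.nix`
let
  lib = import <nixpkgs/lib>;
  result = lib.evalModules {
    modules = [
      # Hypothetical list option standing in for hardware.firmware.
      { options.demo = lib.mkOption { type = lib.types.listOf lib.types.str; }; }
      # A plain definition gets the default order priority (1000).
      { demo = [ "linux-firmware" ]; }
      # mkBefore is mkOrder 500, so this definition is placed first.
      { demo = lib.mkBefore [ "mtk-firmware" ]; }
    ];
  };
in
  result.config.demo
# evaluates to [ "mtk-firmware" "linux-firmware" ]
```

Without the mkBefore, both definitions would share the same priority and the resulting order would depend on module import order rather than on anything you control directly.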

If that resolves the issue, it'll probably be worth hitting up the mailing list and attempting to get MTK to update the firmware in the upstream linux-firmware repo.

> for now i'm using the workaround/patch described here which has been alleviating the problem for me. however i understand it's not a general solution long-term and might point to an underlying firmware issue.


I've just realised there's firmware in openwrt/mt76 as well. If you just want the firmware from there and not the out-of-tree driver, the approach is the same as for the MediaTek one.

({lib, pkgs, ...}: { 
  # mkBefore is very important here, otherwise it won't be used over linux-firmware files.
  hardware.firmware = lib.mkBefore [(pkgs.stdenvNoCC.mkDerivation { 
    name = "openwrt-mtk-firmware";
    src = pkgs.fetchFromGitHub { 
      owner = "openwrt";
      repo = "mt76";
      rev = "5c5e685eb02844942d2f83196141282b856704db";
      hash = "sha256-SvaZkaT1LZVcvwQ0PDmsYZQM7QpDs7RP9kiCPOOvLg8=";
    };
    dontPatch = true;
    dontConfigure = true;
    dontBuild = true;
    dontFixup = true;
    installPhase = ''
      mkdir -p $out/lib/firmware/mediatek
      mv firmware/* $out/lib/firmware/mediatek/
    '';
  })];
})
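Since only the WM and WA images are of interest here and the EEPROM data can stay as it is, a narrower variant could install just those two files. This is an untested sketch: the derivation name is made up, and it assumes both files exist under firmware/ at the pinned revision. The rev and hash are the same as above, because the fetched source is unchanged.

```nix
({lib, pkgs, ...}: {
  # mkBefore is still required so these files win over linux-firmware.
  hardware.firmware = lib.mkBefore [(pkgs.stdenvNoCC.mkDerivation {
    name = "openwrt-mtk-firmware-wm-wa"; # hypothetical name
    src = pkgs.fetchFromGitHub {
      owner = "openwrt";
      repo = "mt76";
      rev = "5c5e685eb02844942d2f83196141282b856704db";
      hash = "sha256-SvaZkaT1LZVcvwQ0PDmsYZQM7QpDs7RP9kiCPOOvLg8=";
    };
    dontPatch = true;
    dontConfigure = true;
    dontBuild = true;
    dontFixup = true;
    installPhase = ''
      mkdir -p $out/lib/firmware/mediatek
      # Copy only the two images under test; everything else stays upstream.
      cp firmware/mt7986_wm_mt7975.bin firmware/mt7986_wa.bin $out/lib/firmware/mediatek/
    '';
  })];
})
```

Limiting the override to two files keeps every other MediaTek firmware file coming from linux-firmware, which should make it easier to attribute any change in behaviour to the WM and WA update.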

If you want to build the out-of-tree mt76 driver, that's a bit more complex, and I haven't looked into that yet. Maybe before we attempt an out-of-tree driver, let's see if the firmware update does anything?