Open kimocoder opened 11 months ago
Invalid Version reported. v23.05.2
Is this from a clean repository?
Invalid Release reported. v23.05.2
Is this from a clean repository?
Invalid Target/Subtarget reported. Buffer Overflow
Is this from a supported device?
Related: https://github.com/openwrt/openwrt/issues?q=is%3Aissue+is%3Aopen+00005aed
and many more other places. @pepe2k get your bot away :pager: he better go and look at the "HwInfo" code :1st_place_medal:
Hi @kimocoder
and many more other places. @pepe2k get your bot away 📟 he better go and look at the "HwInfo" code 🥇
I really have no idea why you mentioned me here. If you are reporting a hwinfo package related issue, I believe this one has a maintainer (check PKG_MAINTAINER in the Makefile) whom you should have mentioned instead.
Cheers, Piotr
I don't understand why hwinfo would cause such an issue. It's not installed by default, it runs in userspace, and I don't see why anyone on an embedded machine would have it installed in the first place, as it seems useful only on desktops and the like. Are you referring to this package: https://github.com/openwrt/packages/tree/master/utils/hwinfo?
The perceived hwinfo package issue is not related to mt76 wifi chips (some or all) hanging in high-noise environments. You can report the shift issue upstream for the package at https://github.com/openSUSE/hwinfo/issues. All current C compilers handle bit shifting by n-1, though changing int to uint in that particular place will address the static checker WARNING.
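To make the shift remark concrete, here is a minimal C sketch. It is not taken from hwinfo, and the helper names `bit_mask` and `bit_is_set` are hypothetical. With a 32-bit `int`, the expression `1 << 31` yields a value that a signed `int` cannot represent, which the C standard leaves undefined even though mainstream compilers produce the expected bit pattern. Doing the shift on an unsigned type is well defined, which is the kind of change that usually silences a static checker.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of hwinfo: returns a mask with only bit n set.
 * The shift is done on an unsigned 32-bit value, so n == 31 is well defined. */
uint32_t bit_mask(unsigned n)
{
    assert(n < 32);              /* shifting by the full width or more is undefined */
    return (uint32_t)1 << n;
}

/* Returns nonzero if bit n is set in flags. */
int bit_is_set(uint32_t flags, unsigned n)
{
    return (flags & bit_mask(n)) != 0;
}
```

A signed variant such as `1 << 31` would often compile to the same machine code, which is why this is a checker warning rather than a likely cause of runtime failures.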
Sorry, I didn't have time to do a proper report yesterday. Yeah, this was a bit weird, but all the logs, "timeouts" etc. I traced back to HwInfo. I can post it when I get back again tomorrow.
Besides, I did several builds of STABLE (23.05.2) and Snapshot, from clean to dirty trees (compiled various variants stripped down to the bare minimum), all with the same results.
And since I DO have a lot of code in general, plus many local 'feeds' lying around, I got a hit on a few files pointing to HwInfo. Present only in HwInfo.
I can post it when I get back home, because I also found all this very weird at first 👍
First: the timeout messages in dmesg, which several different chips report, are what I posted above. But again, is this enough to break the radios, which in turn breaks the UI if you try to enable them? Well... it's not a 'request' sent by the hour, it is constant.
So, since I had the full sources lying around, I found it by searching locally (simply loving 'grep -or').
Two: another hit pointing to HwInfo, not directly related to my first search. While trying to fix or test the weird behaviour, it pointed me to https://github.com/openSUSE/hwinfo/tree/master/src/ids for some reason 👍 That doesn't do much though, but I caught it independently of the first find.
Third: a search in more debug code pointed me to HwInfo's file "check_hd.c", and funnily enough I have seen issues with that "check_hd" before (^^.)
And I want my internal wifi radios to work properly, so I am always on a rescue mission, but I realized that it was imported via feeds and rather hard to fix much myself, even though I had 3 pointers to that causing the issues.
So instead, I fired up some E's to see what's in there, and ran some tests just to see what's up with all the hints at THAT causing the issues, and WHY and HOW, like some of us want to know. And when I opened it, I understood that there is a lot of old code, CVE/CWE warnings and 'buffers' lying around in there (not surprised; since the code was 20+ years old, I kind of expected it).
But again, it's not enough for me to PINPOINT that it's the main issue. If I had the chance to just remove it locally, easy, no problem, I could be certain. But it's imported through 'feeds', and I couldn't think of a good way to do it without fiddling with the OpenWrt source, so I simply ran out of time, even though I spent some hours trying to figure this out.
My router now borrows a few ATH10K until I find a solution soon. But a test of it would be the best, OF COURSE.
Now I will get back home tomorrow. I did document various things, so we can look at it together or so? It would be nice to hear from the others who posted about the same or related issues I've seen around here and there 🥇
This is not the place to report a bug in the hwinfo package; it has no impact whatsoever on any wifi function. Check ../packages.
mt76 radios are known to hang in very noisy environments, but there is no repeater; what you get is commands to the radio timing out when it is already locked up. We need the cause, not the consequence. Check ../mt76.
And tune down carriage a bit. Repeating the same thing does not make it correct.
Describe the bug
There have been numerous reports of issues on a wide range of chipsets for some time now, and I went deep to try to find the cause today. This bug causes various problems, depending on time and place and whenever a 'buffer overflow' may hit.
OpenWrt version
v23.05.2
OpenWrt release
v23.05.2
OpenWrt target/subtarget
Buffer Overflow
Device
Asus TUF AX4200
Image kind
Official downloaded image
Steps to reproduce
This issue has been reported for quite some time; it's not an OpenWrt issue, sort of. Just by booting up, not only does the dmesg spam start, but it also drains memory in most cases. It looks like this:
[ 8224.077720] mt798x-wmac 18000000.wifi: Message 00005aed (seq 6) timeout
[ 8244.534901] mt798x-wmac 18000000.wifi: Message 00005aed (seq 7) timeout
[10558.855491] mt798x-wmac 18000000.wifi: Message 00005aed (seq 8) timeout
[10579.313155] mt798x-wmac 18000000.wifi: Message 00005aed (seq 9) timeout
[10599.772151] mt798x-wmac 18000000.wifi: Message 00005aed (seq 10) timeout
[10620.229491] mt798x-wmac 18000000.wifi: Message 00005aed (seq 11) timeout
[10643.884032] mt798x-wmac 18000000.wifi: Message 00005aed (seq 12) timeout
[10664.343215] mt798x-wmac 18000000.wifi: Message 00005aed (seq 13) timeout
[10684.800679] mt798x-wmac 18000000.wifi: Message 00005aed (seq 14) timeout
[10705.258506] mt798x-wmac 18000000.wifi: Message 00005aed (seq 15) timeout
[10747.453103] mt798x-wmac 18000000.wifi: Message 00005aed (seq 1) timeout
[10767.911420] mt798x-wmac 18000000.wifi: Message 00005aed (seq 2) timeout
[10788.369607] mt798x-wmac 18000000.wifi: Message 00005aed (seq 3) timeout
[10808.827026] mt798x-wmac 18000000.wifi: Message 00005aed (seq 4) timeout
[10939.247405] mt798x-wmac 18000000.wifi: Message 00005aed (seq 5) timeout
[10959.706727] mt798x-wmac 18000000.wifi: Message 00005aed (seq 6) timeout
[10980.163772] mt798x-wmac 18000000.wifi: Message 00005aed (seq 7) timeout
[11000.622211] mt798x-wmac 18000000.wifi: Message 00005aed (seq 8) timeout
[11706.428915] mt798x-wmac 18000000.wifi: Message 00005aed (seq 9) timeout
[11726.896151] mt798x-wmac 18000000.wifi: Message 00005aed (seq 10) timeout
[11747.343086] mt798x-wmac 18000000.wifi: Message 00005aed (seq 11) timeout
[11767.801207] mt798x-wmac 18000000.wifi: Message 00005aed (seq 12) timeout
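For anyone who wants to check how regular these timeouts are, the following is a rough C sketch that pulls the timestamp, message id and sequence number out of one such line and computes the gap between two lines. The line layout is assumed from the log quoted above only; `struct timeout_msg`, `parse_timeout_line` and `timeout_gap` are hypothetical names and not part of mt76 or OpenWrt.

```c
#include <stdio.h>

/* Hypothetical record for one "Message ... timeout" line from dmesg. */
struct timeout_msg {
    int ok;        /* 1 if the line matched the assumed layout */
    double ts;     /* kernel timestamp in seconds */
    unsigned id;   /* message id, printed as hex in the log */
    unsigned seq;  /* sequence number */
};

/* Parse a line such as
 * "[ 8224.077720] mt798x-wmac 18000000.wifi: Message 00005aed (seq 6) timeout".
 * The driver and device tokens are skipped with %*s. */
struct timeout_msg parse_timeout_line(const char *line)
{
    struct timeout_msg m = {0, 0.0, 0u, 0u};
    int n = sscanf(line, "[ %lf] %*s %*s Message %x (seq %u)",
                   &m.ts, &m.id, &m.seq);
    m.ok = (n == 3);
    return m;
}

/* Seconds between two timeout lines, or -1.0 if either fails to parse. */
double timeout_gap(const char *earlier, const char *later)
{
    struct timeout_msg a = parse_timeout_line(earlier);
    struct timeout_msg b = parse_timeout_line(later);
    if (!a.ok || !b.ok)
        return -1.0;
    return b.ts - a.ts;
}
```

Applied to the first two lines of the log above, `timeout_gap` gives roughly 20.46 seconds. This only measures the symptom; it says nothing about what causes the radio to stop answering.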
And my memory consumption is high. It now affects my web UI, and the wifi chips barely work. So, let's start off... it took a deep search to actually find the issue here. OpenWrt takes tools and scripts etc. from various sites and imports, and the problem is that "HWINFO" has had issues for quite some time now. And since it's imported with "./scripts/feeds" rather than fully, it was hard to track down.
Look at this....
Now ... if we go to "HWINFO" and the file "hwinfo-21.71/src/ids/hd_ids.h" and look ...
To be exact, it's down here actually. Spot on, what my dmesg revealed.
And we find "MediaTek" and many other OEMs in the code too, very messy. According to GitHub, it was written 25+ years ago, so imagine that.
It was only the small INCLUDE file that sent me to their repo to investigate, and it's very, very old...
"check_hd.c" is only a part of the problem though. It's all very old and dirty.
Actual behaviour
Expected behaviour
N/A
Additional info
Well, it needs to be removed from the "feed" as soon as possible, as there are many affected, according to what I read about it in my "investigation".
Diffconfig
No response
Terms