Closed dzzinstant closed 2 months ago
Please provide the output of the "logread" command after booting into config mode of an affected image.
@AiyionPrime Do you still have access to the working UniFi AC Mesh that you tested? Can you get a boot log from that device as well?
Can confirm we lost the AC-Mesh in our hackspace as well. The date it went missing coincides with the device being re-added to the ath79-generic target.
https://meshviewer.darmstadt.freifunk.net/#!/en/map/f09fc2dec4c5
Another node in the ffda network went down at about the same time, also a Unifi AC Mesh device on the 'testing' branch.
Oh yeah, that's the one. Hi there! Thanks for reporting this issue.
Grabbed logread and this one is sitting in config mode as well.
https://gist.github.com/mweinelt/35d85f5803573396e805887ef72c367c
Thanks. It seems that OpenWrt is missing an image size check: only three 64K blocks are free for the overlay.
from 64287-gnat: logread-unifi_ac_mesh-reverts_to_first_boot.log
@dzzinstant Today's testing firmware reduces the number of packages we install on the device, which works around this problem.
Interestingly, with the new target(?) the nodes set different primary MAC addresses.
https://meshviewer.darmstadt.freifunk.net/#!/788a20f21ff6 https://meshviewer.darmstadt.freifunk.net/#!/788a20f01ff6
I think we have to recheck the MAC address on the device, if any.
Works on my node. Thanks a lot!
I guess that only became apparent because I went back to the original firmware. It might also be caused by a change in gluon/openwrt that happened a long time ago.
Not sure how well accessible yours is, but could you check what MAC address is on the label inside the lower compartment, where the LAN cable is connected? We need to get this right.
MAC address on the device label: 788A20F01FF6 (matching the new address)
Just leaving a note, I guess the problem might reappear for other communities or devices.
Summary
The update increased the size of the kernel and rootfs partitions considerably. The remaining flash space was insufficient for a working rootfs_data partition. Excerpt from dmesg output:
[ 0.333109] 7 fixed-partitions partitions found on MTD device spi0.0
[ 0.339677] Creating 7 MTD partitions on "spi0.0":
[ 0.344649] 0x000000000000-0x000000060000 : "u-boot"
[ 0.350659] 0x000000060000-0x000000070000 : "u-boot-env"
[ 0.357032] 0x000000070000-0x000000800000 : "firmware"
[ 0.366427] 2 uimage-fw partitions found on MTD device firmware
[ 0.372594] Creating 2 MTD partitions on "firmware":
[ 0.377741] 0x000000000000-0x000000210000 : "kernel"
[ 0.383744] 0x000000210000-0x000000790000 : "rootfs"
[ 0.389700] mtd: device 4 (rootfs) set to be root filesystem
[ 0.397324] 1 squashfs-split partitions found on MTD device rootfs
[ 0.403773] 0x000000760000-0x000000790000 : "rootfs_data"
[ 0.410191] 0x000000800000-0x000000f90000 : "kernel1"
[ 0.416318] 0x000000f90000-0x000000fb0000 : "bs"
[ 0.421969] 0x000000fb0000-0x000000ff0000 : "cfg"
[ 0.427787] 0x000000ff0000-0x000001000000 : "art"
[..]
[ 10.862669] Too few erase blocks (3)
[ 10.867711] Too few erase blocks (3)
[ 10.871614] mount_root: failed to mount -t jffs2 /dev/mtdblock5 /tmp/overlay: Invalid argument
[ 11.414596] Too few erase blocks (3)
[ 11.419254] mount_root: unable to set filesystem state
[ 11.424834] mount_root: switching to jffs2 overlay
[ 11.429947] mount_root: switching to jffs2 failed - fallback to ramoverlay
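For reference, the size of the overlay can be read straight off the rootfs_data line in the excerpt above. A minimal Python sketch, assuming the 64 KiB erase block size mentioned earlier in this thread and assuming that JFFS2 refuses to mount with fewer than five erase blocks (this matches the "Too few erase blocks (3)" message, but I have not checked it against the kernel source here). The function and constant names are made up for illustration:

```python
import re

ERASE_BLOCK = 64 * 1024   # assumption: 64 KiB erase blocks, as stated in this thread
MIN_JFFS2_BLOCKS = 5      # assumption: JFFS2 minimum; adjust if your kernel differs

# Matches lines like: 0x000000760000-0x000000790000 : "rootfs_data"
PART_RE = re.compile(r'0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+) : "([^"]+)"')

def partition_blocks(dmesg_line, erase_block=ERASE_BLOCK):
    """Return (name, number of whole erase blocks) for one MTD partition line."""
    m = PART_RE.search(dmesg_line)
    if m is None:
        raise ValueError("not an MTD partition line")
    start, end = int(m.group(1), 16), int(m.group(2), 16)
    return m.group(3), (end - start) // erase_block

line = '[ 0.403773] 0x000000760000-0x000000790000 : "rootfs_data"'
name, blocks = partition_blocks(line)
status = "ok" if blocks >= MIN_JFFS2_BLOCKS else "too small for jffs2"
print(name, blocks, status)
```

For the line from this device the sketch reports 3 blocks, i.e. 192 KiB of overlay.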
Workaround: reduce the number of packages installed on the affected devices (in site.mk, or a corresponding helper script like contrib/genpkglist.py). This might be sufficient to mitigate the problem.
Example: https://git.darmstadt.ccc.de/ffda/firmware/site/-/commit/b1636a606dffcee534a7efe4de7b82486a204b37
@dzzinstant Please retry with the current master branch; optimizations for the needed flash size were merged in the last few days. In my case this helped. See #2501.
We've reduced the site a while ago by excluding USB packages from the boards in question. I'll close this issue.
As noted by @dzzinstant, there should be an image size check in place. Reopening to keep track of fixing the check.
Sorry about the confusion, I also wasn't sure about whether this ticket should remain open. I have submitted a more specific bug report here: https://github.com/openwrt/openwrt/issues/9862
IMHO this ticket should remain open for now, because gluon's functionality is affected as well. In particular, the bug may break the autoupdating procedure.
There have been unanswered questions upstream: https://github.com/openwrt/openwrt/issues/9862#issuecomment-1125833592
Can someone take a look?
Bug report
What is the problem? My node runs on the Freifunk Darmstadt variant of gluon, 'testing' branch. When the branch was recently updated from ffda 2.3~20210608 / gluon-v2020.2-263-g3f59fdc6 to ffda 2.5~20220330 / gluon-v2021.1-338-g55da2a7, my node no longer responded, and apparently was stuck in config mode.
Other updating methods led to the same result:
- revert to Ubiquiti's stock firmware (both 2017-05-08 and 2022-03-12), then install 2.5.* using dd
- revert to stock firmware, install 2.4.1 (stable firmware from ffda). Then install 2.5.* (using both autoupdater and sysupgrade -n)
- (I set the first byte of the MTD partition bs to 0x00, I also tried with setting it to 0x01)
I configured the node using the web interface. After pressing the "save & reboot" button, the node always reboots to config mode. All configuration data, including the node name, is deleted/reset to first installation state. This problem only appeared on a UniFi AC Mesh; my other devices accepted the rollout of 2.5.* without noticeable problems.
What is the expected behaviour? Using the stable branch ffda 2.4.1 / gluon-v2021.1.1, the node works & reboots without problems. All configuration settings are preserved.
Node in question: 64287-gnat, older data of the same device: 64287-gnat (old)
Another node in the ffda network went down at about the same time, also a Unifi AC Mesh device on the 'testing' branch.
Possibly related issues
- Recent migration from ar71xx to ath79 target: "ar71xx - ath79 migration progress" #2413, "ath79-generic: (re)add support for UniFi AC Mesh" #2428
- There used to be a problem with the MTD partition bs being configured as read-only: "ar71xx: UBNT UniFi AP-AC Mesh/Lite/Pro sysupgrade broken" #1301, "FS#662 - Sysupgrade messes up Ubiquiti Unifi AP AC Lite" openwrt/openwrt#6561
Could you do me a favor regarding this? https://forum.openwrt.org/t/need-a-copy-content-of-full-flash-about-uap-ac-m/140438
I think this was solved (OpenWrt requires 3 erase blocks to be available for rootfs on build afaik, which is 192 KiB with 64 KiB blocks). We also have a patch in master now that doubles the space used by Gluon: https://github.com/freifunk-gluon/gluon/commit/cc854594b0ba677760b844f5e92f411658ba13d8
Apparently, this problem did not reappear for the UniFi AC Mesh.
To solve the problem in general (for all devices), it would be possible to reserve a minimum space needed for erase blocks during the image building process, e.g. in the targets' individual Makefiles openwrt/target/linux/<..>/image/<..>.mk or by modifying the check-size command in openwrt/include/image-commands.mk.
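The idea of reserving overlay space at build time could be sketched roughly like this. It is plain Python for illustration only, not the actual check-size implementation in image-commands.mk; the function name and the default reserve of five blocks are assumptions, and the sizes are taken from the dmesg excerpt earlier in this issue.

```python
def fits_with_overlay_reserve(image_size, partition_size, erase_block, reserve_blocks=5):
    """Check that an image leaves at least `reserve_blocks` erase blocks free.

    The image is rounded up to whole erase blocks, because the overlay
    can only start on an erase block boundary.
    """
    used_blocks = -(-image_size // erase_block)   # ceiling division
    total_blocks = partition_size // erase_block
    return total_blocks - used_blocks >= reserve_blocks

KIB = 1024
firmware = 0x800000 - 0x070000   # "firmware" partition from the dmesg excerpt
image = 0x760000                 # offset of rootfs_data inside "firmware"

strict = fits_with_overlay_reserve(image, firmware, 64 * KIB)       # 5-block reserve
relaxed = fits_with_overlay_reserve(image, firmware, 64 * KIB, 3)   # 3-block reserve
print(strict, relaxed)
```

With the numbers from this device the image leaves three blocks free, so it passes a 3-block reserve but fails a 5-block one. Which reserve is appropriate per target is exactly the part that feels arbitrary.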
Both options seem somewhat arbitrary and disproportionate to me (because the bug appears only very rarely and can easily be cured), and might also introduce new obscure bugs. So I would rather close the corresponding bug reports for gluon and openwrt.
If it's only related to free erase blocks, then I ran into the same issue with an OpenMesh device. They have lots of flash, but use A/B partitioning, and JFFS2 requires 1.25 MiB of free erase blocks, which is quite a lot (the erase size is 256k).
I don't think this is quite done, but we will see.