libremesh / lime-packages

LibreMesh packages configuring OpenWrt for wireless mesh networking
https://libremesh.org/
GNU Affero General Public License v3.0

shared-state-async not syncing bat-hosts #1135

Open pony1k opened 1 month ago

pony1k commented 1 month ago

I noticed that the file /var/bat-hosts does not exist, even though shared-state and shared-state-bat_hosts are installed. Upon investigation I found that when I run

shared-state-async get bat-hosts

I get an error saying

Error relocating /usr/bin/shared-state-async: _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE15_M_replace_coldEPcjPKcjj: symbol not found

Maybe this has something to do with it. I get the same error with get net-stats.

OpenWrt version: 23.05.5
LibreMesh version: master rev. 08a3948 20240924_1212

Some installed packages:

root@kh-bad:~# opkg list-installed | grep -F -e shared -e lime
lime-app - v0.2.26-r2
lime-proto-anygw - 2024-09-24-1727177548
lime-proto-batadv - git-24.268.41548-08a3948
lime-proto-bmx7 - git-24.268.41548-08a3948
lime-system - 2024-09-24-1727177548
shared-state - 2024-09-24-1727177548
shared-state-async - 2024-08-28-1724853226
shared-state-bat_hosts - 2024-09-24-1727177548
shared-state-dnsmasq_hosts - 2024-09-24-1727177548
shared-state-dnsmasq_leases - 2024-09-24-1727177548
shared-state-network_nodes - git-24.268.41548-08a3948-r1
shared-state-nodes_and_links - 2024-09-24-1727177548
ubus-lime-grondrouting - 2024-09-24-1727177548
ubus-lime-location - 2024-09-24-1727177548
ubus-lime-metrics - 2024-09-24-1727177548
ubus-lime-utils - 2024-09-24-1727177548

javierbrk commented 1 month ago

There seems to be a missing symbol related to string replacement, which points to an inconsistency between the shared-state-async binary and the libraries that it uses.

I remember I had a similar issue when I tried to build shared-state on my computer and insert it into a running image. The C++ standard library and the binary must be compiled together in order to work. I don't know if this is the case here.
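
One way to check for such a mismatch is to compare the symbols the binary imports with the symbols the libraries export. The following is a minimal sketch, assuming the text output of `nm -D` for the binary and for each library has been captured on a build host (nm is typically not present on the router itself). The function names are hypothetical and not part of any LibreMesh tooling.

```python
def _name(symbol: str) -> str:
    # Drop a symbol-version suffix such as "@GLIBCXX_3.4" or "@@GLIBCXX_3.4".
    return symbol.split("@", 1)[0]


def undefined_symbols(nm_output: str) -> set:
    """Symbols a binary imports: nm lines of the form 'U name'."""
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "U":
            names.add(_name(fields[1]))
    return names


def defined_symbols(nm_output: str) -> set:
    """Symbols a library exports: nm lines of the form 'address type name'."""
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 3:
            names.add(_name(fields[2]))
    return names


def missing_symbols(binary_nm: str, library_nm_outputs: list) -> set:
    """Imported symbols that none of the given libraries provides."""
    provided = set()
    for output in library_nm_outputs:
        provided |= defined_symbols(output)
    return undefined_symbols(binary_nm) - provided
```

A non-empty result for the installed libstdcpp would be consistent with the "symbol not found" relocation error reported above.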

What hardware are you using? Are you running an image from the repository? Can you recompile your image? What version of gcc are you using? Can you upgrade it?

Regards, Ing. Javier Alejandro Jorge

pony1k commented 1 month ago

The images were built using imagebuilder with the https://feed.libremesh.org/arch_packages/master/ repos. Devices are

On E8450, the error looks slightly different: ...replace_coldEPcmPKcmm: symbol not found
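
The two variants differ only in the parameter codes at the end of the mangled name. In the Itanium C++ ABI mangling, `j` is unsigned int and `m` is unsigned long, so the difference is likely just the width of the size type on a 32-bit versus a 64-bit target, and both names refer to the same `_M_replace_cold` function. A minimal sketch of decoding that suffix follows; it is an illustration that handles only the few codes seen in this thread, not a full demangler.

```python
# Builtin type codes from the Itanium C++ ABI that occur in the two symbols.
BUILTIN = {"c": "char", "j": "unsigned int", "m": "unsigned long"}


def decode_params(encoded: str) -> list:
    """Decode a parameter suffix such as 'PcjPKcjj' into C++ type names.

    Supports only 'P' (pointer), 'K' (const) and the builtin codes above.
    """
    params = []
    pointer = False
    const = False
    for code in encoded:
        if code == "P":
            pointer = True
        elif code == "K":
            const = True
        elif code in BUILTIN:
            text = BUILTIN[code]
            if const:
                text += " const"
            if pointer:
                text += "*"
            params.append(text)
            pointer = const = False
        else:
            raise ValueError(f"unsupported code: {code!r}")
    return params


print(decode_params("PcjPKcjj"))  # suffix seen on the other devices
print(decode_params("PcmPKcmm"))  # suffix seen on the E8450
```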

I can try to reproduce the problem with a self-compiled package, if that helps.

G10h4ck commented 1 month ago

I use the OpenWrt buildroot and haven't encountered this problem, so I guess it may be caused by using the imagebuilder.

I usually use this script https://gitlab.com/librerouter/librerouteros/-/blob/main/librerouteros_build.sh?ref_type=heads to avoid jumping around make menuconfig, but it should be the same if you select the packages manually in the OpenWrt buildroot.

a-gave commented 1 month ago

Hi, I think I've reproduced it on a litebeam-m5-xw

It doesn't work with openwrt-23.05.5, libremesh feeds master, using imagebuilder

root@LiMe-07d1b5:~# shared-state-async get bat-hosts
Error relocating /usr/bin/shared-state-async: _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE15_M_replace_coldEPcjPKcjj: symbol not found

It works with openwrt snapshots, libremesh feeds master, using imagebuilder


root@LiMe-07d1b5:~# shared-state-async get bat-hosts
D 1729878728.079 std::task<int> SharedState::merge(const std::string&, const std::map<std::__cxx11::basic_string<char>, StateEntry>&, const sockaddr_storage&, std::error_condition*) bat-hosts got 7 significative changes out of 7 input slice size: 7 state size: 7
{
    "02:58:47:07:d1:b5": "LiMe_07d1b5_wlan0_mesh_29",
    "02:95:39:07:d1:b5": "LiMe_07d1b5_eth0_29",
    "0e:1e:c4:cf:27:27": "LiMe_07d1b5_bat0",
    "fa:ec:da:06:d1:b5": "LiMe_07d1b5_wlan0_mesh",
    "fc:ec:da:06:d1:b5": "LiMe_07d1b5_wlan0_ap",
    "fc:ec:da:07:d1:b5": "LiMe_07d1b5_br_lan",
    "fe:ec:da:06:d1:b5": "LiMe_07d1b5_wlan0_apname"
}

pony1k commented 1 month ago

For mt7621 the OpenWrt 23.05.5 repo contains

Whilst snapshot has

shared-state-async uses those libraries, and I guess the LibreMesh buildbot links against the snapshot versions, so there is a compatibility issue with one or more of these libraries.

a-gave commented 1 month ago

Just one more note: I think it is an issue with the CI. The master feeds are compiled with the openwrt-sdk master (ARCH: "x86_64"), so they are only compatible with OpenWrt snapshots.

https://github.com/libremesh/lime-packages/blob/08a3948a5a80f6e26318c96227e02466bd09345e/.github/workflows/build.yml#L27C11-L27C25

We should probably create a feeds repo with binaries built from the openwrt-sdk v23.05.5 or the openwrt-23.05 branch, to be compatible with openwrt-23.05.5 and current builds of libremesh-2024.1-rc1, as was done for OpenWrt v19 (ARCH: "x86_64-19.07.10").

https://github.com/libremesh/lime-packages/blob/08a3948a5a80f6e26318c96227e02466bd09345e/.github/workflows/build_2020.yml#L19C11-L19C34

G10h4ck commented 1 month ago

I think a-gave's test above (failing with openwrt-23.05.5, working with OpenWrt snapshots) shows the issue: shared-state-async needs the OpenWrt development branch due to a bug present in gcc and libstdcpp in current releases. AFAIR @javierbrk also encountered this problem some months ago, and the fix was moving to the OpenWrt development branch. Hopefully a new OpenWrt release will include those fixes soon.

New stuff that I develop, like shared-state-async or APuP, often depends on cutting-edge features, so I almost always work on top of the OpenWrt development branch.

javierbrk commented 3 weeks ago

I agree with a-gave's proposal above to create a feeds repo with binaries built from the OpenWrt 23.05 SDK. Please let me know if I can help with something!

a-gave commented 3 weeks ago

I did a couple of tests building shared-state-async with the openwrt-sdk at branch openwrt-23.05 and installing it:

It seems fairly safe to assume (I checked the changelogs on openwrt.org for the openwrt-23.05.x releases) that OpenWrt does not update these packages (libc++, musl libc, libstdcpp) within service releases (minor versions, e.g. 23.05.4 -> 23.05.5).

So it should be doable to use the openwrt-sdk based on the openwrt-23.05 branch (and now also openwrt-24.10), instead of one SDK for each release (v23.05.5), to recompile the C++ packages (only shared-state-async for now) for each architecture.

I would run the multi-arch-build.yml job only when the C++ packages (shared-state-async) change, for all the OpenWrt branches that we would like to keep (e.g. main, openwrt-23.05, openwrt-24.10).
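
The trigger rule described here can be sketched as a small function. This is an illustration only: the `packages/<name>/` path layout, the constant names and the function name are assumptions, and the real CI would express this in the workflow file rather than in Python.

```python
# Assumption: C++ packages live under packages/<name>/ in lime-packages.
CPP_PACKAGES = ("shared-state-async",)

# SDK branches proposed in the thread.
SDK_BRANCHES = ("main", "openwrt-23.05", "openwrt-24.10")


def branches_to_build(changed_paths: list) -> list:
    """Return the SDK branches to rebuild for a given set of changed files.

    The multi-arch build is needed only when a C++ package was touched;
    in that case it runs once per supported SDK branch.
    """
    touched = any(
        path.startswith(f"packages/{name}/")
        for path in changed_paths
        for name in CPP_PACKAGES
    )
    return list(SDK_BRANCHES) if touched else []
```

For example, a change under packages/shared-state-async/ would schedule all three branches, while a change that only touches packages/lime-system/ would schedule none.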

To have two new feeds, for example for x86_64:
https://github.com/libremesh/lime-feed/tree/gh-pages/arch_packages/openwrt-23.05/x86_64
https://github.com/libremesh/lime-feed/tree/gh-pages/arch_packages/openwrt-24.10/x86_64

Available then for imagebuilder usage at:
https://feed.libremesh.org/arch_packages/openwrt-23.05/x86_64/
https://feed.libremesh.org/arch_packages/openwrt-24.10/x86_64/

For releases and release candidates, I would keep the same mechanism with the 3 new builds to support imagebuilder users. To do so, the feed path would change from https://github.com/libremesh/lime-feed/tree/gh-pages/arch_packages/2024.1-rc1/x86_64 to https://github.com/libremesh/lime-feed/tree/gh-pages/arch_packages/2024.1-rc1/<[main|openwrt-23.05|openwrt-24.10]>/x86_64

This also requires adjusting /packages/lime-system/files/etc/uci-defaults/92_add-lime-repos accordingly.
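
The feed selection implied by this layout could look roughly like the sketch below. It assumes that a release version such as 23.05.5 maps to the openwrt-23.05 feed, that anything else (e.g. a snapshot) maps to main, and that release feeds are served from feed.libremesh.org under the same path as in the lime-feed repository. The function names are hypothetical; the actual 92_add-lime-repos script is a shell script and may work differently.

```python
import re

# Base URL as used for the branch feeds mentioned in the thread.
FEED_BASE = "https://feed.libremesh.org/arch_packages"


def sdk_branch(openwrt_version: str) -> str:
    """Map an OpenWrt version string to the SDK branch its feed is built from."""
    match = re.fullmatch(r"(\d+\.\d+)\.\d+(?:-rc\d+)?", openwrt_version)
    if match:
        return f"openwrt-{match.group(1)}"
    return "main"  # assumption: snapshots use the feed built from main


def feed_url(openwrt_version: str, arch: str, lime_release: str = "") -> str:
    """Build the feed URL, optionally nested under a LibreMesh release."""
    parts = [FEED_BASE]
    if lime_release:
        parts.append(lime_release)
    parts.extend([sdk_branch(openwrt_version), arch])
    return "/".join(parts) + "/"


print(feed_url("23.05.5", "x86_64"))
print(feed_url("23.05.5", "x86_64", lime_release="2024.1-rc1"))
```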

I could try to take a look at this, probably starting next week (10/11).