Closed delhiryder closed 4 years ago
btw, running a simple (manual) rauc install on the downloaded rauc bundle works fine.
Let me know what (other) information I can share to help you debug.
Thanks,
Sidd
@delhiryder If you could build it with debugging symbols enabled and attach gdb to generate a backtrace of the segfault, we could possibly narrow down the problem.
I would have assumed that there is an unhandled failure path in case of a rauc service startup failure, but if it works fine when you invoke it manually, it must be something else.
So far, I have not been able to reproduce it.
Regards, Enrico
Not closed. Just used the wrong button...
Hello,
As suggested by you, we built rauc-hawkbit-updater with debug symbols for the target, and ran the whole application under GDB. Here is the stdout log of my interactions. I was able to get to line 516 in rauc-installer-gen.c before the thread exited.
(Note: Sensitive information has been removed below).
Please let me know if you need any more information.
Regards,
Sidd
(gdb) l install_loop_thread
126 * Install mainloop running until installation completes.
127 * @param[in] data pointer to a install_context struct.
128 * @return NULL is always returned.
129 */
130 static gpointer install_loop_thread(gpointer data)
131 {
132 GBusType bus_type = (!g_strcmp0(g_getenv("DBUS_STARTER_BUS_TYPE"), "session"))
133 ? G_BUS_TYPE_SESSION : G_BUS_TYPE_SYSTEM;
134 RInstaller *r_installer_proxy = NULL;
135 GError *error = NULL;
(gdb)
136 struct install_context *context = data;
137 g_main_context_push_thread_default(context->loop_context);
138
139 g_debug("Creating RAUC DBUS proxy");
140 r_installer_proxy = r_installer_proxy_new_for_bus_sync(bus_type,
141 G_DBUS_PROXY_FLAGS_GET_INVALIDATED_PROPERTIES,
142 "de.pengutronix.rauc", "/", NULL, &error);
143 if (r_installer_proxy == NULL) {
144 g_printerr("Error creating proxy: %s\n", error->message);
145 g_clear_error(&error);
(gdb) b 140
Breakpoint 1 at 0x15bd4: file /XXX/src/rauc-hawkbit-updater2/src/rauc-installer.c, line 140.
(gdb) r -d -c /XXX/config.conf
Starting program: /usr/bin/rauc-hawkbit-updater -d -c /XXX/config.conf
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
...
"_links" : {
"download" : {
"href" : "https://cdn.eu1.bosch-iot-rollouts.com/<tenant-ID>/<Token Information>"
}
}
}
]
}
]
}
}
MESSAGE: New software ready for download. (Name: phytec0-iota2-bundle, Version: 32, Size: 105440931, URL: https://cdn.eu1.bosch-iot-rollouts.com/<downloadURL-details>)
[New Thread 0xb64d83a0 (LWP 535)]
MESSAGE: Start downloading: https://cdn.eu1.bosch-iot-rollouts.com/<downloadURL-details>
[New Thread 0xb5aff3a0 (LWP 536)]
[Thread 0xb5aff3a0 (LWP 536) exited]
DEBUG: Request body: {"id":"168981","time":"20200402T094208","status":{"result":{"progress":{"of":1,"cnt":3},"finished":"none"},"execution":"proceeding","details":["Download complete. 2.69 MB/s"]}}
[New Thread 0xb5aff3a0 (LWP 537)]
[Thread 0xb5aff3a0 (LWP 537) exited]
DEBUG: Feedback progress status: 200, URL: https://device.eu1.bosch-iot-rollouts.com/<tenant-id>/controller/v1/<controller-id>/deploymentBase/168981/feedback
MESSAGE: Download complete. 2.69 MB/s
MESSAGE: File checksum OK.
DEBUG: Request body: {"id":"168981","time":"20200402T094209","status":{"result":{"progress":{"of":1,"cnt":3},"finished":"none"},"execution":"proceeding","details":["File checksum OK."]}}
[New Thread 0xb5aff3a0 (LWP 538)]
[Thread 0xb5aff3a0 (LWP 538) exited]
DEBUG: Feedback progress status: 200, URL: https://device.eu1.bosch-iot-rollouts.com/<tenant-ID>/controller/v1/<controller-ID>/deploymentBase/168981/feedback
[New Thread 0xb5aff3a0 (LWP 539)]
[Thread 0xb64d83a0 (LWP 535) exited]
DEBUG: Creating RAUC DBUS proxy
[Switching to Thread 0xb5aff3a0 (LWP 539)]
Thread 8 "installer" hit Breakpoint 1, install_loop_thread (data=0xb50fe758)
at /XXX/src/rauc-hawkbit-updater2/src/rauc-installer.c:140
140 r_installer_proxy = r_installer_proxy_new_for_bus_sync(bus_type,
(gdb) s
r_installer_proxy_new_for_bus_sync (bus_type=G_BUS_TYPE_SYSTEM,
flags=G_DBUS_PROXY_FLAGS_GET_INVALIDATED_PROPERTIES,
name=0x1da94 "de.pengutronix.rauc", object_path=0x1da90 "/",
cancellable=0x0, error=0xb5afed80)
at /XXX/src/rauc-hawkbit-updater2/src/rauc-installer-gen.c:1775
1775 ret = g_initable_new (R_TYPE_INSTALLER_PROXY, cancellable, error, "g-flags", flags, "g-name", name, "g-bus-type", bus_type, "g-object-path", object_path, "g-interface-name", "de.pengutronix.rauc.Installer", NULL);
(gdb) s
0x0001ac00 in r_installer_proxy_get_type ()
at /XXX/src/rauc-hawkbit-updater2/src/rauc-installer-gen.c:1364
1364 G_DEFINE_TYPE_WITH_CODE (RInstallerProxy, r_installer_proxy, G_TYPE_DBUS_PROXY,
(gdb) s
r_installer_get_type ()
at /XXX/src/rauc-hawkbit-updater2/src/rauc-installer-gen.c:516
516 G_DEFINE_INTERFACE (RInstaller, r_installer, G_TYPE_OBJECT)
(gdb) info locals
g_define_type_id__volatile = 0
(gdb) p RInstaller
Attempt to use a type name as an expression
(gdb) p r_installer
No symbol "r_installer" in current context.
(gdb) s
[New Thread 0xb4cff3a0 (LWP 542)]
[New Thread 0xb44fe3a0 (LWP 543)]
Thread 10 "gdbus" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb44fe3a0 (LWP 543)]
0xb6d7911c in ?? () from /usr/lib/libglib-2.0.so.0
(gdb) bt
#0 0xb6d7911c in ?? () from /usr/lib/libglib-2.0.so.0
#1 0xb6d7a13c in g_slice_alloc () from /usr/lib/libglib-2.0.so.0
#2 0x00000028 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Hi delhiryder
What version of GLib are you running? GLib's code generator (gdbus-codegen) generates rauc-installer-gen.c from rauc-installer.xml, so please post your rauc-installer-gen.c.
Thanks, Lasse
Dear Lasse,
I'm pretty sure the version is 2.0 (I may be wrong, though).
I was unable to run ldd (not installed), and attempting to execute the .so file itself caused a seg fault, so I ran:
root@phyboard-wega-am335x-2:/# find . -type f -name '*glib*'
and got the following results:
./usr/lib/python3.5/site-packages/gbulb/__pycache__/glib_events.cpython-35.pyc
./usr/lib/python3.5/site-packages/gbulb/glib_events.py
./usr/lib/python3.5/site-packages/setuptools/glibc.py
./usr/lib/python3.5/site-packages/setuptools/__pycache__/glibc.cpython-35.pyc
./usr/lib/libglib-2.0.so.0.5400.3
./usr/lib/libjson-glib-1.0.so.0.400.2
./usr/share/locale/fr/LC_MESSAGES/glib20.mo
./usr/share/locale/fr/LC_MESSAGES/json-glib-1.0.mo
./usr/share/locale/de/LC_MESSAGES/glib20.mo
./usr/share/locale/de/LC_MESSAGES/json-glib-1.0.mo
I am basing my conclusion on the .so file found by the search.
Also, am attaching the -gen file you asked for.
Regards,
delhiryder rauc-installer-gen.c.txt
Good evening,
I was able to determine the version of GLIB we are using on this board - 2.54.3. We even attempted again after removing all compiler optimization from the library, in the hopes that this may fix whatever inherent issue might be showing up.
Unfortunately, we are still getting the same SIGSEGV error (albeit with a slightly different stack trace, which is still corrupted).
Is there a minimum required version of GLIB that you have tested with on your end ?
Regards,
Sidd
[New Thread 0xb59ff3a0 (LWP 440)]
[Thread 0xb59ff3a0 (LWP 440) exited]
DEBUG: Feedback progress status: 200, URL: https://device.eu1.bosch-iot-rollouts.com/<tenant-id>/controller/v1/<controller-id>/deploymentBase/172172/feedback
[New Thread 0xb59ff3a0 (LWP 441)]
DEBUG: Creating RAUC DBUS proxy
[Thread 0xb63ec3a0 (LWP 436) exited]
[New Thread 0xb4bff3a0 (LWP 442)]
[New Thread 0xb43fe3a0 (LWP 443)]
Thread 10 "gdbus" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb43fe3a0 (LWP 443)]
0xb6cc7cec in ?? () from /usr/lib/libglib-2.0.so.0
(gdb) bt
#0 0xb6cc7cec in ?? () from /usr/lib/libglib-2.0.so.0
#1 0xb6cc6bcc in ?? () from /usr/lib/libglib-2.0.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)
@delhiryder If you are still in development, is there any reason to use a GLib version from 2017? If possible, please update your GLib and check whether the issue remains.
And if the error remains, please also build GLib with debugging symbols enabled, or load its debugging symbols into GDB, so we get a useful stack trace for GLib, too.
Regards, Enrico
OK. We'll attempt to update. Am assuming 2.64.2 is a good enough version ?
@delhiryder should be, really. It is even more recent than the latest tag on GitHub... ;) https://github.com/GNOME/glib/releases
Hello,
We ran into some dependency issues (and ran out of time) trying to rebuild glib-2.0. I instead focused on getting the rauc-hawkbit (Python) client working, and am happy to report that it works! (It took some getting used to, and I had to re-install some aiohttp dependencies.)
Will come back to this issue when I have some more time, but it can be closed (for now).
Thanks for all the help!
Regards,
delhiryder
@delhiryder do you use any kind of build systems like Yocto, Buildroot or PTXdist?
@ejoerns yes, we use a customized yocto build from Phytec (which we are in turn customizing).
@delhiryder Then the way to go is to update the BSP, e.g. to zeus. Otherwise, you are using outdated software anyway. This is kind of irresponsible in terms of security and makes you run into issues people solved a long time ago... ;)
We are facing the same problem.
Checking for updates and downloading the bundle works fine, but when the rauc-hawkbit-updater tries to tell RAUC to install the update via D-Bus, we also get a segmentation fault right after "DEBUG: Creating RAUC DBUS proxy".
We are using glib v2.58.3 from Jan 2019 from Poky warrior.
I have the same problem.
After the successful download, and right after "DEBUG: Creating RAUC DBUS proxy", the updater throws a segfault. When debugging with gdb I get this output:
DEBUG: Creating RAUC DBUS proxy
[Thread 0xb66793e0 (LWP 806) exited]
[New Thread 0xb50ff3e0 (LWP 812)]
[New Thread 0xb48fe3e0 (LWP 813)]
Thread 10 "gdbus" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb48fe3e0 (LWP 813)]
0xb6dda116 in slab_allocator_alloc_chunk (chunk_size=chunk_size@entry=40)
at ../glib-2.58.3/glib/gslice.c:1332
warning: Source file is more recent than executable.
1332 chunk = allocator->slab_stack[ix]->chunks;
I am also using warrior and glib v2.58.3.
I am also facing the same issue after moving to the latest rauc-hawkbit-updater commit. Previously, I was on the repo from prevas (0985a7bf72f31c9d17c2687f21c5c682593cff71) and it was OK. So far, by comparing the code, I have not been able to determine which change causes the issue.
@prevas-lkmi @ejoerns If a lot of people have problems with segfaults, should we try to make a stable branch (or a similar approach), so people don't run into these problems?
From my test: in
struct config* load_config_file(const gchar* config_file, GError** error)
removing g_autofree, i.e. changing
g_autofree GKeyFile *ini_file = g_key_file_new(); => GKeyFile *ini_file = g_key_file_new();
avoids the SEGV.
Of course, g_key_file_free() then needs to be called before each return.
BTW, I like the code in general, but I dislike the g_autofree statements on other variables as well. I am afraid some memory leaks come from this.
@hpatriarche I can confirm that this did resolve the error for us too.
I'm trying to figure out what is causing this problem. I can't reproduce it. @hpatriarche and @fshrOSB can you help me out? Please try this and report back.
g_autoptr(GKeyFile) *ini_file = g_key_file_new();
@prevas-lkmi I also do not get the segmentation fault that way.
@prevas-lkmi Same for me, but I got a compilation warning:
warning: passing argument 1 of ‘glib_autoptr_cleanup_GKeyFile’ from incompatible pointer type [-Wincompatible-pointer-types]
g_autoptr(GKeyFile) *ini_file = g_key_file_new();
@prevas-lkmi I did not have time to look into this more deeply, but what @hpatriarche revealed here is a very good finding. I have no clue yet how this is related to where we see the segfault, but this is definitely a bug.
Using g_autofree for a GKeyFile is always wrong, as this simply tells the compiler to use g_free() as the freeing function, while with g_autoptr(GKeyFile) we tell gcc to use the auto-cleanup function registered for this type, which in this case is g_key_file_unref(). Thus I assume we clean things up too early and with the wrong function, and so end up with the problems described.
Maybe I will find some time tomorrow to have a second look at this, but fixing it should be right anyway.
Note that g_autoptr() already declares a pointer, so it is:
- g_autoptr(GKeyFile) *ini_file = g_key_file_new();
+ g_autoptr(GKeyFile) ini_file = g_key_file_new();
@hpatriarche The extra * should be what causes the compile warning noted above.
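For readers following along, here is a minimal sketch of the mechanism described in this comment. It deliberately avoids GLib so it compiles on its own with gcc or clang. DemoKeyFile, demo_autofree, demo_autoptr_keyfile and the live_resources counter are hypothetical stand-ins, not GLib API. They only model the idea that g_autofree attaches a plain "free the pointer" handler via the compiler's cleanup attribute, while g_autoptr() attaches the type's own cleanup function. The sketch does not reproduce the segfault from this issue; it only shows that the plain handler never runs the object's real teardown.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted type such as GKeyFile. */
typedef struct {
    int ref_count;
} DemoKeyFile;

/* Models internal resources an object owns beyond its own struct memory. */
static int live_resources = 0;

static DemoKeyFile *demo_key_file_new(void)
{
    DemoKeyFile *kf = malloc(sizeof *kf);
    assert(kf != NULL);
    kf->ref_count = 1;
    live_resources++;               /* object acquires an internal resource */
    return kf;
}

static void demo_key_file_unref(DemoKeyFile *kf)
{
    if (--kf->ref_count == 0) {
        live_resources--;           /* proper teardown releases it again */
        free(kf);
    }
}

/* Cleanup handlers receive a pointer to the variable leaving scope. */
static void cleanup_plain_free(void *p)          /* models g_autofree */
{
    free(*(void **)p);              /* frees the struct, knows nothing else */
}

static void cleanup_unref(DemoKeyFile **p)       /* models g_autoptr(GKeyFile) */
{
    if (*p != NULL)
        demo_key_file_unref(*p);
}

#define demo_autofree        __attribute__((cleanup(cleanup_plain_free)))
#define demo_autoptr_keyfile __attribute__((cleanup(cleanup_unref)))

/* Returns the number of internal resources still alive after the scope. */
int scope_with_autofree(void)
{
    live_resources = 0;
    {
        demo_autofree DemoKeyFile *kf = demo_key_file_new();
        (void)kf;
    }                               /* cleanup_plain_free runs here */
    return live_resources;
}

int scope_with_autoptr(void)
{
    live_resources = 0;
    {
        demo_autoptr_keyfile DemoKeyFile *kf = demo_key_file_new();
        (void)kf;
    }                               /* cleanup_unref runs here */
    return live_resources;
}
```

Calling scope_with_autofree() leaves one internal resource behind, whereas scope_with_autoptr() leaves none. With a real GKeyFile the mismatch is presumably worse than a leak, since g_free() is not the function the object was designed to be released with.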
@ejoerns agree this is a bug, and g_autofree GKeyFile is wrong. @fshrOSB and @hpatriarche thanks for helping out. Can you test again with
g_autoptr(GKeyFile) ini_file = g_key_file_new();
@prevas-lkmi For me this does also prevent the segmentation fault from occurring. It also gets rid of the warnings @hpatriarche mentioned.
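As a side note on the incompatible-pointer-type warning discussed above, the following sketch models why the extra * matters. It assumes, as a simplification, that g_autoptr(T) expands to a per-type typedef that is already a pointer to T (the real macro additionally attaches the cleanup attribute, which is left out here). DemoObj and demo_autoptr are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Hypothetical opaque type standing in for GKeyFile. */
typedef struct DemoObj DemoObj;

/* Simplified model: the per-type "autoptr" typedef already is a pointer. */
typedef DemoObj *DemoObj_autoptr;
#define demo_autoptr(T) T##_autoptr

/* demo_autoptr(DemoObj) x;   declares x as DemoObj *  */
int autoptr_is_single_pointer(void)
{
    return __builtin_types_compatible_p(demo_autoptr(DemoObj), DemoObj *);
}

/* demo_autoptr(DemoObj) *x;  declares x as DemoObj ** */
int autoptr_star_is_double_pointer(void)
{
    return __builtin_types_compatible_p(demo_autoptr(DemoObj) *, DemoObj **);
}

void check_autoptr_model(void)
{
    assert(autoptr_is_single_pointer());
    assert(autoptr_star_is_double_pointer());
}
```

Under this model, the variant with the extra * declares a pointer to a pointer, so the cleanup handler is handed a pointer type it was not written for. That would be consistent with the warning going away once the * is dropped.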
Hi, We're trying to integrate the rauc-hawkbit-updater service into an embedded SOM for a client. We are using Bosch IOT Rollouts as the hawkbit server, and have been able to successfully download a rauc bundle. Unfortunately, it is not getting beyond this.
Here is a snippet from the console logs (tenant ID and controller id removed below)
Any ideas what might be happening here ?
Thanks,
Sidd (delhiryder)