rauc / rauc-hawkbit-updater

The RAUC hawkBit updater is a simple commandline tool/daemon that runs on your target and interfaces between RAUC and hawkBit's DDI API.
https://rauc-hawkbit-updater.readthedocs.io
GNU Lesser General Public License v2.1

Thread 10 "gdbus" received SIGSEGV. Segmentation Fault #21

Closed delhiryder closed 4 years ago

delhiryder commented 4 years ago

Hi, we're trying to integrate the rauc-hawkbit-updater service into an embedded SOM for a client. We are using Bosch IoT Rollouts as the hawkBit server and have been able to download a RAUC bundle successfully. Unfortunately, it does not get beyond this point.

Here is a snippet from the console logs (tenant ID and controller id removed below)

MESSAGE: Download complete. 2.12 MB/s
MESSAGE: File checksum OK.
DEBUG: Request body: {"id":"168548","time":"20200401T121811","status":{"result":{"progress":{"of":1,"cnt":3},"finished":"none"},"execution":"proceeding","details":["File checksum OK."]}}

DEBUG: Feedback progress status: 200, URL: https://device.eu1.bosch-iot-rollouts.com/<tenant-id>/controller/v1/<controller-id>/deploymentBase/168548/feedback
DEBUG: Creating RAUC DBUS proxy

[17634.526359] audit: type=1701 audit(1585743725.596:169): auid=4294967295 uid=993 gid=990 ses=4294967295 pid=856 comm="gdbus" exe="/usr/bin/rauc-hawkbit-updater" sig=11 res=1
Segmentation fault (core dumped)

Any ideas what might be happening here ?

Thanks,

Sidd (delhiryder)

delhiryder commented 4 years ago

btw, running a simple (manual) rauc install on the downloaded rauc bundle works fine.

Let me know what (other) information I can share to help you debug.

Thanks,

Sidd

ejoerns commented 4 years ago

@delhiryder If you could build it with debugging symbols enabled and attach gdb to generate a backtrace of the segfault, we could possibly narrow down the problem.

I would have assumed that there is an unhandled failure path when the rauc service fails to start, but if the installation works fine when you invoke it manually, then it must be something else.

So far, I have not been able to reproduce it.

Regards, Enrico

ejoerns commented 4 years ago

Not closed. Just used the wrong button...

delhiryder commented 4 years ago

Hello,

As you suggested, we built rauc-hawkbit-updater with debug symbols for the target and ran the whole application under GDB. Here is the stdout log of the session. I was able to get to line 516 in rauc-installer-gen.c before the thread exited.

(Note: Sensitive information has been removed below).

Please let me know if you need any more information.

Regards,

Sidd

(gdb) l install_loop_thread
126  * Install mainloop running until installation completes.
127  * @param[in] data pointer to a install_context struct.
128  * @return NULL is always returned.
129  */
130 static gpointer install_loop_thread(gpointer data)
131 {
132         GBusType bus_type = (!g_strcmp0(g_getenv("DBUS_STARTER_BUS_TYPE"), "session"))
133                             ? G_BUS_TYPE_SESSION : G_BUS_TYPE_SYSTEM;
134         RInstaller *r_installer_proxy = NULL;
135         GError *error = NULL;
(gdb) 
136         struct install_context *context = data;
137         g_main_context_push_thread_default(context->loop_context);
138 
139         g_debug("Creating RAUC DBUS proxy");
140         r_installer_proxy = r_installer_proxy_new_for_bus_sync(bus_type,
141                                                                G_DBUS_PROXY_FLAGS_GET_INVALIDATED_PROPERTIES,
142                                                                "de.pengutronix.rauc", "/", NULL, &error);
143         if (r_installer_proxy == NULL) {
144                 g_printerr("Error creating proxy: %s\n", error->message);
145                 g_clear_error(&error);
(gdb) b 140
Breakpoint 1 at 0x15bd4: file /XXX/src/rauc-hawkbit-updater2/src/rauc-installer.c, line 140.
(gdb) r -d -c /XXX/config.conf
Starting program: /usr/bin/rauc-hawkbit-updater -d -c /XXX/config.conf
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
...
            "_links" : {
              "download" : {
                "href" : "https://cdn.eu1.bosch-iot-rollouts.com/<tenant-ID>/<Token Information>"
              }
            }
          }
        ]
      }
    ]
  }
}

MESSAGE: New software ready for download. (Name: phytec0-iota2-bundle, Version: 32, Size: 105440931, URL: https://cdn.eu1.bosch-iot-rollouts.com/<downloadURL-details>)
[New Thread 0xb64d83a0 (LWP 535)]
MESSAGE: Start downloading: https://cdn.eu1.bosch-iot-rollouts.com/<downloadURL-details>
[New Thread 0xb5aff3a0 (LWP 536)]
[Thread 0xb5aff3a0 (LWP 536) exited]
DEBUG: Request body: {"id":"168981","time":"20200402T094208","status":{"result":{"progress":{"of":1,"cnt":3},"finished":"none"},"execution":"proceeding","details":["Download complete. 2.69 MB/s"]}}

[New Thread 0xb5aff3a0 (LWP 537)]
[Thread 0xb5aff3a0 (LWP 537) exited]
DEBUG: Feedback progress status: 200, URL: https://device.eu1.bosch-iot-rollouts.com/<tenant-id>/controller/v1/<controller-id>/deploymentBase/168981/feedback
MESSAGE: Download complete. 2.69 MB/s
MESSAGE: File checksum OK.
DEBUG: Request body: {"id":"168981","time":"20200402T094209","status":{"result":{"progress":{"of":1,"cnt":3},"finished":"none"},"execution":"proceeding","details":["File checksum OK."]}}

[New Thread 0xb5aff3a0 (LWP 538)]
[Thread 0xb5aff3a0 (LWP 538) exited]
DEBUG: Feedback progress status: 200, URL: https://device.eu1.bosch-iot-rollouts.com/<tenant-ID>/controller/v1/<controller-ID>/deploymentBase/168981/feedback
[New Thread 0xb5aff3a0 (LWP 539)]
[Thread 0xb64d83a0 (LWP 535) exited]
DEBUG: Creating RAUC DBUS proxy
[Switching to Thread 0xb5aff3a0 (LWP 539)]

Thread 8 "installer" hit Breakpoint 1, install_loop_thread (data=0xb50fe758)
    at /XXX/src/rauc-hawkbit-updater2/src/rauc-installer.c:140
140         r_installer_proxy = r_installer_proxy_new_for_bus_sync(bus_type,
(gdb) s
r_installer_proxy_new_for_bus_sync (bus_type=G_BUS_TYPE_SYSTEM, 
    flags=G_DBUS_PROXY_FLAGS_GET_INVALIDATED_PROPERTIES, 
    name=0x1da94 "de.pengutronix.rauc", object_path=0x1da90 "/", 
    cancellable=0x0, error=0xb5afed80)
    at /XXX/src/rauc-hawkbit-updater2/src/rauc-installer-gen.c:1775
1775      ret = g_initable_new (R_TYPE_INSTALLER_PROXY, cancellable, error, "g-flags", flags, "g-name", name, "g-bus-type", bus_type, "g-object-path", object_path, "g-interface-name", "de.pengutronix.rauc.Installer", NULL);
(gdb) s
0x0001ac00 in r_installer_proxy_get_type ()
    at /XXX/src/rauc-hawkbit-updater2/src/rauc-installer-gen.c:1364
1364    G_DEFINE_TYPE_WITH_CODE (RInstallerProxy, r_installer_proxy, G_TYPE_DBUS_PROXY,
(gdb) s
r_installer_get_type ()
    at /XXX/src/rauc-hawkbit-updater2/src/rauc-installer-gen.c:516
516 G_DEFINE_INTERFACE (RInstaller, r_installer, G_TYPE_OBJECT)
(gdb) info locals
g_define_type_id__volatile = 0
(gdb) p RInstaller
Attempt to use a type name as an expression
(gdb) p r_installer
No symbol "r_installer" in current context.
(gdb) s
[New Thread 0xb4cff3a0 (LWP 542)]
[New Thread 0xb44fe3a0 (LWP 543)]

Thread 10 "gdbus" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb44fe3a0 (LWP 543)]
0xb6d7911c in ?? () from /usr/lib/libglib-2.0.so.0
(gdb) bt
#0  0xb6d7911c in ?? () from /usr/lib/libglib-2.0.so.0
#1  0xb6d7a13c in g_slice_alloc () from /usr/lib/libglib-2.0.so.0
#2  0x00000028 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

prevas-lkmi commented 4 years ago

Hi delhiryder

What version of GLib are you running? rauc-installer-gen.c is generated by GLib from rauc-installer.xml, so please post your rauc-installer-gen.c.

Thanks, Lasse

delhiryder commented 4 years ago

Dear Lasse,

I'm pretty sure the version is 2.0 (I may be wrong, though).

I was unable to run ldd (not installed), and attempting to execute the .so file itself caused a seg fault, so I ran:

root@phyboard-wega-am335x-2:/# find . -type f -name '*glib*'

and got the following results:

./usr/lib/python3.5/site-packages/gbulb/__pycache__/glib_events.cpython-35.pyc
./usr/lib/python3.5/site-packages/gbulb/glib_events.py
./usr/lib/python3.5/site-packages/setuptools/glibc.py
./usr/lib/python3.5/site-packages/setuptools/__pycache__/glibc.cpython-35.pyc
./usr/lib/libglib-2.0.so.0.5400.3
./usr/lib/libjson-glib-1.0.so.0.400.2
./usr/share/locale/fr/LC_MESSAGES/glib20.mo
./usr/share/locale/fr/LC_MESSAGES/json-glib-1.0.mo
./usr/share/locale/de/LC_MESSAGES/glib20.mo
./usr/share/locale/de/LC_MESSAGES/json-glib-1.0.mo

I am basing my conclusion on the .so file found by the search.

Also, I am attaching the -gen file you asked for.

Regards,

delhiryder

rauc-installer-gen.c.txt
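
The file name found by the search, libglib-2.0.so.0.5400.3, carries more than the "2.0" API series. As a rough sketch, assuming GLib's usual library naming for stable releases (the middle number is 100 times the minor version and the last number is the micro version), the name can be decoded as below. The function name glib_version_from_soname() is made up for this illustration, and the rule is an assumption about stable releases only, not something GLib exposes as an API.

```c
#include <assert.h>
#include <stdio.h>

/* Decode "libglib-2.0.so.<n>.<age>.<revision>" into minor/micro numbers.
 * Assumption: stable-release naming, where age == 100 * minor and
 * revision == micro. Returns 0 on success, -1 if the name does not match. */
int glib_version_from_soname(const char *name, int *minor, int *micro)
{
    int sover, age, revision;

    assert(name != NULL && minor != NULL && micro != NULL);

    /* Expects the bare file name, without any directory part. */
    if (sscanf(name, "libglib-2.0.so.%d.%d.%d", &sover, &age, &revision) != 3)
        return -1;

    *minor = age / 100;
    *micro = revision;
    return 0;
}
```

For the file name above this yields minor 54 and micro 3, i.e. GLib 2.54.3.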

delhiryder commented 4 years ago

Good evening,

I was able to determine the version of GLib we are using on this board: 2.54.3. We also tried again after disabling all compiler optimization for the library, in the hope that this might fix whatever underlying issue is showing up.

Unfortunately, we are still getting the same SIGSEGV error (albeit with a slightly different stack trace, which is still corrupted).

Is there a minimum required version of GLib that you have tested with on your end?

Regards,

Sidd

[New Thread 0xb59ff3a0 (LWP 440)]
[Thread 0xb59ff3a0 (LWP 440) exited]
DEBUG: Feedback progress status: 200, URL: https://device.eu1.bosch-iot-rollouts.com/<tenant-id>/controller/v1/<controller-id>/deploymentBase/172172/feedback
[New Thread 0xb59ff3a0 (LWP 441)]
DEBUG: Creating RAUC DBUS proxy
[Thread 0xb63ec3a0 (LWP 436) exited]
[New Thread 0xb4bff3a0 (LWP 442)]
[New Thread 0xb43fe3a0 (LWP 443)]

Thread 10 "gdbus" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb43fe3a0 (LWP 443)]
0xb6cc7cec in ?? () from /usr/lib/libglib-2.0.so.0
(gdb) bt
#0  0xb6cc7cec in ?? () from /usr/lib/libglib-2.0.so.0
#1  0xb6cc6bcc in ?? () from /usr/lib/libglib-2.0.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

ejoerns commented 4 years ago

@delhiryder If you are still in development, is there any reason to use a GLib version from 2017? If possible, please update GLib and check whether the issue remains.

If the error remains, please also build GLib with debugging symbols enabled, or load its debugging symbols into GDB, so that we get a useful stack trace for GLib, too.

Regards, Enrico

delhiryder commented 4 years ago

OK. We'll attempt to update. I am assuming 2.64.2 is a recent enough version?

ejoerns commented 4 years ago

@delhiryder should be, really. It is even more recent than the latest tag on GitHub... ;) https://github.com/GNOME/glib/releases

delhiryder commented 4 years ago

Hello,

We ran into some dependency issues (and ran out of time) trying to rebuild glib-2.0. I instead focused on getting the rauc-hawkbit (Python) client working, and am happy to report that it works! (It took some getting used to, and I had to re-install some aiohttp dependencies.)

I will come back to this issue when I have some more time, but it can be closed (for now).

Thanks for all the help!

Regards,

delhiryder

ejoerns commented 4 years ago

@delhiryder Do you use a build system such as Yocto, Buildroot or PTXdist?

delhiryder commented 4 years ago

@ejoerns Yes, we use a customized Yocto build from Phytec (which we are in turn customizing).

ejoerns commented 4 years ago

@delhiryder Then the way to go is to update the BSP, e.g. to zeus. Otherwise, you are using outdated software anyway. This is irresponsible in terms of security and makes you run into issues people solved a long time ago... ;)

fshrOSB commented 4 years ago

We are facing the same problem.

Checking for updates and downloading the bundle works fine, but when the rauc-hawkbit-updater tries to inform RAUC to install the update via D-Bus we also get a Segmentation fault right after DEBUG: Creating RAUC DBUS proxy.

We are using glib v2.58.3 from Jan 2019 from Poky warrior.

ctenruh-phytec commented 4 years ago

I have the same problem.

After the successful download, right after DEBUG: Creating RAUC DBUS proxy, the updater segfaults. When debugging with gdb I get this output:

DEBUG: Creating RAUC DBUS proxy
[Thread 0xb66793e0 (LWP 806) exited]
[New Thread 0xb50ff3e0 (LWP 812)]
[New Thread 0xb48fe3e0 (LWP 813)]

Thread 10 "gdbus" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb48fe3e0 (LWP 813)]
0xb6dda116 in slab_allocator_alloc_chunk (chunk_size=chunk_size@entry=40)
    at ../glib-2.58.3/glib/gslice.c:1332
warning: Source file is more recent than executable.
1332      chunk = allocator->slab_stack[ix]->chunks;

I am also using warrior and glib v2.58.3.

hpatriarche commented 4 years ago

I am also facing the same issue after moving to the latest rauc-hawkbit-updater commit. Previously, I was on the repo from prevas (0985a7bf72f31c9d17c2687f21c5c682593cff71) and it was OK. So far, comparing the code, I have not been able to determine which change causes the issue.

Raphexion commented 4 years ago

@prevas-lkmi @ejoerns If a lot of people have problems with segfaults, should we try to make a stable branch (or take a similar approach), so that people don't run into these problems?

hpatriarche commented 4 years ago

From my tests: in struct config* load_config_file(const gchar* config_file, GError** error), removing g_autofree, i.e. changing

g_autofree GKeyFile *ini_file = g_key_file_new(); => GKeyFile *ini_file = g_key_file_new();

avoids the SEGV.

Of course, g_key_file_free() then needs to be called before each return.
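
A minimal sketch of that manual-cleanup variant, for illustration only. It assumes a single exit label so that the free call exists once instead of before every return. KeyFile, key_file_new(), key_file_free(), load_config_sketch() and key_files_still_alive() are hypothetical stand-ins that let the example compile without GLib; they are not the project's code or GLib's API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for GKeyFile and its constructor/destructor. */
typedef struct { char *path; } KeyFile;

static int live_key_files = 0;      /* objects allocated but not yet freed */

static KeyFile *key_file_new(void)
{
    KeyFile *kf = calloc(1, sizeof *kf);
    assert(kf != NULL);
    live_key_files++;
    return kf;
}

static void key_file_free(KeyFile *kf)
{
    if (kf == NULL)
        return;
    free(kf->path);
    free(kf);
    live_key_files--;
}

/* Hypothetical loader: returns 0 on success, -1 on error. Every error
 * path jumps to the same label, so no return can skip the free. */
int load_config_sketch(const char *config_file)
{
    int ret = -1;
    KeyFile *ini_file = key_file_new();

    if (config_file == NULL || config_file[0] == '\0')
        goto out;                   /* early failure still reaches the free */

    ini_file->path = malloc(strlen(config_file) + 1);
    if (ini_file->path == NULL)
        goto out;
    strcpy(ini_file->path, config_file);

    ret = 0;
out:
    key_file_free(ini_file);
    return ret;
}

/* Reports how many KeyFile objects are currently allocated. */
int key_files_still_alive(void) { return live_key_files; }
```

After any number of successful or failing calls to load_config_sketch(), key_files_still_alive() should report 0, which is the property the manual approach has to maintain by hand on every path.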

BTW, I like the code in general, but I dislike the use of g_autofree on other variables as well. I am afraid some memory leaks come from it.

fshrOSB commented 4 years ago

@hpatriarche I can confirm that this did resolve the error for us too.

prevas-lkmi commented 4 years ago

I'm trying to figure out what is causing this problem. I can't reproduce it. @hpatriarche and @fshrOSB can you help me out? Please try this and report back.

g_autoptr(GKeyFile) *ini_file = g_key_file_new();

fshrOSB commented 4 years ago

@prevas-lkmi I also do not get the segmentation fault that way.

hpatriarche commented 4 years ago

@prevas-lkmi Same for me, but I got a compilation warning:

warning: passing argument 1 of ‘glib_autoptr_cleanup_GKeyFile’ from incompatible pointer type [-Wincompatible-pointer-types]
         g_autoptr(GKeyFile) *ini_file = g_key_file_new();

ejoerns commented 4 years ago

@prevas-lkmi I did not have time to look into this more deeply, but what @hpatriarche found here is a very good finding. I have no clue yet how this relates to where we see the segfault, but it is definitely a bug.

Using g_autofree for a GKeyFile should always be wrong, as it simply tells the compiler to use g_free() as the freeing function, while with g_autoptr(GKeyFile) we tell gcc to use the auto-cleanup function registered for this type, which in this case is g_key_file_unref(). Thus I assume we clean things up too early and with the wrong function, and therefore end up with the problems described.

Maybe I will find some time tomorrow to have a second look at this, but fixing it should be right anyway.

Note that g_autoptr() already declares a pointer, so the change is:

- g_autoptr(GKeyFile) *ini_file = g_key_file_new();
+ g_autoptr(GKeyFile) ini_file = g_key_file_new();

@hpatriarche The extra * should be what causes the compile warning noted above.
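
To illustrate the difference described above, here is a small sketch built directly on the GCC/Clang cleanup attribute, which is the mechanism behind both macros. It is only an imitation under stated assumptions: KeyFile, key_file_new(), key_file_unref(), AUTO_FREE and AUTO_PTR() are hypothetical stand-ins so that the example compiles without GLib, and the stand-in object uses plain malloc(), so the wrong cleanup here merely skips the inner release rather than crashing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a ref-counted type such as GKeyFile. */
typedef struct {
    int   ref_count;
    char *data;                 /* inner allocation owned by the object */
} KeyFile;

static int inner_frees = 0;     /* how many inner buffers were released */

static KeyFile *key_file_new(void)
{
    KeyFile *kf = calloc(1, sizeof *kf);
    assert(kf != NULL);
    kf->ref_count = 1;
    kf->data = malloc(16);
    assert(kf->data != NULL);
    return kf;
}

static void key_file_unref(KeyFile *kf)
{
    if (kf != NULL && --kf->ref_count == 0) {
        free(kf->data);
        inner_frees++;
        free(kf);
    }
}

/* A cleanup handler receives the ADDRESS of the annotated variable. */
static void cleanup_generic_free(void *p) { free(*(void **)p); }
static void cleanup_KeyFile(KeyFile **p)  { key_file_unref(*p); }

/* Simplified imitations of g_autofree and g_autoptr(). */
typedef KeyFile *KeyFile_autoptr;
#define AUTO_FREE   __attribute__((cleanup(cleanup_generic_free)))
#define AUTO_PTR(T) __attribute__((cleanup(cleanup_##T))) T##_autoptr

/* Generic free: only the outer struct is released at scope exit. */
int inner_frees_with_auto_free(void)
{
    int before = inner_frees;
    char *orphan;
    {
        AUTO_FREE KeyFile *kf = key_file_new();
        orphan = kf->data;      /* remember what free(kf) will not release */
    }
    free(orphan);               /* keep the demo itself leak-free */
    return inner_frees - before;
}

/* Typed cleanup: key_file_unref() runs at scope exit. There is no '*' in
 * the declaration: AUTO_PTR(KeyFile) already is a pointer type, so adding
 * one would declare a KeyFile ** and mismatch the handler's argument. */
int inner_frees_with_auto_ptr(void)
{
    int before = inner_frees;
    {
        AUTO_PTR(KeyFile) kf = key_file_new();
        assert(kf->ref_count == 1);
    }
    return inner_frees - before;
}
```

In this sketch inner_frees_with_auto_free() returns 0, because the type's own release function never runs, while inner_frees_with_auto_ptr() returns 1.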

prevas-lkmi commented 4 years ago

@ejoerns I agree this is a bug, and g_autofree GKeyFile is wrong. @fshrOSB and @hpatriarche, thanks for helping out. Can you test again with:

g_autoptr(GKeyFile) ini_file = g_key_file_new();

fshrOSB commented 4 years ago

@prevas-lkmi For me this does also prevent the segmentation fault from occurring. It also gets rid of the warnings @hpatriarche mentioned.