Open bodqhrohro opened 2 years ago
==1643036== 1 errors in context 1 of 9:
==1643036== Invalid read of size 8
==1643036== at 0x11C1AB75: download_retry_alarm (queries.c:3257)
==1643036== by 0x11C00D77: timer_alarm (tgp-timers.c:35)
==1643036== by 0x5490743: g_timeout_dispatch (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==1643036== by 0x548FBE3: g_main_context_dispatch (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==1643036== by 0x548FF87: g_main_context_iterate.constprop.0 (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==1643036== by 0x549003E: g_main_context_iteration (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==1643036== by 0x4C48D90: gtk_main_iteration (in /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0.2400.33)
==1643036== by 0xBA3D654: conversation_created_cb (plugin.c:91)
==1643036== by 0x574D81B: purple_signal_emit_vargs (in /usr/lib/libpurple.so.0.14.8)
==1643036== by 0x574D9AD: purple_signal_emit (in /usr/lib/libpurple.so.0.14.8)
==1643036== by 0x57166DB: purple_conversation_new (in /usr/lib/libpurple.so.0.14.8)
==1643036== by 0x574C86B: serv_got_joined_chat (in /usr/lib/libpurple.so.0.14.8)
==1643036== Address 0x10 is not stack'd, malloc'd or (recently) free'd
==1643036==
==1643036==
==1643036== 1 errors in context 2 of 9:
==1643036== Invalid read of size 8
==1643036== at 0x11C1AB69: download_retry_alarm (queries.c:3257)
==1643036== by 0x11C00D77: timer_alarm (tgp-timers.c:35)
==1643036== by 0x5490743: g_timeout_dispatch (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==1643036== by 0x548FBE3: g_main_context_dispatch (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==1643036== by 0x548FF87: g_main_context_iterate.constprop.0 (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==1643036== by 0x549003E: g_main_context_iteration (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==1643036== by 0x4C48D90: gtk_main_iteration (in /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0.2400.33)
==1643036== by 0xBA3D654: conversation_created_cb (plugin.c:91)
==1643036== by 0x574D81B: purple_signal_emit_vargs (in /usr/lib/libpurple.so.0.14.8)
==1643036== by 0x574D9AD: purple_signal_emit (in /usr/lib/libpurple.so.0.14.8)
==1643036== by 0x57166DB: purple_conversation_new (in /usr/lib/libpurple.so.0.14.8)
==1643036== by 0x574C86B: serv_got_joined_chat (in /usr/lib/libpurple.so.0.14.8)
==1643036== Address 0x19cb7668 is 9,000 bytes inside an unallocated block of size 9,120 in arena "client"
==1643036==
==1643036==
==1643036== 1 errors in context 3 of 9:
==1643036== Invalid read of size 4
==1643036== at 0x11C1AB31: download_retry_alarm (queries.c:3253)
==1643036== by 0x11C00D77: timer_alarm (tgp-timers.c:35)
==1643036== by 0x5490743: g_timeout_dispatch (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==1643036== by 0x548FBE3: g_main_context_dispatch (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==1643036== by 0x548FF87: g_main_context_iterate.constprop.0 (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==1643036== by 0x549003E: g_main_context_iteration (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==1643036== by 0x4C48D90: gtk_main_iteration (in /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0.2400.33)
==1643036== by 0xBA3D654: conversation_created_cb (plugin.c:91)
==1643036== by 0x574D81B: purple_signal_emit_vargs (in /usr/lib/libpurple.so.0.14.8)
==1643036== by 0x574D9AD: purple_signal_emit (in /usr/lib/libpurple.so.0.14.8)
==1643036== by 0x57166DB: purple_conversation_new (in /usr/lib/libpurple.so.0.14.8)
==1643036== by 0x574C86B: serv_got_joined_chat (in /usr/lib/libpurple.so.0.14.8)
==1643036== Address 0x19cb6fc0 is 7,296 bytes inside an unallocated block of size 9,120 in arena "client"
@himselfv, could you take a look please? Seems like this is related to changes in https://github.com/majn/tgl/commit/64dcceac9ae696c482a1f423ee815043d2651d0e, but I'm not sure yet where ri->ev
can be prematurely freed.
@bodqhrohro
It sort of looks like everything else is broken. At 3253, only ri->D->
is dereferenced, and ri
is asserted above, so it must be ri->D
. At 3257, it's
TLS->timer_methods->remove (ri->ev);
But ev
itself is not dereferenced here, and we're not yet in the remove
call, so it's either ri->
(can't be, asserted), or TLS->timer_methods->
, and timer_methods->remove
is exactly $10, so I guess it's that.
Any details on when the crash happened? Logs?
There are no debug logs related to telegram-purple right before in this case, only some usual XMPP/GDK chitchatting. Happens when Pidgin is starting and telegram-purple fetches the messages it missed, possibly on reconnects too.
So apparently, to understand what's happening I had to look at Issue 580 which details the setup and gives more logs.
A connected instance of Pidgin is frozen, then network is disconnected/reconnected, forcing Pidgin to quickly disconnect/reconnect on unfreeze. The disconnection destroys the TLS context, but some delayed download requests linger as timer_alarms for up to 10 seconds, and then fire with the broken TLS pointer.
If this is the reason, it should not happen on initial connect.
==287030== Invalid read of size 1
==287030== at 0x4841CE2: __strlen_sse2 (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==287030== by 0x59E25E7: __vfprintf_internal (vfprintf-internal.c:1647)
==287030== by 0x59F3145: __vsnprintf_internal (vsnprintf.c:114)
==287030== by 0x11C50CE6: vsnprintf (stdio2.h:80)
==287030== by 0x11C50CE6: tgl_snprintf (tools.c:113)
==287030== by 0x11C2E957: download_next_part (queries.c:3323)
==287030== by 0x11C2EB9C: download_retry_alarm (queries.c:3261)
==287030== by 0x11C14D77: timer_alarm (tgp-timers.c:35)
==287030== by 0x54A4743: g_timeout_dispatch (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==287030== by 0x54A3BE3: g_main_context_dispatch (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==287030== by 0x54A3F87: g_main_context_iterate.constprop.0 (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==287030== by 0x54A403E: g_main_context_iteration (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==287030== by 0x4C48D90: gtk_main_iteration (in /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0.2400.33)
==287030== Address 0x1f813160 is 0 bytes inside a block of size 65 free'd
==287030== at 0x483F9AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==287030== by 0x11C46D79: tgl_free_str (tools.h:111)
==287030== by 0x11C46D79: tgl_free_all (structures.c:2607)
==287030== by 0x11C1AAAF: connection_data_free (tgp-structs.c:135)
==287030== by 0x5725D69: _purple_connection_destroy (in /usr/lib/libpurple.so.0.14.8)
==287030== by 0x5711E6D: purple_account_disconnect (in /usr/lib/libpurple.so.0.14.8)
==287030== by 0x5724F15: purple_connection_disconnect_cb (in /usr/lib/libpurple.so.0.14.8)
==287030== by 0x54A4743: g_timeout_dispatch (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==287030== by 0x54A3BE3: g_main_context_dispatch (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==287030== by 0x54A3F87: g_main_context_iterate.constprop.0 (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==287030== by 0x54A403E: g_main_context_iteration (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==287030== by 0x4C48D90: gtk_main_iteration (in /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0.2400.33)
==287030== by 0xBA51654: conversation_created_cb (plugin.c:91)
==287030== Block was alloc'd at
==287030== at 0x483E77F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==287030== by 0x11C5065F: tgl_alloc_release (tools.c:204)
==287030== by 0x11C50ED5: tgl_strdup (tools.c:217)
==287030== by 0x11C4C131: tgl_set_download_directory (tgl.c:54)
==287030== by 0x11C1888F: tgprpl_login (telegram-purple.c:548)
==287030== by 0x165968: do_signon (in /usr/bin/pidgin)
==287030== by 0x54A4743: g_timeout_dispatch (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==287030== by 0x54A3BE3: g_main_context_dispatch (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==287030== by 0x54A3F87: g_main_context_iterate.constprop.0 (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==287030== by 0x54A403E: g_main_context_iteration (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7000.2)
==287030== by 0x4C48D90: gtk_main_iteration (in /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0.2400.33)
==287030== by 0xBA51654: conversation_created_cb (plugin.c:91)
To fix this, I guess download_retry code should keep a list of all outstanding download_retry timers and cancel all of them on TLS* destruction (tgl_free_all).
Or maybe TLS should just cancel ALL timers created via TLS->timer_methods, on its destruction? I mean, all of them receive the TLS pointer, so none can be safely called after it's destroyed. Summoning @EionRobb, if they even care about this project anymore.
Or maybe TLS* should just cancel ALL timers created via TLS->timer_methods, on its destruction?
Prpls written by @EionRobb themself do, as a rule of thumb. That's what made me sure that corruptions happen in some other place before the timers there are fired, BTW. I had polluted numerous bug trackers for years because of this before finding the reason (time to close all of that, probably).
Yay! Glad you found it :)
@bodqhrohro so the fix is to move to tdlib-purple? :)
@bodqhrohro so the fix is to move to tdlib-purple? :)
Nope, I told already that I just rolled back to the tgl revision before the patch. It wasn't working ideally anyway, as I still was getting lots of non-downloaded pictures.
Another use-after-free, seems like uncleared callbacks are at guilt there too: pidgin-vg14.txt.gz