vysheng / tgl

telegram library
GNU Lesser General Public License v2.1
197 stars 224 forks source link

Crash when network failure while sending a message #103

Open BenWiederhake opened 8 years ago

BenWiederhake commented 8 years ago

Steps to reproduce:

Expected behavior: Not a crash

Actual behavior and gdb backtrace:

*** Error in `/usr/bin/pidgin': double free or corruption (!prev): 0x000055555677d4f0 ***

Program received signal SIGABRT, Aborted.
0x00007ffff449acc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56  ../nptl/sysdeps/unix/sysv/linux/raise.c: Datei oder Verzeichnis nicht gefunden.
(gdb) bt full
#0  0x00007ffff449acc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
[SNIP]
#1  0x00007ffff449e0d8 in __GI_abort () at abort.c:89
[SNIP]
#2  0x00007ffff44d7394 in __libc_message (do_abort=do_abort@entry=1,
[SNIP]
#3  0x00007ffff44e366e in malloc_printerr (ptr=<optimized out>, str=0x7ffff45e5c10 "double free or
[SNIP]
#4  _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:3840
[SNIP]
#5  0x00007fffd5deede8 in tree_act_ex_message (T=<optimized out>, act=act@entry=0x7fffd5df29e0 <tgls_free_message_gw>, 
    extra=extra@entry=0x5555565d0c50) at structures.c:84
No locals.
#6  0x00007fffd5df670e in tgl_free_all (TLS=0x5555565d0c50) at structures.c:2471
        i = <optimized out>
#7  0x00007fffd5dd225f in connection_data_free (conn=0x555556516250) at tgp-structs.c:125
No locals.
#8  0x00007ffff4da6446 in _purple_connection_destroy () from /usr/lib/libpurple.so.0
No symbol table info available.

Thus, structures.c:2471 tries to do free a message in TLS->message_unsent_tree that already was freed.

Note that this bug was discovered in https://github.com/majn/tgl/commit/bec66c25d52368695e10625f956de5250a2d836d , but there have been no changes to TLS->message_unsent_tree since then.

BenWiederhake commented 8 years ago

Added a few debug messages. Only the arrows are added afterwards:

(14:45:03) prpl-telegram: Begin tgl_free_all
(14:45:03) prpl-telegram: freeing TLS->message_tree
(14:45:03) prpl-telegram: freeing message at 0x55555673c540
(14:45:03) prpl-telegram: freeing message at 0x555556715bc0
(14:45:03) prpl-telegram: freeing message at 0x55555673c250
(14:45:03) prpl-telegram: freeing message at 0x55555673bb30
(14:45:03) prpl-telegram: freeing message at 0x55555673b640
(14:45:03) prpl-telegram: freeing message at 0x555556715950
(14:45:03) prpl-telegram: freeing message at 0x55555685dc00 <---------
(14:45:03) prpl-telegram: freeing message at 0x5555566f1250
(14:45:03) prpl-telegram: freeing message at 0x5555566c7690
(14:45:03) prpl-telegram: freeing message at 0x55555673c3c0
[SNIP]
(14:45:03) prpl-telegram: freeing message at 0x5555566f0910
(14:45:03) prpl-telegram: freeing message at 0x55555673d7e0
(14:45:03) prpl-telegram: freeing message at 0x555556715a70
(14:45:03) prpl-telegram: freeing message at 0x5555566e5af0
(14:45:03) prpl-telegram: freeing message at 0x5555566f0c00
(14:45:03) prpl-telegram: freeing TLS->message_unsent_tree
(14:45:03) prpl-telegram: freeing message at 0x55555685dc00 <---------

So in the event of a network failure (which is momentarily happening quite often as Telegram has a few issues), a messages is not properly moved from the message_unsent_tree to the message_tree.

BenWiederhake commented 8 years ago

It seems to be caused in bl_do_edit:

(17:11:16) prpl-telegram: begin bl_do_edit_message
(17:11:16) prpl-telegram: bl_do_edit_message: flags=0x00014303
(17:11:16) prpl-telegram: tglm_message_alloc M=0x55c8b0d41bc0
(17:11:16) prpl-telegram: tglm_message_insert_tree(M=0x55c8b0d41bc0)
(17:11:16) prpl-telegram: bl_do_edit_message: M->flags=0x00000000
(17:11:16) prpl-telegram: bl_do_edit_message: **NOT** removing unsent
(17:11:16) prpl-telegram: bl_do_edit_message: inserting as normal message
(17:11:16) prpl-telegram: end bl_do_edit_message
[...]
(17:11:20) prpl-telegram: tgprpl_close()
(17:11:20) prpl-telegram: Begin tgl_free_all
(17:11:20) prpl-telegram: freeing TLS->message_tree
(17:11:20) prpl-telegram: freeing message at 0x55c8b0c264a0
(17:11:20) prpl-telegram: freeing message at 0x55c8b0bfb180
(17:11:20) prpl-telegram: freeing message at 0x55c8b0c25d80
(17:11:20) prpl-telegram: freeing message at 0x55c8b0c24bb0
(17:11:20) prpl-telegram: freeing message at 0x55c8b0c24040
(17:11:20) prpl-telegram: freeing message at 0x55c8b0d41bc0 <----------
(17:11:20) prpl-telegram: freeing message at 0x55c8b0bfadf0
(17:11:20) prpl-telegram: freeing message at 0x55c8b0c23cb0
[...]
(17:11:20) prpl-telegram: freeing message at 0x55c8b0bd6ac0
(17:11:20) prpl-telegram: freeing message at 0x55c8b0c22e60
(17:11:20) prpl-telegram: freeing TLS->message_unsent_tree
(17:11:20) prpl-telegram: freeing message at 0x55c8b0d41bc0 <----------
*** Error in `pidgin': double free or corruption (!prev): 0x000055c8b0d41bc0 ***
Aborted (core dumped)

So there's a message in both messagres_tree and messages_unsent_tree. Compared with a "normal" execution, this seems to be business as usual:

(17:17:33) prpl-telegram: begin bl_do_edit_message
(17:17:33) prpl-telegram: bl_do_edit_message: flags=0x00014303
(17:17:33) prpl-telegram: tglm_message_alloc M=0x5582941b2d20
(17:17:33) prpl-telegram: tglm_message_insert_tree(M=0x5582941b2d20)
(17:17:33) prpl-telegram: bl_do_edit_message: M->flags=0x00000000
(17:17:33) prpl-telegram: bl_do_edit_message: **NOT** removing unsent
(17:17:33) prpl-telegram: bl_do_edit_message: inserting as normal message
(17:17:33) prpl-telegram: end bl_do_edit_message
(17:17:33) prpl-telegram: Sent query #6273431346611120128 of size 56 to DC 2
(17:17:33) prpl-telegram: tglm_message_insert_tree(M=0x5582941b2d20)
(17:17:33) prpl-telegram: begin bl_do_edit_message
(17:17:33) prpl-telegram: bl_do_edit_message: flags=0x00004103
(17:17:33) prpl-telegram: bl_do_edit_message: M->flags=0x00004103
(17:17:33) prpl-telegram: bl_do_edit_message: **NOT** removing unsent
(17:17:33) prpl-telegram: bl_do_edit_message: **NOT** inserting as normal message
(17:17:33) prpl-telegram: end bl_do_edit_message
(17:17:33) prpl-telegram: sending all pending recipes
(17:17:33) prpl-telegram: tgl_do_mark_read (115630664)
(17:17:33) prpl-telegram: Sent query #6273431346869932032 of size 28 to DC 2
(17:17:33) prpl-telegram: wrote state file: wpts=70999 wqts=0 wseq=102 wdate=1460647037
(17:17:33) prpl-telegram: wrote secret chat file: 0 chats written.
(17:17:38) prpl-telegram: update 0x1bfbd823 (check=0)
(17:17:39) prpl-telegram: tgprpl_close()
(17:17:39) prpl-telegram: Begin tgl_free_all
(17:17:39) prpl-telegram: freeing TLS->message_tree
(17:17:39) prpl-telegram: freeing message at 0x5582943903d0
(17:17:39) prpl-telegram: freeing message at 0x558294374f70
(17:17:39) prpl-telegram: freeing message at 0x55829438fcb0
(17:17:39) prpl-telegram: freeing message at 0x55829438eab0
(17:17:39) prpl-telegram: freeing message at 0x5582941b2d20 <----------
[...]
(17:17:39) prpl-telegram: freeing message at 0x55829438bce0
(17:17:39) prpl-telegram: freeing TLS->message_unsent_tree
user@machine:~/telegram-purple$ 

However, tgl_free_all is not aware of this duplicity, meaning it tries to free the messages again.

One could make a dirty workaround where tgl_free_all not only frees each message in messages_tree, but also removes it from messages_unsent_tree (... before freeing the message, obviously).

However, I'd rather fix the underlying logic. But in order to do so, I'd need to know what the intended behavior is, and what the lifecycle of a message is supposed to be. Help? @majn

BenWiederhake commented 8 years ago

Implemented custom workaround: branch commit