kensanata / bitlbee-mastodon

A Mastodon plugin for Bitlbee
https://alexschroeder.ch/software/Bitlbee_Mastodon
GNU General Public License v2.0

Segfault when mastodon closes the stream #7

Closed: iguanaonmystack closed this issue 6 years ago

iguanaonmystack commented 6 years ago

I'm getting on really well with the Mastodon plugin, but unfortunately every time Mastodon closes the stream (e.g. "mastodon - Error: Stream closed (200 OK)"), bitlbee segfaults.

BitlBee-3.5.1+20171123+master+30-g4a9c6b0-git

Mastodon from git at 4a0262752105eb094a8f5ecfc2708f5b7f9c4e64 (HEAD of master at the time of writing)

My current workaround is running bitlbee in valgrind, which prevents the segfault.

sudo -u bitlbee valgrind --log-file=/tmp/valgrind.log /usr/sbin/bitlbee -Dnv 

Valgrind output:

==1262== Memcheck, a memory error detector
==1262== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==1262== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==1262== Command: /usr/sbin/bitlbee -Dnv
==1262== Parent PID: 1261
==1262== 
==1262== Invalid read of size 4
==1262==    at 0x13BE34: ssl_disconnect (ssl_gnutls.c:443)
==1262==    by 0x1359D2: http_close (http_client.c:700)
==1262==    by 0x13548C: http_incoming_data (http_client.c:304)
==1262==    by 0x1342DE: b_event_passthrough (events_libevent.c:143)
==1262==    by 0x57CB253: event_base_loop (in /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5.1.7)
==1262==    by 0x133EFE: b_main_run (events_libevent.c:84)
==1262==    by 0x11D5DE: main (unix.c:182)
==1262==  Address 0x78323b8 is 24 bytes inside a block of size 56 free'd
==1262==    at 0x4C2987C: free (vg_replace_malloc.c:473)
==1262==    by 0x13543C: http_incoming_data (http_client.c:287)
==1262==    by 0x1342DE: b_event_passthrough (events_libevent.c:143)
==1262==    by 0x57CB253: event_base_loop (in /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5.1.7)
==1262==    by 0x133EFE: b_main_run (events_libevent.c:84)
==1262==    by 0x11D5DE: main (unix.c:182)
==1262== 
[repeats]

Is there anything else I can provide to help debug this?

kensanata commented 6 years ago

Hm. This reminds me of an earlier issue I had back when I was still trying to get the Mastodon code into Bitlbee itself. I eventually closed it because things seemed to work well enough. Sadly, I found no way to solve it when I looked at it last year. Patches welcome. Anyway, the key seems to be that memory can get freed twice, or that it can get freed even though there are still connections sending data. This is truly the dark side of C and I'm not very familiar with the best approach to fixing these issues. Sorry. :(
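To make the "freed twice, or freed while still in use" pattern concrete, here is a minimal sketch of one common defence: defer the free while a handler is still running. It is an illustration only, not BitlBee's http_client code; `struct request`, `request_close`, `request_dispatch` and the `live_requests` counter are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical request object; NOT BitlBee's struct http_request. */
struct request {
    int  fd;           /* stand-in for the socket/SSL handle */
    bool in_callback;  /* true while the data handler is running */
    bool close_wanted; /* a close was requested during the handler */
};

int live_requests = 0; /* counts live allocations, for the demo only */

struct request *request_new(int fd)
{
    struct request *req = calloc(1, sizeof(*req));
    assert(req != NULL);
    req->fd = fd;
    live_requests++;
    return req;
}

/* Free the request, or only mark it if a handler is still using it. */
void request_close(struct request *req)
{
    if (req->in_callback) {
        req->close_wanted = true; /* the dispatcher frees it later */
        return;
    }
    live_requests--;
    free(req);
}

/* Run the handler, then perform any deferred close.
 * Returns true if the request is still alive afterwards. */
bool request_dispatch(struct request *req, void (*handler)(struct request *))
{
    req->in_callback = true;
    handler(req);
    req->in_callback = false;
    if (req->close_wanted) {
        request_close(req); /* exactly one free, after the last use */
        return false;
    }
    return true;
}

/* Example handler that asks to close the stream twice. */
void handler_closes(struct request *req)
{
    request_close(req);
    request_close(req); /* harmless: this only marks the request again */
}

/* Example handler that leaves the request open. */
void handler_keeps(struct request *req)
{
    (void)req;
}
```

With this shape, a handler can ask for a close as often as it likes and the memory is released once, after the dispatcher has finished touching the request. Whether something like this fits the real code depends on who owns the request there, which this sketch does not try to answer.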

iguanaonmystack commented 6 years ago

I've been taking a look at this and I'm currently running my forked version with a few changes to it.

I think the problem is a race condition between the service being disconnected and some of the callbacks coming back. So I've added some protective code and I'll see how that copes after the next mastodon disconnect. I'm not entirely sure why mastodon suffers from this while twitter doesn't -- is mastodon making additional calls?
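For illustration, the kind of protective code I mean looks roughly like the sketch below: each callback first checks that the connection it was issued for is still registered as live. This is a simplified stand-in and not the actual patch; `struct connection`, the `live` registry and `on_reply` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CONNECTIONS 16

/* Hypothetical connection type; stands in for the per-account state. */
struct connection {
    int messages_seen;
};

/* Registry of connections that are currently logged in. */
struct connection *live[MAX_CONNECTIONS];

bool connection_register(struct connection *c)
{
    for (size_t i = 0; i < MAX_CONNECTIONS; i++) {
        if (live[i] == NULL) {
            live[i] = c;
            return true;
        }
    }
    return false; /* registry full */
}

void connection_unregister(struct connection *c)
{
    for (size_t i = 0; i < MAX_CONNECTIONS; i++) {
        if (live[i] == c) {
            live[i] = NULL;
        }
    }
}

bool connection_is_live(const struct connection *c)
{
    if (c == NULL) {
        return false;
    }
    for (size_t i = 0; i < MAX_CONNECTIONS; i++) {
        if (live[i] == c) {
            return true;
        }
    }
    return false;
}

/* Callback for a finished request. `data` is the connection pointer that
 * was stored when the request was sent. Returns true if it was handled. */
bool on_reply(void *data)
{
    struct connection *c = data;
    if (!connection_is_live(c)) {
        return false; /* account went away while the request was in flight */
    }
    c->messages_seen++;
    return true;
}
```

The check only compares pointer values, so it never dereferences a connection that has been torn down. One caveat: if a new connection happens to be allocated at the same address, a stale callback would be mistaken for a live one, so this is a guard rather than a complete fix.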

kensanata commented 6 years ago

Yes, I think it does. I don't quite remember whether the Twitter code uses the streaming API or not. Perhaps it uses polling? The Mastodon plugin uses the Streaming API and opens one connection for the default timeline and one more for every hashtag channel, the local timeline, and the federated timeline (if you join those group chats/channels). Look for all the requests that end up specifying mastodon_http_stream as a callback. They all set req->flags |= HTTPC_STREAMING and add themselves to md->streams.
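As a rough picture of that bookkeeping, here is a small self-contained sketch of an account that tracks its open streams in a list, so that closing a stream also removes the pointer to it. It does not use the plugin's real types: `struct stream`, `struct account`, `FLAG_STREAMING` and the functions are hypothetical stand-ins for the request objects, md->streams and HTTPC_STREAMING.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define FLAG_STREAMING 0x1 /* hypothetical stand-in for HTTPC_STREAMING */

struct stream {
    unsigned flags;
    struct stream *next;
};

struct account {
    struct stream *streams; /* hypothetical stand-in for md->streams */
};

/* Open a stream: mark it as streaming and add it to the account's list. */
struct stream *stream_open(struct account *acc)
{
    struct stream *s = calloc(1, sizeof(*s));
    assert(s != NULL);
    s->flags |= FLAG_STREAMING;
    s->next = acc->streams;
    acc->streams = s;
    return s;
}

/* Unlink and free a stream. Returns 1 if it was tracked, 0 if it was not
 * in the list; an untracked pointer is never dereferenced or freed. */
int stream_close(struct account *acc, struct stream *s)
{
    for (struct stream **p = &acc->streams; *p != NULL; p = &(*p)->next) {
        if (*p == s) {
            *p = s->next;
            free(s);
            return 1;
        }
    }
    return 0;
}

size_t stream_count(const struct account *acc)
{
    size_t n = 0;
    for (const struct stream *s = acc->streams; s != NULL; s = s->next) {
        n++;
    }
    return n;
}

/* On logout, close every stream that is still tracked. */
void account_logout(struct account *acc)
{
    while (acc->streams != NULL) {
        stream_close(acc, acc->streams);
    }
}
```

The point of routing every close through the list is that a stream is freed only while it is still tracked, so a second close request for the same stream finds nothing to free.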

At the time I thought that Twitter was simply not dropping the connection as often as my Mastodon instances and that is why the race condition was triggered. I hardly ever noticed bitlbee crashing, but then again, I run it as a user directly and maybe I just never saw it.

iguanaonmystack commented 6 years ago

I think (part of) the problem was a couple of use-after-free bugs in some of the http callbacks. I've further patched my fork and am now running that code. I'll hold off submitting a pull request until the bee can survive a mastodon drop :)

🐘 ☂️ 🐝

kensanata commented 6 years ago

Sounds good to me. Thanks for looking into it.

kensanata commented 6 years ago

I'm ready for patches. :)

iguanaonmystack commented 6 years ago

Mastodon hasn't actually disconnected me since my latest HEAD so I can't say whether the issue is fixed yet. I'm reasonably sure I've sorted out the current latest traceback, but I don't know if more are waiting in store. I suppose more testers would be useful; I can open a pull request if you like? :)

kensanata commented 6 years ago

Sure, please do. Also, if the code quality is simply better, that's also going to be a win.

iguanaonmystack commented 6 years ago

Okay, I've opened pull request #11

kensanata commented 6 years ago

Thanks!

dequis commented 6 years ago

FWIW today I got

Finishing HTTP request with status: 200 OK
*** Error in `/home/dx/bitlbee/bitlbee': double free or corruption (!prev): 0x000055555694e380 ***

Program received signal SIGABRT, Aborted.
0x00007ffff5ce58a0 in raise () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff5ce58a0 in raise () at /usr/lib/libc.so.6
#1  0x0000000000000000 in  ()

Not very useful, but inspecting the garbage on the stack shows mastodon reconnection messages, so I guess it's this. I was running the code from when this was a bitlbee PR, so outdated as heck.

Any reason this isn't closed?

kensanata commented 6 years ago

Note merge request #12?

kensanata commented 6 years ago

I think we can close this now? Nothing new seems to have popped up.