servalproject / serval-dna

The Serval Project's core daemon that implements Distributed Numbering Architecture (DNA), MDP, VoMP, Rhizome, MeshMS, etc.
http://servalproject.org

crashing serval when listing conversations #124

Closed by gh0st42 6 years ago

gh0st42 commented 6 years ago

A message was sent using the RESTful interface. Listing conversations, either via the RESTful interface or via the command line, then leads to a crash. This is with the most recent serval-dna version from git.

So far the conversation was one way: Node A -> Node B

Node B can list conversations and read.

Node A just crashes.

Prior to that I accidentally sent a message to my own SID via REST. After removing the database and blobs dir from Node A, Node B syncs back the same conversation and Node A can display it. Sending to self should not corrupt the database; it should either throw an error or just work.

Output from command line tool:

servald: meshms.c:487: write_conversation: Assertion `conv->metadata.their_size >= conv->metadata.their_last_message' failed.
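The failed assertion requires `their_size` to be at least `their_last_message`. A minimal Python sketch of the same invariant as a recoverable check rather than an abort follows. It is not the serval-dna code: the `ConversationMetadata` class and `check_metadata` function are hypothetical, and only the two field names are taken from the assertion text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConversationMetadata:
    # Hypothetical mirror of the two fields named in the assertion message.
    their_size: int
    their_last_message: int


def check_metadata(meta: ConversationMetadata) -> Optional[str]:
    """Return None if the invariant holds, otherwise a description of the violation."""
    if meta.their_size >= meta.their_last_message:
        return None
    return "their_last_message (%d) exceeds their_size (%d)" % (
        meta.their_last_message,
        meta.their_size,
    )
```

A caller could log the returned description and skip the affected conversation instead of aborting the whole daemon.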

Here the corresponding serval.log entries:

FATAL:[ 3096] 15:39:32.282 [httpd/10] servald_main.c:61:crash_handler()  Caught signal SIGABRT (6) Aborted
FATAL:[ 3096] 15:39:32.282 [httpd/10] servald_main.c:62:crash_handler()  The following clue may help: no clue
FATAL:[ 3096] 15:39:32.282 [httpd/10] performance_timing.c:227:dump_stack()  http_request_receive
FATAL:[ 3096] 15:39:32.282 [httpd/10] performance_timing.c:227:dump_stack()  http_server_poll
FATAL:[ 3096] 15:39:32.282 [httpd/10] performance_timing.c:227:dump_stack()  call_alarm
FATAL:[ 3096] 15:39:32.282 [httpd/10] performance_timing.c:227:dump_stack()  fd_poll2
FATAL:[ 3096] 15:39:32.282 [httpd/10] performance_timing.c:227:dump_stack()  server
FATAL:[ 3096] 15:39:32.282 [httpd/10] performance_timing.c:227:dump_stack()  app_server_start
FATAL:[ 3096] 15:39:32.282 [httpd/10] performance_timing.c:227:dump_stack()  cli_invoke
FATAL:[ 3096] 15:39:32.282 [httpd/10] performance_timing.c:227:dump_stack()  commandline_main
FATAL:[ 3096] 15:39:32.282 [httpd/10] servald_main.c:64:crash_handler()  GDB BACKTRACE

gh0st42 commented 6 years ago

Just saw that after the resync I got two conversations:

{u'header': [u'_id', u'my_sid', u'their_sid', u'read', u'last_message', u'read_offset'], 
u'rows': [
[0, u'7929BCAAA61F80ADD6227E04C477CAB96B5A7EE2BD965A7CC74105830B7C7465', u'7929BCAAA61F80ADD6227E04C477CAB96B5A7EE2BD965A7CC74105830B7C7465', False, 19, 0], 
[1, u'7929BCAAA61F80ADD6227E04C477CAB96B5A7EE2BD965A7CC74105830B7C7465', u'CFF7493114DF7A9C0A3E07F960E6CF644ECF6AA62EB1787029CD7E130DD2B065', False, 43, 0]
]}

The first one is with myself and the other is with the remote peer! But after the resync it seems to run stably.
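A client can spot such self-addressed conversations by comparing the `my_sid` and `their_sid` columns of the listing. The sketch below assumes the listing has the `header`/`rows` shape shown above. The helper name `self_conversations` is made up for illustration, and the data is copied from the output above.

```python
def self_conversations(listing):
    """Return the rows whose local and remote SID are identical."""
    header = listing["header"]
    my_col = header.index("my_sid")
    their_col = header.index("their_sid")
    return [row for row in listing["rows"] if row[my_col] == row[their_col]]


listing = {
    "header": ["_id", "my_sid", "their_sid", "read", "last_message", "read_offset"],
    "rows": [
        [0,
         "7929BCAAA61F80ADD6227E04C477CAB96B5A7EE2BD965A7CC74105830B7C7465",
         "7929BCAAA61F80ADD6227E04C477CAB96B5A7EE2BD965A7CC74105830B7C7465",
         False, 19, 0],
        [1,
         "7929BCAAA61F80ADD6227E04C477CAB96B5A7EE2BD965A7CC74105830B7C7465",
         "CFF7493114DF7A9C0A3E07F960E6CF644ECF6AA62EB1787029CD7E130DD2B065",
         False, 43, 0],
    ],
}

# Collect the _id of every conversation addressed to ourselves.
self_ids = [row[0] for row in self_conversations(listing)]
print(self_ids)  # prints [0]
```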

lakeman commented 6 years ago

Talking to yourself is a little silly and unlikely to ever be useful, but avoiding the situation that led to the crash is simple enough. We just need to ensure that we always treat our half of the conversation as "ours" and not "theirs".

You might need to purge your private conversation bundle to avoid hitting the assertion, which is still there.
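The rule of always treating our half of the conversation as "ours" can be sketched as follows. This is an illustration only, not the change made in serval-dna: the function `classify_ply` and its parameters are hypothetical. The point is the order of the checks, so that a self-conversation (where both SIDs are equal) still resolves to "ours".

```python
def classify_ply(author_sid, my_sid, their_sid):
    """Label one half of a conversation by who authored it."""
    # Check our own identity first: if my_sid == their_sid, the half we
    # wrote must still count as "ours" and never as "theirs".
    if author_sid == my_sid:
        return "ours"
    if author_sid == their_sid:
        return "theirs"
    raise ValueError("author is not a party to this conversation")
```

With the checks in the opposite order, a message sent to one's own SID would be labelled "theirs", which is the kind of mix-up described above.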