Segfault in NodeStore #52

Closed by clehner 8 years ago

clehner commented 9 years ago

in a05ade40dc31caebaf3aa770aac3ab2ecb02d867 (master):

1432430838 DEBUG NodeStore.c:748 Linking [fc7c:6025:dce5:af5b:7a3f:8343:b581:c851] with [fcbf:7bbc:32e4:0716:bd00:e936:c927:fc14] with label fragment [0000.0000.0000.001f]
1432430838 DEBUG NodeStore.c:129 link[fc7c:6025:dce5:af5b:7a3f:8343:b581:c851]->[fc24:7863:1f09:b03c:03fd:a861:96e8:038b] [0000.0000.0000.0b2f] Splitting link
1432430838 DEBUG NodeStore.c:1011 discoverLinkC( [fc24:7863:1f09:b03c:03fd:a861:96e8:038b]->[fc24:7863:1f09:b03c:03fd:a861:96e8:038b] [0000.0000.0000.0001] )

Program received signal SIGSEGV, Segmentation fault.
handleNews (node=0x0, newReach=129589530, store=0x5555557ee988) at dht/dhtcore/NodeStore.c:651
651         if (newReach < Node_getReach(node)) {
(gdb) bt
#0  handleNews (node=0x0, newReach=129589530, store=0x5555557ee988) at dht/dhtcore/NodeStore.c:651
#1  0x000055555557c2cd in NodeStore_discoverNode (nodeStore=0x5555557ee988, addr=0x19, scheme=0x7fffffffda48,
    inverseLinkEncodingFormNumber=1434722840, milliseconds=93824995490936) at dht/dhtcore/NodeStore.c:1572
#2  0x0000555555581d57 in onResponseOrTimeout (data=0xffffffffffffffff, milliseconds=28, vping=0x555555871898)
    at dht/dhtcore/RouterModule.c:514
#3  0x0000555555581b3b in callback (ping=0x555555849cd8, data=0x55555587a268) at util/Pinger.c:55
#4  Pinger_pongReceived (data=data@entry=0x55555587a268, pinger=<optimized out>) at util/Pinger.c:167
#5  0x0000555555581fc1 in handleIncoming (message=0x7fffffffdd30, vcontext=0x5555557ecd28) at dht/dhtcore/RouterModule.c:459
#6  0x0000555555573d52 in DHTModuleRegistry_handleIncoming (message=0x7fffffffdd30, registry=<optimized out>) at dht/DHTModuleRegistry.c:63
#7  0x000055555558b17d in incomingMsg (pf=<optimized out>, msg=<optimized out>) at dht/Pathfinder.c:385
#8  incomingFromEventIf (msg=0x55555584b838, eventIf=0x5555558099d8) at dht/Pathfinder.c:415
#9  0x000055555559e614 in Iface_send (msg=0x55555584b838, iface=0x555555809d08) at ./interface/Iface.h:69
#10 timeoutTrigger (vASynchronizer=0x555555809d08) at interface/ASynchronizer.c:69
#11 0x00005555555c8681 in uv__run_timers (loop=loop@entry=0x5555557eb2b0) at ../src/unix/timer.c:146
#12 0x00005555555bedd2 in uv_run (loop=0x5555557eb2b0, mode=mode@entry=UV_RUN_DEFAULT) at ../src/unix/core.c:275
#13 0x000055555555fad5 in EventBase_beginLoop (eventBase=eventBase@entry=0x5555557eb268) at util/events/libuv/EventBase.c:83
#14 0x00005555555ab9d8 in Core_main (argc=<optimized out>, argv=<optimized out>) at admin/angel/Core.c:326
#15 0x0000555555559493 in main (argc=3, argv=0x7fffffffe658) at client/cjdroute2.c:462
(gdb)

ghost commented 9 years ago

Are you sure you're on the referenced commit? Because with that commit, the assertion should be <= instead of <.

clehner commented 9 years ago

@lgierth the line that failed is a different line than the one with the assertion that was changed in that commit. https://github.com/hyperboria/cjdns/blob/a05ade40dc31caebaf3aa770aac3ab2ecb02d867/dht/dhtcore/NodeStore.c#L651

ghost commented 9 years ago

Ah I see! Then I guess try changing it there too ;)

clehner commented 9 years ago

The segfault is caused by trying to access node->reach_pvt in Node_getReach when node is null. I could add a null check in handleNews or Node_getReach, but I don't know what it means for it to be null in the first place. Is that expected or does it indicate a bug somewhere else?

Going up the stack, the null node passed to handleNews is the result of link->child in NodeStore_discoverNode. link is returned from discoverLink. I can't check other fields of link because gdb says the value is optimized out.

The only places where ->child is assigned to a Node_Link are linkNodes and NodeStore_unlinkNodes. So I think either discoverLink is returning something bad, or the link is being used after being unlinked.
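
For illustration, here is a minimal sketch of the null check discussed above. It is not the cjdns code: struct Node and struct Node_Link are cut down to the single fields named in this thread, reachWouldDrop is a hypothetical helper that stands in for the comparison at NodeStore.c:651, and the assert in Node_getReach is an assumption about how the precondition could be made explicit. A guard like this would only hide the symptom if the real cause is a link used after unlinking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins (assumption): the real cjdns structs have many more fields. */
struct Node {
    uint32_t reach_pvt;
};

struct Node_Link {
    struct Node* child;
};

/* Accessor named in the thread. The assert is an added assumption: it turns a
 * null dereference into an explicit failure in debug builds. */
uint32_t Node_getReach(const struct Node* node)
{
    assert(node != NULL);
    return node->reach_pvt;
}

/* Hypothetical helper: performs the same comparison as NodeStore.c:651, but
 * refuses to touch a link whose child is missing.
 * Returns 1 if newReach is lower than the node's current reach,
 *         0 if it is not,
 *        -1 if link or link->child is NULL (e.g. a link that was unlinked). */
int reachWouldDrop(const struct Node_Link* link, uint32_t newReach)
{
    if (link == NULL || link->child == NULL) {
        return -1;
    }
    return (newReach < Node_getReach(link->child)) ? 1 : 0;
}
```

A caller could treat -1 as "do not call handleNews" and log the link for debugging, which would also help answer whether a null child is ever expected.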

ghost commented 8 years ago

Reach has been replaced by the cost metric, so I assume this segfault is gone.