Closed Kubuxu closed 8 years ago
The line in question reminds me of another assertion failure I "patched". Apparently at some point pinging yourself caused a failure in some other system, and that assertion was rendered unnecessary by some other logic that handled self pings.
It's possible you may just be able to remove this one as well, but I can't say for sure. There's no documentation indicating how it should behave with self pings. A forward-thinking solution would involve documenting this and other assertion failures, and the reason for their existence.
I'll check back in on this later tonight and see what information I can find, but any digging anyone else wants to do will be greatly appreciated.
PS: I was trying to traceroute when I came across that other assertion, and I guess I never really did too much follow-up. Thanks for raising the issue once again. Traceroute should help us figure out a lot of other issues.
Without the assert in line 631 I get: http://hastebin.com/makeloqoxo
Looks like it is not that easy this time.
Interesting! I guess it's time to figure out why nodes can't ping themselves.
One thing: this backtrace might not be connected with anything. I am running those tests on a different machine, i.e. in --nobg mode. This could be the error reporting trying to open a log file or something.
EDIT: Using tools/cjdnslog stops this backtrace from appearing but still shows failed syscall 2.
EDIT2: tools/ping and tools/traceroute both cause a crash.
This backtrace is only from the client exiting:
#0 Assert_failure (format=<optimized out>) at util/Assert.c:40
#1 0x00005555555b2a74 in onCoreExit (exit_status=<optimized out>,
term_signal=<optimized out>) at client/cjdroute2.c:457
#2 0x00005555555c9132 in uv__chld (handle=<optimized out>,
signum=<optimized out>) at ../src/unix/process.c:112
#3 0x00005555555c9b99 in uv__signal_event (loop=0x5555557f25b0,
w=<optimized out>, events=<optimized out>) at ../src/unix/signal.c:386
#4 0x00005555555cfbd4 in uv__io_poll (loop=loop@entry=0x5555557f25b0,
timeout=-1) at ../src/unix/linux-core.c:271
#5 0x00005555555c4fd7 in uv_run (loop=0x5555557f25b0,
mode=mode@entry=UV_RUN_DEFAULT) at ../src/unix/core.c:284
#6 0x000055555555fc75 in EventBase_beginLoop (eventBase=0x5555557f2568)
at util/events/libuv/EventBase.c:83
#7 0x0000555555559ed3 in main (argc=1434415640, argv=0x7fffffffe548)
at client/cjdroute2.c:662
Yep that's probably from the forbidden syscall
Narrowing it down, it is "RouterModule_getPeers(0000.0000.0000.0001)".
Good one.
$ /opt/cjdns/tools/cexec 'RouterModule_getPeers("0000.0000.0000.0001")'
30 seconds later:
1435582847 DEBUG Pinger.c:73 Ping timeout for [2965572845] in [30400] ms
1435582847 DEBUG NodeStore.c:2326 Ping timeout for fc06:c135:28a5:8c0b:dd4e:bcb6:d4d6:c96d@0000.0000.0000.0001. changing reach from 4294967295 to 3758096383
Assertion failure [NodeStore.c:631] [(Node_getBestParent(node) && node != store->pub.selfNode)]
Attempted banned syscall number [2] see doc/Seccomp.md for more information
Core exited with status [0], signal [31]
Backtrace (10 frames):
cjdroute(+0x695a) [0x7f86ce96395a]
cjdroute(+0x5edc4) [0x7f86ce9bbdc4]
cjdroute(+0x75df2) [0x7f86ce9d2df2]
cjdroute(+0x76891) [0x7f86ce9d3891]
cjdroute(+0x7cd94) [0x7f86ce9d9d94]
cjdroute(+0x71b47) [0x7f86ce9ceb47]
cjdroute(+0xbbc5) [0x7f86ce968bc5]
cjdroute(+0x5dcd) [0x7f86ce962dcd]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f86cdf67a40]
cjdroute(+0x6399) [0x7f86ce963399]
Aborted (core dumped)
I assume this is the change that introduced the open() syscall: https://github.com/cjdelisle/cjdns/pull/778 (not that it'd make any difference regarding the self-ping error)
The backtrace we got is unimportant, as it is caused by syscall sandboxing. This is the backtrace of the failed assert (which is supposed to get printed but is not):
#0 Assert_failure (format=format@entry=0x7f5f612000e0 "Assertion failure [%s:%d] [%s]\n") at util/Assert.c:29
#1 0x00007f5f611a6fa1 in handleBadNews (node=0x7f5f61f9edd8, newReach=<optimized out>, store=<optimized out>) at dht/dhtcore/NodeStore.c:631
#2 0x00007f5f611a719d in handleNews (node=0x7f5f61f9edd8, newReach=3758096383, store=0x7f5f61fc6598) at dht/dhtcore/NodeStore.c:653
#3 0x00007f5f611ae084 in NodeStore_pathTimeout (nodeStore=0x7f5f61fc6598, path=140047628124891) at dht/dhtcore/NodeStore.c:2328
#4 0x00007f5f611b0169 in onTimeout (pctx=<optimized out>, milliseconds=<optimized out>) at dht/dhtcore/RouterModule.c:429
#5 onResponseOrTimeout (data=0x7f5f612000e0, milliseconds=38100, vping=0x7f5f620018a8) at dht/dhtcore/RouterModule.c:472
#6 0x00007f5f611af7ef in callback (ping=<optimized out>, data=<optimized out>) at util/Pinger.c:55
#7 timeoutCallback (vping=0x7f5f61ffef68) at util/Pinger.c:74
#8 0x00007f5f611fc771 in uv__run_timers (loop=loop@entry=0x7f5f61f9c2b0) at ../src/unix/timer.c:146
#9 0x00007f5f611f2f72 in uv_run (loop=0x7f5f61f9c2b0, mode=mode@entry=UV_RUN_DEFAULT) at ../src/unix/core.c:275
#10 0x00007f5f6118dc75 in EventBase_beginLoop (eventBase=eventBase@entry=0x7f5f61f9c268) at util/events/libuv/EventBase.c:83
#11 0x00007f5f611da778 in Core_main (argc=<optimized out>, argv=<optimized out>) at admin/angel/Core.c:326
#12 0x00007f5f61187593 in main (argc=3, argv=0x7ffca9303f68) at client/cjdroute2.c:467
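Reading the frames above, the chain appears to be: a ping to the node's own path times out, NodeStore_pathTimeout lowers the reach, and handleBadNews asserts that the node is not the self node. Below is a minimal, self-contained sketch of that failure mode with one possible guard that skips the update instead of asserting. The types and names (`struct node`, `struct store`, `can_take_bad_news`, `path_timeout`) are hypothetical stand-ins for illustration, not cjdns code, and the self node having no best parent is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real NodeStore types; not cjdns code. */
struct node {
    struct node *best_parent; /* NULL when the node has no parent (assumption) */
    uint32_t reach;
};

struct store {
    struct node *self_node;
};

/* Mirrors the condition asserted at NodeStore.c:631 in the log above:
 * the node must have a best parent and must not be the self node. */
static bool can_take_bad_news(const struct store *s, const struct node *n)
{
    return n->best_parent != NULL && n != s->self_node;
}

/* One possible guard: ignore the timeout instead of asserting.
 * Returns true only if the reach was actually lowered. */
static bool path_timeout(struct store *s, struct node *n, uint32_t new_reach)
{
    if (!can_take_bad_news(s, n)) {
        return false; /* self ping or parentless node: nothing to update */
    }
    if (new_reach < n->reach) {
        n->reach = new_reach;
        return true;
    }
    return false;
}
```

With this guard, a timeout reported against the self node leaves its reach untouched, while a timeout against an ordinary peer still lowers the reach as in the log (4294967295 to 3758096383). Whether silently ignoring the self node is the right behaviour for cjdns is exactly the open question in this thread.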
SessionManager.c drops packets from ourself.
diff --git a/net/SessionManager.c b/net/SessionManager.c
index 123ba5d..a5078b8 100644
--- a/net/SessionManager.c
+++ b/net/SessionManager.c
@@ -266,7 +266,7 @@ static Iface_DEFUN incomingFromSwitchIf(struct Message* msg, struct Iface* iface
return NULL;
}
- if (!Bits_memcmp(herKey, sm->cryptoAuth->publicKey, 32)) {
+ if (false && !Bits_memcmp(herKey, sm->cryptoAuth->publicKey, 32)) {
Log_debug(sm->log, "DROP Handshake from 'ourselves'");
return NULL;
}
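For context, the check that this patch disables compares the sender's 32-byte public key with the node's own key. A minimal sketch of that comparison in plain C follows; it uses `memcmp` in place of `Bits_memcmp` (assumed here to behave the same way), and the helpers `is_own_key` and `should_drop_handshake` as well as the `allow_self` switch are hypothetical names for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define KEY_SIZE 32 /* public key length used in the diff above */

/* True when the sender's key is byte-for-byte identical to our own. */
static bool is_own_key(const uint8_t her_key[KEY_SIZE],
                       const uint8_t my_key[KEY_SIZE])
{
    return memcmp(her_key, my_key, KEY_SIZE) == 0;
}

/* Hypothetical policy switch: with allow_self set, handshakes from
 * ourselves are let through, which is what `false &&` does in the patch. */
static bool should_drop_handshake(const uint8_t her_key[KEY_SIZE],
                                  const uint8_t my_key[KEY_SIZE],
                                  bool allow_self)
{
    return !allow_self && is_own_key(her_key, my_key);
}
```

An explicit switch like this keeps the original drop logic readable, whereas the `false &&` in the patch is a quick way to test the hypothesis.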
getPeers seems to work for me now, although the traceroute script gets stuck in infinite loops when I try to traceroute other people.
So I traced through the DHTModules, and requests with the self-interface path do reach this check. After that I am lost again.
We might have to go with a modified version of @Arceliar's solution.
This seems fixed
If you do
./tools/traceroute [self_cjdns_ip]
the cjdns server will crash with the following error in the logs:
System: Debian
Kernel: 2.6/OpenVM
Version: Newest/ 3311d304ddebda9d8eaa1d389905f12cd8990a62