ntop / ntopng

Web-based Traffic and Security Network Traffic Monitoring
http://www.ntop.org
GNU General Public License v3.0

Occasional SIGSEGVs in normal operation #398

Closed sthen closed 8 years ago

sthen commented 8 years ago

I'm running ntopng-2.2 on OpenBSD (working on a port). After running for a while (it varies, sometimes minutes, sometimes hours), ntopng exits with a SIGSEGV. I've got a backtrace and debug output from one such crash. Any ideas? Note that OpenBSD's malloc has some hardening, including use-after-free detection. Thanks.

Core was generated by `ntopng'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  strchr () at /usr/src/lib/libc/arch/amd64/string/strchr.S:58
58              movq    (%rdi),%rax     /* bytes to check (x) */
[Current thread is 1 (process 1191)]
(gdb) bt full
#0  strchr () at /usr/src/lib/libc/arch/amd64/string/strchr.S:58
No locals.
#1  0x000017da66e26541 in std::strchr (
    __s1=0x17dd37a57f80 "\201]5:::{\"name\":\"event\",\"args\":[{\"type\":\"2\",\"y\":0.6662790697674419,\"id\":\"heicmBUT4R-IRDr7s4Em\"}]}y\244\233*7D\031\026\227\345\036\347-\274\236\305\247\227%P\302:\303\256\365~\305)\203\247\031\205\263"<error: Cannot access memory at address 0x17dd37a58000>, __n=32) at /usr/include/g++/cstring:108
No locals.
#2  0x000017da66e26b37 in HTTPStats::incResponse (this=0x17dd63d72b00, 
    return_code=0x17dd37a57f80 "\201]5:::{\"name\":\"event\",\"args\":[{\"type\":\"2\",\"y\":0.6662790697674419,\"id\":\"heicmBUT4R-IRDr7s4Em\"}]}y\244\233*7D\031\026\227\345\036\347-\274\236\305\247\227%P\302:\303\256\365~\305)\203\247\031\205\263"<error: Cannot access memory at address 0x17dd37a58000>) at src/HTTPStats.cpp:206
        code = 0x17dce70f4a80 "\340J\017\347\334\027"
#3  0x000017da66e5dd09 in Flow::dissectHTTP (this=0x17dd34b28800, 
    src2dst_direction=false, 
    payload=0x17dd37a57f80 "\201]5:::{\"name\":\"event\",\"args\":[{\"type\":\"2\",\"y\":0.6662790697674419,\"id\":\"heicmBUT4R-IRDr7s4Em\"}]}y\244\233*7D\031\026\227\345\036\347-\274\236\305\247\227%P\302:\303\256\365~\305)\203\247\031\205\263"<error: Cannot access memory at address 0x17dd37a58000>, payload_len=95) at src/Flow.cpp:1737
        space = 0x17da66e5e9ac <Flow::updateInterfaceStats(bool, unsigned int, unsigned int)+190> "\311\303UH\211\345H\203\354 H\211}\350\211U\340@\210u\344H\213}\350\350.\234\002"
        h = 0x17dd63d72b00
#4  0x000017da66e561b5 in NetworkInterface::packetProcessing (this=0x17dcafa6d610, 
    when=0x17dd4bebb3e8, time=1455558814823, eth=0x17dd37a57f3e, vlan_id=0, 
    iph=0x17dd37a57f4c, ip6=0x0, ipsize=147, rawsize=161, h=0x17dd4bebb3e8, 
    packet=0x17dd37a57f3e "h[5\204\302\211", a_shaper_id=0x17dce70f4fb4, 
    b_shaper_id=0x17dce70f4fb0) at src/NetworkInterface.cpp:826
        ndpi_flow = 0x0
        dump_is_unknown = false
        src2dst_direction = false
        l4_proto = 6 '\006'
        flow = 0x17dd34b28800
        eth_src = 0x17dd37a57f44 ""
        eth_dst = 0x17dd37a57f3e "h[5\204\302\211"
        src_ip = {addr = {ipVersion = 4 '\004', localHost = 0 '\000', 
            privateIP = 0 '\000', multicastIP = 0 '\000', broadcastIP = 0 '\000', 
            notUsed = 0 '\000', ipType = {ipv6 = {u6_addr = {
                  u6_addr8 = "_U%\201", '\000' <repeats 11 times>, u6_addr16 = {21855, 
                    33061, 0, 0, 0, 0, 0, 0}, u6_addr32 = {2166707551, 0, 0, 0}}}, 
              ipv4 = 2166707551}}, ip_key = 1599415681}
        dst_ip = {addr = {ipVersion = 4 '\004', localHost = 0 '\000', 
            privateIP = 1 '\001', multicastIP = 0 '\000', broadcastIP = 0 '\000', 
            notUsed = 0 '\000', ipType = {ipv6 = {u6_addr = {
                  u6_addr8 = "\n\017\005\036", '\000' <repeats 11 times>, u6_addr16 = {
                    3850, 7685, 0, 0, 0, 0, 0, 0}, u6_addr32 = {503648010, 0, 0, 0}}}, 
              ipv4 = 503648010}}, ip_key = 168756510}
        src_port = 23569
        dst_port = 7882
        payload_len = 95
        tcph = 0x17dd37a57f60
        udph = 0x0
        l4_packet_len = 127
        l4 = 0x17dd37a57f60 "\021\\\312\036\277\311\326\355\066a8\260\200\030"
        tcp_flags = 24 '\030'
        payload = 0x17dd37a57f80 "\201]5:::{\"name\":\"event\",\"args\":[{\"type\":\"2\",\"y\":0.6662790697674419,\"id\":\"heicmBUT4R-IRDr7s4Em\"}]}y\244\233*7D\031\026\227\345\036\347-\274\236\305\247\227%P\302:\303\256\365~\305)\203\247\031\205\263"<error: Cannot access memory at address 0x17dd37a58000>
        ip = 0x17dd37a57f4c "E\004"
        is_fragment = false
        new_flow = false
        pass_verdict = true
#5  0x000017da66e57938 in NetworkInterface::packet_dissector (this=0x17dcafa6d610, 
    h=0x17dd4bebb3e8, packet=0x17dd37a57f3e "h[5\204\302\211", 
    a_shaper_id=0x17dce70f4fb4, b_shaper_id=0x17dce70f4fb0)
    at src/NetworkInterface.cpp:1294
        frag_off = 16384
        iph = 0x17dd37a57f4c
        ip6 = 0x0
        srcHost = 0x17dc7aad3000
        dstHost = 0x17dd18458000
        ethernet = 0x17dd37a57f3e
        dummy_ethernet = {h_dest = "&\263\334\027\000", 
          h_source = "1\307\304\252\347", h_proto = 0}
        time = 1455558814823
        eth_type = 2048
        ip_offset = 14
        vlan_id = 0
        eth_offset = 0
        res = 1000
        null_type = 6108
        pcap_datalink_type = 1
        pass_verdict = true
        lasttime = 1455558814823
        oom_warning_sent = false
        oom_warning_sent = false
#6  0x000017da66e18a0c in packetPollLoop (ptr=0x17dcafa6d610)
    at src/PcapInterface.cpp:183
        a = 0
        b = 0
        pkt = 0x17dd37a57f3e "h[5\204\302\211"
        hdr = 0x17dd4bebb3e8
        rc = 1
        iface = 0x17dcafa6d610
        pd = 0x17dd4bebb200
        pcap_list = 0x0
#7  0x000017dd1d0d080e in _rthread_start (v=0x0)
    at /usr/src/lib/librthread/rthread.c:145
        retval = <optimized out>
#8  0x000017dd0c4fa52b in __tfork_thread ()
    at /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:75
No locals.
#9  0x0000000000000000 in ?? ()
No symbol table info available.

ntopng-crash-debugoutput.txt
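
One possible reading of the trace above, offered as an assumption and not confirmed anywhere in this thread: `payload` is a raw packet buffer (`payload_len=95`) with no guaranteed NUL terminator, and `strchr` is searching for a space (`__n=32`) that the JSON-like body never contains, so the scan runs off the end of the mapped page at 0x17dd37a58000. A minimal sketch of a length-bounded alternative follows. The helper names `find_in_payload` and `parse_status_code` are hypothetical and are not taken from ntopng.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not ntopng code): look for `c` only within the
// first `len` bytes of `buf`, so a missing NUL terminator cannot cause
// a read past the end of the packet.
static const char *find_in_payload(const char *buf, std::size_t len, char c) {
  if (buf == nullptr || len == 0) return nullptr;
  return static_cast<const char *>(std::memchr(buf, c, len));
}

// Hypothetical helper: extract the three-digit status code that follows
// the first space (as in "HTTP/1.1 200 OK"), never reading beyond `len`.
// Returns -1 when the payload does not look like an HTTP response line.
static int parse_status_code(const char *payload, std::size_t len) {
  const char *space = find_in_payload(payload, len, ' ');
  if (space == nullptr) return -1;

  // Bytes still available after the space.
  std::size_t remaining = len - static_cast<std::size_t>(space + 1 - payload);
  if (remaining < 3) return -1;

  int code = 0;
  for (std::size_t i = 0; i < 3; i++) {
    char ch = space[1 + i];
    if (ch < '0' || ch > '9') return -1;
    code = code * 10 + (ch - '0');
  }
  return code;
}
```

With a payload like the one in the backtrace (no space within the first 95 bytes), `parse_status_code` would return -1 instead of scanning on into unmapped memory. Whether this matches the actual cause or the eventual fix in git is not established here.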

lucaderi commented 8 years ago

Can you please first try the code in git rather than 2.2? Please report whether you can reproduce the problem with it.

sthen commented 8 years ago

I'll try to do this, but devel doesn't build directly on OpenBSD; I'll need to figure out some bpf_timeval vs timeval pieces first.

I have hit it again btw; it was a JSON-like HTTP payload this time too.

lucaderi commented 8 years ago

Please send us a patch once you figure out the compilation issue.

lucaderi commented 8 years ago

@sthen Any news?

sthen commented 8 years ago

I've got the dev code running on my laptop with a modified kernel + libpcap and have not seen a problem yet, but I need to get a router running this code to give it a better workout. I'll try to do this over the next few days and let you know how it goes.

I'm going to try to get things changed re bpf_timeval in OpenBSD because this is a common problem. It all stems back to the time when libpcap dumps weren't compatible between 32/64-bit arches; OpenBSD fixed it earlier than libpcap but did so in a different way. I need to dig out a machine from storage to finish my diff though, as I tried it a couple of years ago and it broke on sparc64.
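
For readers hitting the same build problem, here is a rough sketch of one way application code could sidestep the type mismatch: copy the timestamp field by field instead of assigning one struct to the other. It assumes, as I understand OpenBSD's headers, that `struct bpf_timeval` carries two 32-bit fields named `tv_sec` and `tv_usec`. The `compact_timeval` stand-in and the `to_timeval` helper are hypothetical names used only so the example compiles anywhere; this is not the patch discussed in this thread.

```cpp
#include <cstdint>
#include <sys/time.h>

// Stand-in for a capture timestamp with two 32-bit fields (assumed to
// resemble OpenBSD's struct bpf_timeval). Hypothetical type.
struct compact_timeval {
  std::uint32_t tv_sec;
  std::uint32_t tv_usec;
};

// Hypothetical helper: build a struct timeval from any timestamp type
// that exposes tv_sec and tv_usec, whatever their widths are.
template <typename Ts>
static inline struct timeval to_timeval(const Ts &ts) {
  struct timeval tv;
  tv.tv_sec = static_cast<decltype(tv.tv_sec)>(ts.tv_sec);
  tv.tv_usec = static_cast<decltype(tv.tv_usec)>(ts.tv_usec);
  return tv;
}
```

Because the helper is a template, the same call compiles whether the capture header's timestamp is already a `struct timeval` or a narrower type, which avoids per-platform `#ifdef`s at each call site.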

lucaderi commented 8 years ago

Closing for inactivity. Should you have news, please contact us.