grobian / carbon-c-relay

Enhanced C implementation of Carbon relay, aggregator and rewriter
Apache License 2.0

master (e749ae) seg faults on receiving any metric #254

Closed StephenPCG closed 7 years ago

StephenPCG commented 7 years ago

Version: e749aee (current master)

Config file (this time I just copied and pasted it without any modification):

cluster default
    forward
        127.0.0.1:2103
    ;

match *
    send to
        default
    stop
    ;

I started a Docker container from centos:7 to have a clean environment. Inside the container, I cloned the code, compiled it, and ran it. The output:

[root@4a8ea09f8ca0 carbon-c-relay]# ./relay -f /tmp/relay.conf 
[2017-03-20 02:45:33] starting carbon-c-relay v2.6 (e749ae), pid=441
configuration:
    relay hostname = 4a8ea09f8ca0
    listen port = 2003
    workers = 4
    send batch size = 2500
    server queue size = 25000
    server max stalls = 4
    listen backlog = 32
    server connection IO timeout = 600ms
    configuration = /tmp/relay.conf

parsed configuration follows:
statistics
    submit every 60 seconds
    prefix with carbon.relays.4a8ea09f8ca0
    ;

cluster default
    forward
        127.0.0.1:2103
    ;

match *
    send to default
    stop
    ;

[2017-03-20 02:45:33] listening on tcp4 0.0.0.0 port 2003
[2017-03-20 02:45:33] listening on tcp6 :: port 2003
[2017-03-20 02:45:33] listening on udp4 0.0.0.0 port 2003
[2017-03-20 02:45:33] listening on udp6 :: port 2003
[2017-03-20 02:45:33] listening on UNIX socket /tmp/.s.carbon-c-relay.2003
[2017-03-20 02:45:33] starting 4 workers
[2017-03-20 02:45:33] starting statistics collector
[2017-03-20 02:45:33] starting servers
Segmentation fault (core dumped)

The segmentation fault happens when I send a metric with the following command:

echo "test.random.int ${RANDOM} $(date +%s)" | nc $container_ip 2003

grobian commented 7 years ago

ok, wow

grobian commented 7 years ago

Any chance you could send me a stacktrace? I can't seem to reproduce this.

StephenPCG commented 7 years ago

I'm glad to, but I am not very familiar with C. Can you give me some instructions on how to print the stacktrace? I'm familiar with Linux tools, so you can just give me some commands or tell me what modifications I need to make to the code, and I can manage it.

If you use Docker, you can try to reproduce it this way:

$ cat Dockerfile 
FROM centos:7
MAINTAINER Stephen Zhang <stephenpcg@gmail.com>

RUN yum install -y make gcc git \
    && cd /root \
    && git clone -b master --depth 1 https://github.com/grobian/carbon-c-relay.git \
    && cd carbon-c-relay \
    && make relay
CMD ["/root/carbon-c-relay/relay", "-f", "/relay.conf"]

$ cat relay.conf
cluster default
    forward
        127.0.0.1:1234
    ;

match *
    send to
        default
    stop
    ;

$ docker build -t carbon .
$ docker run -it -v $PWD/relay.conf:/relay.conf -p 2003:2003 carbon

Then send a metric in. It crashes here.

grobian commented 7 years ago

Try this:

make CFLAGS="-g -O0 -pipe" relay

Then run it under gdb like this:

gdb --args /root/carbon-c-relay/relay -f /relay.conf

(basically add "gdb", "--args" at the front of your CMD array). Then, when run from gdb, trigger the crash and run:

thread apply all bt

StephenPCG commented 7 years ago

Here's the output:

$ gdb --args ./relay -f /relay.conf
...
(gdb) r
...
[New Thread 0x7ffff63ea700 (LWP 128)]
[2017-03-21 13:47:05] starting 4 workers
[New Thread 0x7ffff5be9700 (LWP 129)]
[New Thread 0x7ffff53e8700 (LWP 130)]
[New Thread 0x7ffff4be7700 (LWP 131)]
[New Thread 0x7ffff43e6700 (LWP 132)]
[New Thread 0x7ffff3be5700 (LWP 133)]
[2017-03-21 13:47:05] starting statistics collector
[New Thread 0x7ffff33e4700 (LWP 134)]
[2017-03-21 13:47:05] starting servers
[New Thread 0x7ffff2be3700 (LWP 135)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff2be3700 (LWP 135)]
__GI_getaddrinfo (name=0x62b430 "127.0.0.1", service=0x7ffff2be2e40 "1234", hints=0x62b930000000, pai=0x7ffff2be2e48)
    at ../sysdeps/posix/getaddrinfo.c:2344
2344      if (hints->ai_flags
(gdb)

grobian commented 7 years ago

thanks, can you try my latest commit to see if that helps?

StephenPCG commented 7 years ago

It's working now!

grobian commented 7 years ago

cool :)
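
In the gdb frame above, the crash is on the line `if (hints->ai_flags` inside getaddrinfo, and the hints argument (0x62b930000000) looks unlike the other pointer arguments, which suggests the hints pointer passed in was not valid. The thread does not show what the fixing commit changed, so the following is only a minimal sketch of the usual way to call getaddrinfo with a properly initialized hints struct. It is not carbon-c-relay's code, and `resolve_tcp` is a hypothetical helper name introduced for illustration.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper, not taken from carbon-c-relay.
 * Resolves host:port for a TCP connection. Returns 0 on success and
 * stores the result list in *res (the caller releases it with
 * freeaddrinfo), or a non-zero getaddrinfo error code on failure. */
static int resolve_tcp(const char *host, const char *port,
                       struct addrinfo **res)
{
    /* Keep hints on the stack of the calling thread and pass its
     * address directly, so the pointer stays valid for the whole call. */
    struct addrinfo hints;

    /* Zero every field first: members that are not set explicitly
     * must be 0 or NULL when passed to getaddrinfo. */
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;     /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* TCP */
    hints.ai_flags = AI_NUMERICSERV; /* port is a number, no service lookup */

    *res = NULL;
    return getaddrinfo(host, port, &hints, res);
}
```

Called as `resolve_tcp("127.0.0.1", "1234", &res)`, this matches the name and service values seen in the backtrace. A non-zero return value can be turned into a message with gai_strerror.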