grobian / carbon-c-relay

Enhanced C implementation of Carbon relay, aggregator and rewriter
Apache License 2.0

Metrics get lost when lz4 compressed. #440

Closed lexx-bright closed 2 years ago

lexx-bright commented 2 years ago
~ $git clone https://github.com/grobian/carbon-c-relay.git; cd carbon-c-relay
Cloning into 'carbon-c-relay'...
remote: Enumerating objects: 5761, done.
remote: Counting objects: 100% (88/88), done.
remote: Compressing objects: 100% (65/65), done.
remote: Total 5761 (delta 42), reused 51 (delta 23), pack-reused 5673
Receiving objects: 100% (5761/5761), 3.57 MiB | 1.84 MiB/s, done.
Resolving deltas: 100% (3954/3954), done.

~/carbon-c-relay $./configure --with-lz4
...

~/carbon-c-relay $/usr/local/bin/autoreconf -f -i .
aclocal: warning: couldn't open directory 'm4': No such file or directory
libtoolize: putting auxiliary files in `.'.
libtoolize: copying file `./ltmain.sh'
libtoolize: putting macros in `m4'.
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
libtoolize: Consider adding `AC_CONFIG_MACRO_DIR([m4])' to configure.ac and
libtoolize: rerunning libtoolize, to keep the correct libtool macros in-tree.

~/carbon-c-relay $make
/bin/sh ./config.status --recheck
running CONFIG_SHELL=/bin/sh /bin/sh ./configure --with-lz4 --no-create --no-recursion
checking for a BSD-compatible install... /bin/install -c
checking whether build environment is sane... yes
checking for a race-free mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether to enable maintainer-specific portions of Makefiles... yes
checking build system type... x86_64-pc-linux-gnu
checking host system type... x86_64-pc-linux-gnu
checking how to print strings... printf
checking for style of include used by make... GNU
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether the compiler supports GNU C... yes
checking whether gcc accepts -g... yes
checking for gcc option to enable C11 features... -std=gnu11
checking dependency style of gcc -std=gnu11... gcc3
checking for a sed that does not truncate output... /bin/sed
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for fgrep... /bin/grep -F
checking for ld used by gcc -std=gnu11... /bin/ld
checking if the linker (/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /bin/nm -B
checking the name lister (/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 1572864
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking how to convert x86_64-pc-linux-gnu file names to x86_64-pc-linux-gnu format... func_convert_file_noop
checking how to convert x86_64-pc-linux-gnu file names to toolchain format... func_convert_file_noop
checking for /bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for dlltool... no
checking how to associate runtime and link libraries... printf %s\n
checking for ar... ar
checking for archiver @FILE support... @
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /bin/nm -B output from gcc -std=gnu11 object... ok
checking for sysroot... no
checking for mt... no
checking if : is a manifest tool... no
checking for stdio.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for strings.h... yes
checking for sys/stat.h... yes
checking for sys/types.h... yes
checking for unistd.h... yes
checking for vfork.h... no
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc -std=gnu11 supports -fno-rtti -fno-exceptions... no
checking for gcc -std=gnu11 option to produce PIC... -fPIC -DPIC
checking if gcc -std=gnu11 PIC flag -fPIC -DPIC works... yes
checking if gcc -std=gnu11 static flag -static works... yes
checking if gcc -std=gnu11 supports -c -o file.o... yes
checking if gcc -std=gnu11 supports -c -o file.o... (cached) yes
checking whether the gcc -std=gnu11 linker (/bin/ld -m elf_x86_64) supports shared libraries... yes
checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... yes
checking for gcc... (cached) gcc
checking whether the compiler supports GNU C... (cached) yes
checking whether gcc accepts -g... (cached) yes
checking for gcc option to enable C11 features... (cached) -std=gnu11
checking dependency style of gcc -std=gnu11... (cached) gcc3
checking whether make sets $(MAKE)... (cached) yes
checking for library containing sqrt... -lm
checking for library containing gethostbyname... none required
checking for library containing socket... none required
checking for library containing dlsym... -ldl
checking for library containing pthread_create... -lpthread
checking for arpa/inet.h... yes
checking for assert.h... yes
checking for errno.h... yes
checking for fcntl.h... yes
checking for glob.h... yes
checking for math.h... yes
checking for netdb.h... yes
checking for netinet/in.h... yes
checking for netinet/tcp.h... yes
checking for poll.h... yes
checking for pthread.h... yes
checking for regex.h... yes
checking for signal.h... yes
checking for stdarg.h... yes
checking for stdio.h... (cached) yes
checking for stdlib.h... (cached) yes
checking for string.h... (cached) yes
checking for sys/resource.h... yes
checking for sys/socket.h... yes
checking for sys/stat.h... (cached) yes
checking for sys/time.h... yes
checking for sys/types.h... (cached) yes
checking for sys/uio.h... yes
checking for sys/un.h... yes
checking for time.h... yes
checking for unistd.h... (cached) yes
checking for dispatch/dispatch.h... no
checking for dlfcn.h... (cached) yes
checking for semaphore.h... yes
checking for inline... inline
checking size of time_t... 8
checking for pid_t... yes
checking for fork... yes
checking for vfork... yes
checking for working fork... yes
checking for working vfork... (cached) yes
checking for GNU libc compatible malloc... yes
checking for GNU libc compatible realloc... yes
checking for gethostname... yes
checking for gettimeofday... yes
checking for localtime_r... yes
checking for memmove... yes
checking for memset... yes
checking for pow... yes
checking for regcomp... yes
checking for socket... yes
checking for sqrt... yes
checking for strchr... yes
checking for strdup... yes
checking for strerror... yes
checking for strstr... yes
checking for strtol... yes
checking for dlsym... yes
checking whether dlsym(RTLD_NEXT, ...) is available... no
checking whether dlsym(RTLD_NEXT, ...) is available using _GNU_SOURCE... yes
checking for zlib.h... yes
checking for gzopen in -lz... yes
checking for lz4.h... yes
checking for lz4frame.h... yes
checking for LZ4_createStream in -llz4... yes
checking for snappy-c.h... yes
checking for snappy_compress in -lsnappy... yes
checking for openssl/err.h... yes
checking for openssl/ssl.h... yes
checking for SSL_connect in -lssl... yes
checking for ERR_reason_error_string in -lcrypto... yes
checking for onigposix.h... no
checking for regexec in -lonig_missing_header... no
checking for pcre2posix.h... no
checking for regexec in -lpcre2-posix_missing_header... no
checking for pcreposix.h... yes
checking for regexec in -lpcreposix... yes
checking that generated files are newer than configure... done
configure: creating ./config.status
 /bin/sh ./config.status
config.status: creating Makefile
config.status: creating config.h
config.status: executing depfiles commands
config.status: executing libtool commands
make  all-am
make[1]: Entering directory `/root/carbon-c-relay'
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -std=gnu11 -DHAVE_CONFIG_H -I.     -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT faketime.lo -MD -MP -MF .deps/faketime.Tpo -c -o faketime.lo faketime.c
libtool: compile:  gcc -std=gnu11 -DHAVE_CONFIG_H -I. -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT faketime.lo -MD -MP -MF .deps/faketime.Tpo -c faketime.c  -fPIC -DPIC -o .libs/faketime.o
libtool: compile:  gcc -std=gnu11 -DHAVE_CONFIG_H -I. -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT faketime.lo -MD -MP -MF .deps/faketime.Tpo -c faketime.c -o faketime.o >/dev/null 2>&1
mv -f .deps/faketime.Tpo .deps/faketime.Plo
/bin/sh ./libtool  --tag=CC   --mode=link gcc -std=gnu11  -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread   -o libfaketime.la -rpath /usr/local/lib faketime.lo  -ldl -lm  -pthread
libtool: link: gcc -std=gnu11 -shared  -fPIC -DPIC  .libs/faketime.o   -ldl -lm  -O2 -pthread -pthread   -pthread -Wl,-soname -Wl,libfaketime.so.0 -o .libs/libfaketime.so.0.0.0
libtool: link: (cd ".libs" && rm -f "libfaketime.so.0" && ln -s "libfaketime.so.0.0.0" "libfaketime.so.0")
libtool: link: (cd ".libs" && rm -f "libfaketime.so" && ln -s "libfaketime.so.0.0.0" "libfaketime.so")
libtool: link: ar cru .libs/libfaketime.a  faketime.o
libtool: link: ranlib .libs/libfaketime.a
libtool: link: ( cd ".libs" && rm -f "libfaketime.la" && ln -s "../libfaketime.la" "libfaketime.la" )
gcc -std=gnu11 -DHAVE_CONFIG_H -I.     -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT relay.o -MD -MP -MF .deps/relay.Tpo -c -o relay.o relay.c
mv -f .deps/relay.Tpo .deps/relay.Po
gcc -std=gnu11 -DHAVE_CONFIG_H -I.     -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT md5.o -MD -MP -MF .deps/md5.Tpo -c -o md5.o md5.c
mv -f .deps/md5.Tpo .deps/md5.Po
gcc -std=gnu11 -DHAVE_CONFIG_H -I.     -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT consistent-hash.o -MD -MP -MF .deps/consistent-hash.Tpo -c -o consistent-hash.o consistent-hash.c
mv -f .deps/consistent-hash.Tpo .deps/consistent-hash.Po
gcc -std=gnu11 -DHAVE_CONFIG_H -I.     -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT receptor.o -MD -MP -MF .deps/receptor.Tpo -c -o receptor.o receptor.c
mv -f .deps/receptor.Tpo .deps/receptor.Po
gcc -std=gnu11 -DHAVE_CONFIG_H -I.     -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT dispatcher.o -MD -MP -MF .deps/dispatcher.Tpo -c -o dispatcher.o dispatcher.c
mv -f .deps/dispatcher.Tpo .deps/dispatcher.Po
gcc -std=gnu11 -DHAVE_CONFIG_H -I.     -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT conffile.tab.o -MD -MP -MF .deps/conffile.tab.Tpo -c -o conffile.tab.o conffile.tab.c
mv -f .deps/conffile.tab.Tpo .deps/conffile.tab.Po
gcc -std=gnu11 -DHAVE_CONFIG_H -I.     -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT conffile.yy.o -MD -MP -MF .deps/conffile.yy.Tpo -c -o conffile.yy.o conffile.yy.c
mv -f .deps/conffile.yy.Tpo .deps/conffile.yy.Po
gcc -std=gnu11 -DHAVE_CONFIG_H -I.     -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT allocator.o -MD -MP -MF .deps/allocator.Tpo -c -o allocator.o allocator.c
mv -f .deps/allocator.Tpo .deps/allocator.Po
gcc -std=gnu11 -DHAVE_CONFIG_H -I.     -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT router.o -MD -MP -MF .deps/router.Tpo -c -o router.o router.c
mv -f .deps/router.Tpo .deps/router.Po
gcc -std=gnu11 -DHAVE_CONFIG_H -I.     -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT queue.o -MD -MP -MF .deps/queue.Tpo -c -o queue.o queue.c
mv -f .deps/queue.Tpo .deps/queue.Po
gcc -std=gnu11 -DHAVE_CONFIG_H -I.     -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT server.o -MD -MP -MF .deps/server.Tpo -c -o server.o server.c
mv -f .deps/server.Tpo .deps/server.Po
gcc -std=gnu11 -DHAVE_CONFIG_H -I.     -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT collector.o -MD -MP -MF .deps/collector.Tpo -c -o collector.o collector.c
mv -f .deps/collector.Tpo .deps/collector.Po
gcc -std=gnu11 -DHAVE_CONFIG_H -I.     -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT aggregator.o -MD -MP -MF .deps/aggregator.Tpo -c -o aggregator.o aggregator.c
mv -f .deps/aggregator.Tpo .deps/aggregator.Po
gcc -std=gnu11   -o relay relay.o md5.o consistent-hash.o receptor.o dispatcher.o conffile.tab.o conffile.yy.o allocator.o router.o queue.o server.o collector.o aggregator.o -lz -llz4 -lsnappy -lssl -lcrypto   -lpcreposix -ldl -lm  -pthread
make[1]: Leaving directory `/root/carbon-c-relay'

~/carbon-c-relay $make test
make  sendmetric
make[1]: Entering directory `/root/carbon-c-relay'
gcc -std=gnu11 -DHAVE_CONFIG_H -I.     -g -O2 -D_GNU_SOURCE -DGIT_VERSION=\"bc719b-dirty\" -pthread -MT sendmetric.o -MD -MP -MF .deps/sendmetric.Tpo -c -o sendmetric.o sendmetric.c
mv -f .deps/sendmetric.Tpo .deps/sendmetric.Po
gcc -std=gnu11   -o sendmetric sendmetric.o -lz -llz4 -lsnappy -lssl -lcrypto -ldl -lm  -pthread
make[1]: Leaving directory `/root/carbon-c-relay'
make  check-local
make[1]: Entering directory `/root/carbon-c-relay'
generating datasets ... done
issue10: PASS
issue27: PASS
issue117: PASS
issue156: PASS
issue157: PASS
issue163: PASS
issue165: PASS
issue180: PASS
issue184: PASS
issue202: PASS
issue213: PASS
issue218: PASS
issue228: PASS
issue235: PASS
issue236: PASS
issue246: PASS
issue252: PASS
issue253: PASS
issue263: PASS
issue267: PASS
issue288: PASS
issue293: PASS
issue310: PASS
issue357: PASS
issue369: PASS
server-type: PASS
basic: relay 1 make[1]: Leaving directory `/root/carbon-c-relay'

~/carbon-c-relay $cd test/; git diff run-test.sh
diff --git a/test/run-test.sh b/test/run-test.sh
index a759608..44732b7 100755
--- a/test/run-test.sh
+++ b/test/run-test.sh
@@ -289,6 +289,7 @@ run_servertest() {
        # allow everything to be processed
        sleep 2

+        exit 0
        # kill and wait for relay to come down
        local pids=$(< "${pidfile}")
        [[ ${mode} == DUAL ]] && pids+=" $(< "${pidfile2}")"

~/carbon-c-relay/test $./run-test.sh large-lz4
generating datasets ... done
large-lz4: relay 1 relay 2 ~/carbon-c-relay/test $

~/carbon-c-relay/test $ps -ef | grep relay | grep -v grep
root      4589     1  0 01:08 ?        00:00:00 ../relay -d -w 1 -f /tmp/tmp.9z4scjwOsh/conf-1 -Htest.hostname -s -D -l /tmp/tmp.9z4scjwOsh/relay-1.out -P /tmp/tmp.9z4scjwOsh/pidfile-1
root      4598     1  0 01:08 ?        00:00:00 ../relay -d -w 1 -f /tmp/tmp.9z4scjwOsh/conf-2 -Htest.hostname -s -D -l /tmp/tmp.9z4scjwOsh/relay-2.out -P /tmp/tmp.9z4scjwOsh/pidfile-2

~/carbon-c-relay/test $cat /tmp/tmp.9z4scjwOsh/relay-2.out
[1981-01-31 18:00:00] (MSG) starting carbon-c-relay v3.7.3 (bc719b-dirty), pid=4598
configuration:
    relay hostname = test.hostname
    workers = 1
    send batch size = 2500
    server queue size = 25000
    server max stalls = 4
    listen backlog = 32
    server connection IO timeout = 600ms
    idle connections disconnect timeout = 10m
    debug = true
    configuration = /tmp/tmp.9z4scjwOsh/conf-2

parsed configuration follows:
listen
    type linemode
        /tmp/tmp.9z4scjwOsh/sock.3021 proto unix
    ;

statistics
    submit every 60 seconds
    prefix with carbon.relays.test_hostname
    ;

cluster default
    file
        /tmp/tmp.9z4scjwOsh/data.out
    ;
cluster lz4
    forward
        127.0.0.1:3020 transport lz4
    ;

rewrite ^compress\.(.*)
    into through-compress.\1
    ;
match ^through-compress\.
    send to lz4
    stop
    ;

[1981-01-31 18:00:00] (MSG) listening on UNIX socket /tmp/tmp.9z4scjwOsh/sock.3021
[1981-01-31 18:00:00] (MSG) starting 1 workers
[1981-01-31 18:00:00] (MSG) starting statistics collector
[1981-01-31 18:00:00] (MSG) starting servers
[1981-01-31 18:00:00] (MSG) startup sequence complete

~/carbon-c-relay/test $cat /tmp/tmp.9z4scjwOsh/relay-1.out
[1981-01-31 18:00:00] (MSG) starting carbon-c-relay v3.7.3 (bc719b-dirty), pid=4589
configuration:
    relay hostname = test.hostname
    workers = 1
    send batch size = 2500
    server queue size = 25000
    server max stalls = 4
    listen backlog = 32
    server connection IO timeout = 600ms
    idle connections disconnect timeout = 10m
    debug = true
    configuration = /tmp/tmp.9z4scjwOsh/conf-1

parsed configuration follows:
listen
    type linemode
        /tmp/tmp.9z4scjwOsh/sock.3020 proto unix
    type linemode transport lz4
        127.0.0.1:3020 proto tcp
    ;

statistics
    submit every 60 seconds
    prefix with carbon.relays.test_hostname
    ;

cluster default
    file
        /tmp/tmp.9z4scjwOsh/data.out
    ;

match ^through-compress\.
    send to default
    ;

[1981-01-31 18:00:00] (MSG) listening on UNIX socket /tmp/tmp.9z4scjwOsh/sock.3020
[1981-01-31 18:00:00] (MSG) listening on tcp4 127.0.0.1 port 3020
[1981-01-31 18:00:00] (MSG) starting 1 workers
[1981-01-31 18:00:00] (MSG) starting statistics collector
[1981-01-31 18:00:00] (MSG) starting servers
[1981-01-31 18:00:00] (MSG) startup sequence complete

~/carbon-c-relay/test $wc -l /tmp/tmp.9z4scjwOsh/data.out; while :; do ../sendmetric /tmp/tmp.9z4scjwOsh/sock.3021 < large-lz4.payload; sleep 5; wc -l /tmp/tmp.9z4scjwOsh/data.out; done
10000 /tmp/tmp.9z4scjwOsh/data.out
16267 /tmp/tmp.9z4scjwOsh/data.out
26267 /tmp/tmp.9z4scjwOsh/data.out
36267 /tmp/tmp.9z4scjwOsh/data.out
46267 /tmp/tmp.9z4scjwOsh/data.out
56267 /tmp/tmp.9z4scjwOsh/data.out
62534 /tmp/tmp.9z4scjwOsh/data.out
70362 /tmp/tmp.9z4scjwOsh/data.out
79750 /tmp/tmp.9z4scjwOsh/data.out
89750 /tmp/tmp.9z4scjwOsh/data.out
99750 /tmp/tmp.9z4scjwOsh/data.out
109750 /tmp/tmp.9z4scjwOsh/data.out
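
To make the loss in the counts above easier to see, the readings can be turned into per-send deltas. This is only a sketch: it assumes each `sendmetric` run submits 10000 lines, which is inferred from the first reading and the +10000 steps rather than from inspecting `large-lz4.payload`. The helper name `report_missing` is made up for illustration.

```shell
#!/bin/sh
# Cumulative `wc -l` readings copied from the loop output above.
counts="10000 16267 26267 36267 46267 56267 62534 70362 79750 89750 99750 109750"

# Assumption: each sendmetric run submits 10000 lines (inferred from the
# first reading and the +10000 steps, not from inspecting the payload).
per_send=10000

# report_missing PER_SEND COUNT...: print the growth between consecutive
# readings and how far each one falls short of PER_SEND.
report_missing() {
    expected=$1
    shift
    printf '%s\n' "$@" | awk -v expected="$expected" '
        NR > 1 {
            delta = $1 - prev
            missing = expected - delta
            total += missing
            printf "send %d: +%d lines, %d missing\n", NR - 1, delta, missing
        }
        { prev = $1 }
        END { printf "total missing: %d\n", total }'
}

# Intentionally unquoted so that each count becomes its own argument.
report_missing "$per_send" $counts
```

Under that assumption, four of the eleven sends come up short (by 3733, 3733, 2172 and 612 lines), for 10250 missing lines in total. No later reading grows by more than 10000, so the shortfall does not look like delayed delivery that is made up afterwards.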

grobian commented 2 years ago

I think this is because we don't use blocks or frames, which is probably a shared problem for lzo, lz4 and snappy.
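
As a rough illustration of what frames give a receiver, here is a sketch that uses the `gzip` command line tool as a stand-in for lz4, purely because it is widely available. It is not the relay's code and not the merged patch, and the metric names and the helper `compress_batches` are invented for the example. Each batch is compressed into a complete, self-delimiting unit with its own header and trailer, so batches written back to back on one stream can be decoded without the receiver guessing where one ends.

```shell
#!/bin/sh
# Illustration only: gzip stands in for lz4, and the metric lines are made up.

# compress_batches emits two independently compressed batches back to back,
# the way a sender might flush two batches onto a single connection.
compress_batches() {
    printf 'through-compress.a 1 1\n' | gzip -c
    printf 'through-compress.b 2 2\n' | gzip -c
}

# Each gzip member carries its own header and trailer, so the decoder can
# tell where one batch ends and the next begins, and both lines come out.
compress_batches | gzip -dc
```

Whether missing frame boundaries are what drops metrics here is the guess made in the comment above; the sketch only shows the property that framing provides.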

grobian commented 2 years ago

Thanks, merged your patch!