Open castilma opened 2 years ago
Try debugging with a non-stripped version of rrdtool. Also, please verify that your call does not crash if you replace rrdtool with '/bin/true'.
/bin/true does not crash.
I built rrdtool from branch v1.8.0. It reports:
/opt/rrdtool-1.8.0/bin/rrdupdate /dev/null $(<verylongargs.txt)
RRDtool 1.7.2 Copyright by Tobi Oetiker
Usage: rrdupdate <filename>
[--template|-t ds-name[:ds-name]...]
[--skip-past-updates]
time|N:value[:value...]
at-time@value[:value...]
[ time:value[:value...] ..]
ERROR: mmaping file '/dev/null': Invalid argument
That works, but on current master (b8bdcd47bd2986fc6b36abe71e3d6c7581309220):
$ ./bootstrap
$ ./configure --prefix=/opt/rrdmaster CFLAGS=-g
$ CFLAGS=-g make -j && make install
$ /opt/rrdmaster/bin/rrdupdate /dev/null $(<verylongargs.txt)
Segmentation fault
$ gdb bash
...
(gdb) run -c 'exec /opt/rrdmaster/bin/rrdupdate /dev/null $(<verylongargs.txt)'
Starting program: /usr/bin/bash -c 'exec /opt/rrdmaster/bin/rrdupdate /dev/null $(<verylongargs.txt)'
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[Detaching after fork from child process 722]
process 719 is executing new program: /opt/rrdmaster/bin/rrdupdate
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x000055555557e8c5 in optparse_long (options=<error reading variable: Cannot access memory at address 0x7fffff7fefd8>, longopts=<error reading variable: Cannot access memory at address 0x7fffff7fefd0>, longindex=<error reading variable: Cannot access memory at address 0x7fffff7fefc8>) at optparse.c:223
223 {
(gdb) bt
#0 0x000055555557e8c5 in optparse_long (options=<error reading variable: Cannot access memory at address 0x7fffff7fefd8>,
longopts=<error reading variable: Cannot access memory at address 0x7fffff7fefd0>,
longindex=<error reading variable: Cannot access memory at address 0x7fffff7fefc8>) at optparse.c:223
#1 0x000055555557e9bd in optparse_long (options=0x7fffffe2ef70, longopts=0x7fffffe2ef30, longindex=0x0) at optparse.c:237
#2 0x000055555557e9bd in optparse_long (options=0x7fffffe2ef70, longopts=0x7fffffe2ef30, longindex=0x0) at optparse.c:237
#3 0x000055555557e9bd in optparse_long (options=0x7fffffe2ef70, longopts=0x7fffffe2ef30, longindex=0x0) at optparse.c:237
#4 0x000055555557e9bd in optparse_long (options=0x7fffffe2ef70, longopts=0x7fffffe2ef30, longindex=0x0) at optparse.c:237
...
#67580 0x000055555557e9bd in optparse_long (options=0x7fffffe2ef70, longopts=0x7fffffe2ef30, longindex=0x0) at optparse.c:237
#67581 0x000055555557e9bd in optparse_long (options=0x7fffffe2ef70, longopts=0x7fffffe2ef30, longindex=0x0) at optparse.c:237
#67582 0x0000555555575ece in rrd_update (argc=75856, argv=0x7fffffe2f118) at rrd_update.c:688
#67583 0x0000555555557902 in main (argc=75856, argv=0x7fffffe2f118) at rrdupdate.c:35
I tried git bisect (which was a bit more difficult because I wasn't sure at first how to create ./configure; ./bootstrap seems to be the way?), but it reported fbea00d4 as good and b8bdcd47 (master) as bad. Since master is just one merge commit ahead of fbea00d4, I think I did something wrong. I might have left some build system files behind when switching commits during the bisect, which could have influenced the build output. I also noticed that I have a build directory, which I think was built from master, that does not show the problem! I don't remember the exact steps I took to build that version, though...
EDIT: Seems to be a stack overflow due to recursion in optparse.c:optparse_long(), which calls itself in line 237.
(gdb) down
#67581 0x000055555557e9bd in optparse_long (options=0x7fffffe2ef70, longopts=0x7fffffe2ef30, longindex=0x0) at optparse.c:237
237 int r = optparse_long(options, longopts, longindex);
(gdb) info f
Stack level 67581, frame at 0x7fffffe2ef00:
rip = 0x55555557e9bd in optparse_long (optparse.c:237); saved rip = 0x555555575ece
called by frame at 0x7fffffe2efe0, caller of frame at 0x7fffffe2eea0
source language c.
Arglist at 0x7fffffe2eef0, args: options=0x7fffffe2ef70, longopts=0x7fffffe2ef30, longindex=0x0
Locals at 0x7fffffe2eef0, Previous frame's sp is 0x7fffffe2ef00
Saved registers:
rbp at 0x7fffffe2eef0, rip at 0x7fffffe2eef8
(gdb) down
#67580 0x000055555557e9bd in optparse_long (options=0x7fffffe2ef70, longopts=0x7fffffe2ef30, longindex=0x0) at optparse.c:237
237 int r = optparse_long(options, longopts, longindex);
(gdb) info f
Stack level 67580, frame at 0x7fffffe2eea0:
rip = 0x55555557e9bd in optparse_long (optparse.c:237); saved rip = 0x55555557e9bd
called by frame at 0x7fffffe2ef00, caller of frame at 0x7fffffe2ee40
source language c.
Arglist at 0x7fffffe2ee90, args: options=0x7fffffe2ef70, longopts=0x7fffffe2ef30, longindex=0x0
Locals at 0x7fffffe2ee90, Previous frame's sp is 0x7fffffe2eea0
Saved registers:
rbp at 0x7fffffe2ee90, rip at 0x7fffffe2ee98
(gdb)
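The addresses in the gdb session above allow a rough estimate of how much stack the recursion consumed. The following is only a back-of-the-envelope sketch: the addresses are copied from the output above, while the 8 MiB stack limit is an assumption (a common Linux default), not something the session shows.

```sh
#!/bin/sh
# Addresses copied from the gdb session above.
frame_outer=$(( 0x7fffffe2ef00 ))   # frame #67581 of optparse_long
frame_inner=$(( 0x7fffffe2eea0 ))   # frame #67580, one recursion level deeper
caller_frame=$(( 0x7fffffe2efe0 ))  # frame of rrd_update (#67582)
fault_addr=$(( 0x7fffff7fefd8 ))    # address gdb could not read in frame #0

# Stack consumed by one recursive call of optparse_long.
frame_size=$(( frame_outer - frame_inner ))

# Stack consumed between rrd_update and the faulting access.
recursion_bytes=$(( caller_frame - fault_addr ))

# ASSUMPTION: 8 MiB stack limit (check with `ulimit -s`); not shown in the report.
stack_limit=$(( 8 * 1024 * 1024 ))

echo "bytes per optparse_long frame:  $frame_size"
echo "stack used below rrd_update:    $recursion_bytes"
echo "frames that fit in that space:  $(( recursion_bytes / frame_size ))"
echo "share of assumed 8 MiB limit:   $(( recursion_bytes * 100 / stack_limit ))%"
```

Each recursion level costs 96 bytes, so one frame per non-option argument adds up to several MiB for an argument list of this size. The argument vector and strings also live on the stack, which would plausibly account for the remainder of the limit.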
@castilma What happens if you also run bootstrap as the first command with v1.8.0, before configure etc.?
@c72578 see the new edit of my last comment and tell me if you still want me to try that.
It seems to look for option arguments recursively. Using '--' prevents this and avoids the segfault:
$ /opt/rrdmaster/bin/rrdupdate -- /dev/null $(<verylongargs.txt )
RRDtool 1.8.0 Copyright by Tobi Oetiker
Usage: rrdupdate <filename>
[--template|-t ds-name[:ds-name]...]
[--skip-past-updates]
time|N:value[:value...]
at-time@value[:value...]
[ time:value[:value...] ..]
ERROR: mmaping file '/dev/null': Invalid argument
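Independent of the `--` workaround, a caller can keep each invocation small by splitting the arguments into batches. This is a minimal sketch using `xargs -n`; the helper name `run_in_batches` and the batch size are made up for illustration, and it assumes that splitting the updates across several rrdupdate calls is acceptable for the use case.

```sh
#!/bin/sh
# run_in_batches BATCH CMD [INITIAL-ARGS...]
# Reads whitespace-separated arguments from stdin and invokes CMD with the
# initial arguments followed by at most BATCH of the stdin arguments per call.
# (Hypothetical helper, not part of rrdtool.)
run_in_batches() {
    batch=$1
    shift
    xargs -n "$batch" "$@"
}

# Example (file.rrd is a placeholder):
#   tr ' ' '\n' <verylongargs.txt | run_in_batches 1000 rrdupdate -- file.rrd
```

Note that `xargs` gives quotes and backslashes in its input special meaning, so this only fits plain update strings.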
Now the question is: why do you have your own optparse code? Can't you use getopt_long or similar? One would hope that a dedicated library handles big argument lists properly.
Describe the bug
With a very long argument list rrdupdate segfaults.
To Reproduce
verylongargs.txt
dmesg says
Expected behavior
Or an argument list too long error from the shell with exit code 126. I do check for that in my script, but not for segfault (139).
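A script that already checks for 126 could also recognise death by signal. The sketch below relies on the usual shell convention that a process killed by signal N is reported as status 128+N, so 139 corresponds to SIGSEGV (11). The function name `classify_status` is hypothetical, and a program can in principle exit with a value above 128 on its own.

```sh
#!/bin/sh
# classify_status STATUS: print a rough description of an exit status.
# (Hypothetical helper for illustration.)
classify_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "ok"
    elif [ "$status" -eq 126 ]; then
        # The shell found the command but could not execute it,
        # e.g. because the argument list was too long.
        echo "could-not-execute"
    elif [ "$status" -gt 128 ]; then
        # Convention: killed by signal (status - 128).
        echo "killed-by-signal-$(( status - 128 ))"
    else
        echo "failed-$status"
    fi
}

# Example:
#   rrdupdate /dev/null $(<verylongargs.txt); classify_status "$?"
```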
Additional context
I noticed that the segfault becomes less likely when the argument list is, for example, 69245 arguments long:
rrdupdate /dev/null $(tr ' ' $'\n' <verylongargs.txt |head -n 69245|paste -s -d' ')
If I use 69246 arguments, it happens more often than not. The fact that it is unreliable makes me think it has something to do with the memory layout / ASLR in combination with a buffer or stack overflow. I tried debugging with gdb:
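The argument count at which the crash starts can be narrowed down by bisection. This is a sketch only: `find_threshold` is a hypothetical helper, and it assumes the outcome is deterministic and monotonic in the argument count, which the report says does not hold reliably unless memory layout randomisation is taken out of the picture.

```sh
#!/bin/sh
# find_threshold LO HI CMD [ARGS...]
# Prints the smallest N in (LO, HI] for which "CMD ARGS... N" fails,
# assuming it succeeds at LO, fails at HI, and switches exactly once between.
# (Hypothetical helper for illustration.)
find_threshold() {
    ft_lo=$1
    ft_hi=$2
    shift 2
    while [ $(( ft_hi - ft_lo )) -gt 1 ]; do
        ft_mid=$(( (ft_lo + ft_hi) / 2 ))
        if "$@" "$ft_mid"; then
            ft_lo=$ft_mid   # still fine with ft_mid arguments
        else
            ft_hi=$ft_mid   # already failing with ft_mid arguments
        fi
    done
    echo "$ft_hi"
}

# Example predicate (hypothetical): succeeds unless rrdupdate dies with 139.
#   survives() {
#       rrdupdate /dev/null $(tr ' ' '\n' <verylongargs.txt | head -n "$1" | paste -s -d' ' -)
#       [ "$?" -ne 139 ]
#   }
#   find_threshold 1 75856 survives
```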