open-simh / simh

The Open SIMH simulators package
https://opensimh.org/

Stack overflow in i650 emulator #382

Closed Rhialto closed 3 months ago

Rhialto commented 3 months ago

Looking at it with gdb, the cause seems weird, since a function is called with NULL, but in the caller this variable looks totally valid and not NULL.

I was wondering if this could be because I now use gcc 10, which I usually didn't before. This is on NetBSD 10 for amd64, with gcc version 10.5.0 (nb3 20231008).

    IBM 650 simulator Open SIMH V4.1-0 Current        git commit id: 2e301487
    sim> show version
    IBM 650 simulator Open SIMH V4.1-0 Current
    Simulator Framework Capabilities:
    64b data
    32b addresses
    no Ethernet
    Idle/Throttling support is available
    Virtual Hard Disk (VHD) support
    FrontPanel API Version 12
    Host Platform:
    Compiler: GCC 10.5.0
    Simulator Compiled as C arch: x64 (Debug Build) on Jun 2 2024 at 21:23:09
    Build Tool: simh-makefile
    Memory Access: Little Endian
    Memory Pointer Size: 64 bits
    Large File (>2GB) support
    SDL Video support: No Video Support
    PCRE RegEx (Version 8.45 2021-06-15) support for EXPECT commands
    OS clock resolution: 1ms
    Time taken by msleep(1): 20ms
    OS: NetBSD murthe.falu.nl 10.0 NetBSD 10.0 (GENERIC) #0: Sat May 18 22:55:37 CEST 2024 rhialto@murthe.falu.nl:/mnt/scratch/scratch/tmp/xcrash/usr/src/sys/arch/amd64/compile/GENERIC amd64
    tar tool: bsdtar 3.4.0 - libarchive 3.4.0 zlib/1.2.13 liblzma/5.2.4 bz2lib/1.0.8
    curl tool: curl 8.7.1 (x86_64--netbsd) libcurl/8.7.1 OpenSSL/3.0.12 zlib/1.2.13 libidn2/2.3.7 nghttp2/1.60.0
    git commit id: 2e301487
    git commit time: 2024-06-02T20:44:37+0200


- #### how you built the simulator or that you're using prebuilt binaries
I built myself from commit 43963943132d393b074ca4f68c0dce707f194b32 but with some small changes to the makefile, which are reflected in the listed build command (add /usr/X11R7/lib and include if present, use libpng16, use -Wl,-R instead of -R).

    murthe.8:.../cvs/other/open-simh$ rm BIN/i650
    remove 'BIN/i650'? y
    murthe.8:.../cvs/other/open-simh$ gmake DEBUG=1 TESTS=0 i650
    lib paths are: /usr/lib /usr/pkg/lib /usr/X11R7/lib /lib/ /usr/lib/
    include paths are: /usr/include/gcc-10 /usr/include /usr/pkg/include /usr/X11R7/include
    using libm: /usr/lib/libm.so
    using librt: /usr/lib/librt.so
    using libpthread: /usr/lib/libpthread.so /usr/include/pthread.h
    using libpcre: /usr/pkg/lib/libpcre.so /usr/pkg/include/pcre.h
    using semaphore: /usr/include/semaphore.h
    using libdl: /usr/include/dlfcn.h
    using libedit: /usr/pkg/include/editline/readline.h
    using mman: /usr/include/sys/mman.h

    i650 Simulator being built with:
    - debugging support. GCC Version: 10.5.0.
    *** - Per simulator tests will be skipped.

    git commit id is 2e30148775f32b64c667672d7baf6f5d893a057c.
    git commit time is 2024-06-02T20:44:37+0200.

    gcc -std=gnu99 -U__STRICT_ANSI__ -g -ggdb -g3 -D_DEBUG=1 -O0 -DSIM_GIT_COMMIT_ID=2e30148775f32b64c667672d7baf6f5d893a057c -DSIM_GIT_COMMIT_TIME=2024-06-02T20:44:37+0200 -DSIM_COMPILER="GCC Version: 10.5.0" -DSIM_BUILD_TOOL=simh-makefile -I . -Werror -D_GNU_SOURCE -I/usr/pkg/include -I/usr/X11R7/include -DHAVE_PCRE_H -DHAVE_SEMAPHORE -DHAVE_SYS_IOCTL -DSIM_HAVE_DLOPEN=so -DHAVE_EDITLINE -DHAVE_UTIME -DHAVE_GLOB -DHAVE_SHM_OPEN ./I650/i650_cpu.c ./I650/i650_cdr.c ./I650/i650_cdp.c ./I650/i650_dsk.c ./I650/i650_mt.c ./I650/i650_sys.c ./scp.c ./sim_console.c ./sim_fio.c ./sim_timer.c ./sim_sock.c ./sim_tmxr.c ./sim_ether.c ./sim_tape.c ./sim_disk.c ./sim_serial.c ./sim_video.c ./sim_imd.c ./sim_card.c -I ./I650 -DUSE_INT64 -DUSE_SIM_CARD -o BIN/i650 -L/usr/pkg/lib -Wl,-R/usr/pkg/lib -L/usr/X11R7/lib -Wl,-R/usr/X11R7/lib -lm -lrt -lpthread -lpcre -L/usr/pkg/lib/ -ledit -ltermcap

- #### the simulator configuration file (or commands) which were used when the problem occurred.
`gdb --args BIN/i650 RegisterSanityCheck /mnt/vol1/rhialto/cvs/other/open-simh/I650/tests/i650_test.ini`

- #### the expected behavior and the actual behavior
Expected: the test passes.
Actual:

    murthe.8:.../cvs/other/open-simh$ gdb --args BIN/i650 RegisterSanityCheck /mnt/vol1/rhialto/cvs/other/open-simh/I650/tests/i650_test.ini
    GNU gdb (GDB) 11.0.50.20200914-git
    Copyright (C) 2020 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.
    Type "show copying" and "show warranty" for details.
    This GDB was configured as "x86_64--netbsd".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    https://www.gnu.org/software/gdb/bugs/.
    Find the GDB manual and other documentation resources online at:
    http://www.gnu.org/software/gdb/documentation/.

    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from BIN/i650...
    (gdb) r
    Starting program: /mnt/vol1/rhialto/cvs/other/open-simh/BIN/i650 RegisterSanityCheck /mnt/vol1/rhialto/cvs/other/open-simh/I650/tests/i650_test.ini
    Running internal register sanity checks on IBM 650 simulator.
    *** Good Registers in IBM 650 simulator.

    IBM 650 simulator Open SIMH V4.1-0 Current        git commit id: 2e301487
    [New process 6724]
    Logging to file "console.txt"

    ** IBM 650: Basic Instruction Test:
    /mnt/vol1/rhialto/cvs/other/open-simh/I650/sw/run_fds.ini-30> go
    %SIM-INFO: SIGINT will be delivered to your debugger when the ^F character is entered
    00 0000 0001 50 1000 0000 50 1000 0000 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000
    00 0000 0002 50 2000 0000 50 1414 2135 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000
    00 0000 0003 50 3000 0000 50 1732 0508 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000
    00 0000 0004 50 4000 0000 50 2000 0000 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000
    00 0000 0005 50 5000 0000 50 2236 0679 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000
    00 0000 0006 50 6000 0000 50 2449 4897 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000
    00 0000 0007 50 7000 0000 50 2645 7513 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000
    00 0000 0008 50 8000 0000 50 2828 4271 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000
    00 0000 0009 50 9000 0000 50 3000 0000 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000 00 0000 0000

    Address Error, IC: 09999
    FDS Ok
    ** Test: passed.

** IBM 650: Floating Point Instruction Test:

    C     RECTANGULAR MATRIX
    C     MULTIPLICATION
          DIMENSION A(4,5), B(5,3)
          READ 1,A,B
          READ 1,N,M,L
    7     DO 4 J=1,N
    1     DO 4 I=1,M
    6     SUM=0.0
    2     DO 3 K=1,L
    3     SUM = SUM+A(I,K) * B(K,J)
    4     PUNCH 1, SUM, I,J
    8     END


*** Load FORTRANSIT translator deck into drum


I/O Error, IC: 08000 ( 7019519999+ RD 1951 9999 )

I/O Error, IC: 08000 ( 7019519999+ RD 1951 9999 )


*** Run FORTRANSIT translator


                                      T200001T15 0021             DF    0000
                                      T36T3 7T38                  DF    0000

    7+ 4KI39 K 1 K 0007
    7+ 1 KI36 K F 0007
    1+ 4KI40 K 1 K 0001
    1+ 1 KI37 K F 0001
    6+ Y41Z0 J0 F 0006
    2+ 3KI42 K 1 K 0002
    2+ 1 KI38 K F 0002
    3+ Y41ZY41SYLLM4RSL4XI42RSI40RXYL 0003
    3+ 15SL5XI39RSI42R F 0003
    4+ T41T4 0T39 F 0004
    8+ FF 0008

I/O Error, IC: 01999 ( 7019520228+ RD 1952 0228 )

    Thread 1 "" received signal SIGSEGV, Segmentation fault.
    0x000000000041f5a5 in deck_split_cmd (cptr=0x0) at ./I650/i650_sys.c:831
    831     {
    (gdb) bt
    #0  0x000000000041f5a5 in deck_split_cmd (cptr=0x0) at ./I650/i650_sys.c:831
    #1  0x0000000000420859 in ibm650_deck_cmd (arg=0,
        buf=0x7f7fff341699 "-q split -1 cdp1 deck_it.dck deck_it_header.dck")
        at ./I650/i650_sys.c:1175
    #2  0x0000000000426e98 in do_cmd_label (flag=2,
        fcptr=0x7f7fff344043 "run_fortransit.ini  fortransit/fortransit_example_2_src.txt  fortransit/fortransit_example_2_data.txt", label=0x0) at ./scp.c:4220
    #3  0x000000000042622a in do_cmd (flag=2,
        fcptr=0x7f7fff344043 "run_fortransit.ini  fortransit/fortransit_example_2_src.txt  fortransit/fortransit_example_2_data.txt") at ./scp.c:4044
    #4  0x0000000000426e4a in do_cmd_label (flag=0,
        fcptr=0x7f7fff345700 "/mnt/vol1/rhialto/cvs/other/open-simh/I650/tests/i650_test.ini", label=0x0) at ./scp.c:4214
    #5  0x000000000042622a in do_cmd (flag=0,
        fcptr=0x7f7fff345700 "/mnt/vol1/rhialto/cvs/other/open-simh/I650/tests/i650_test.ini") at ./scp.c:4044
    #6  0x000000000042163d in main (argc=2, argv=0x7a8cdf722000) at ./scp.c:2948
    (gdb)

Note how deck_split_cmd() appears to be called with NULL, yet at the call site cptr is a valid, non-NULL pointer:

    (gdb) up
    #1  0x0000000000420859 in ibm650_deck_cmd (arg=0,
        buf=0x7f7fff341699 "-q split -1 cdp1 deck_it.dck deck_it_header.dck")
        at ./I650/i650_sys.c:1175
    1175        return deck_split_cmd(cptr);
    (gdb) print cptr
    $1 = 0x7f7fff3416a2 "-1 cdp1 deck_it.dck deck_it_header.dck"



Things I have considered:
- Incorrectly declared function. But it isn't; the called function is in the same file as the caller, and before it.
- Stack overflow. Threads have smaller stacks than you think, and deck_split_cmd declares some largeish buffers. However this seems to be the main thread, since main() is in the call stack, so this is probably not an issue.
- ???
Other ideas?
Rhialto commented 3 months ago

Stack size was nagging me.... Hopefully the rsp values that the debugger shows me are correct:

(gdb) info registers rsp
rsp            0x7f7fff176a30      0x7f7fff176a30
(gdb) down
#0  0x000000000041f5a5 in deck_split_cmd (cptr=0x0) at ./I650/i650_sys.c:831
831     {
(gdb) info registers rsp
rsp            0x7f7ffecde3a0      0x7f7ffecde3a0
(gdb) print 0x7f7fff176a30 - 0x7f7ffecde3a0
$1 = 4818576

How can that call use more than 4 MB of stack size? Those arrays were big but I didn't think that big... Stack size is limited by default to 4 MB:

$ ulimit -a
stack size                  (kbytes, -s) 4096

but after `ulimit -Ss 8192` the test passes.

So, why are there such big buffers on the stack? I would call that excessive, and even a bug.

Rhialto commented 3 months ago

There seem to be 5 biggish buffers in play, at least locally:

char gbuf[4*CBUFSIZE]; // in ibm650_deck_cmd
char fn0[4*CBUFSIZE]; // in deck_split_cmd
char fn1[4*CBUFSIZE];
char fn2[4*CBUFSIZE];
char gbuf[4*CBUFSIZE];

CBUFSIZE is `#define CBUFSIZE (128 + PATH_MAX)` and PATH_MAX is probably 4096. So in total 5 * 4 * (128 + PATH_MAX) is 84480 bytes, which is a lot, but not even approaching a megabyte.

What would explain the rest of the stack usage?

Rhialto commented 3 months ago

So yes, there are large arrays on the stack:

       uint16 DeckImage[80 * MAX_CARDS_IN_DECK];
       uint16 DeckImage1[80 * MAX_CARDS_IN_DECK];
       uint16 DeckImage2[80 * MAX_CARDS_IN_DECK]; 

where MAX_CARDS_IN_DECK is 10 000. So each array is 2 * 80 * 10 000 bytes = 1 600 000 bytes...

I made an MR to allocate these dynamically: #383.