smlnj / legacy

This project is the old version of Standard ML of New Jersey that continues to support older systems (e.g., 32-bit machines).
BSD 3-Clause "New" or "Revised" License
25 stars 10 forks source link

Foreign function gives segmentation fault only when called from SML/NJ #291

Closed pclayton closed 2 weeks ago

pclayton commented 5 months ago

Version

110.99.4 (Latest)

Operating System

OS Version

Linux fedora 6.2.15-100.fc36.x86_64 #1 SMP PREEMPT_DYNAMIC Thu May 11 16:51:53 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Processor

System Component

Foreign-Function Interface (FFI)

Severity

Major

Description

Using 32-bit SML/NJ, which is required for the FFI, we have found that a foreign function called from SML/NJ gives a segmentation fault but works fine when called otherwise, e.g. via the MLton FFI.

Transcript

[user@hostname sml]$ sml -32
Standard ML of New Jersey (32-bit) v110.99.4 [built: Fri Jan 26 09:05:35 2024]
- CM.make "z3.cm";
[autoloading]
[library $smlnj/cm/cm.cm is stable]
[library $smlnj/internal/cm-sig-lib.cm is stable]
[library $/pgraph.cm is stable]
[library $smlnj/internal/srcpath-lib.cm is stable]
[library $SMLNJ-BASIS/basis.cm is stable]
[library $SMLNJ-BASIS/(basis.cm):basis-common.cm is stable]
[autoloading done]
[scanning z3.cm]
[library $c/c.cm is stable]
[scanning (z3.cm):z3-ffi.cm]
[attempting to load plugin $/make-tool.cm]
[library $/make-tool.cm is stable]
[library $smlnj/internal/cm-lib.cm is stable]
$Execute: required privileges are:
  cm-init
[library $smlnj/cm/tools.cm is stable]
[plugin $/make-tool.cm loaded successfully]
[make -f Makefile FFI/z3.cm SMLNJ_BINDIR=/opt/smlnj/smlnj-110.99.4/bin FFI/z3.cm]
make: 'FFI/z3.cm' is up to date.
make: 'FFI/z3.cm' is up to date.
[scanning (z3.cm):(z3-ffi.cm):FFI/z3.cm]
[library $c/internals/c-int.cm is stable]
[library $c/memory/memory.cm is stable]
...
$Execute: required privileges are:
  c-int
  primitive
[New bindings added.]
val it = true : bool
- Z3.Config.config ();
/opt/smlnj/smlnj-110.99.4/bin/sml: Fatal error -- bogus fault not in ML: pc = 0xf242eb62, sig = 11

Expected Behavior

No segmentation fault should occur.

Steps to Reproduce

Briefly, as I have already investigated this:

  1. Build a 32-bit release version of Z3, say 4.8.12.
  2. Create bindings using ML-NLFFI.
  3. Call Z3_mk_config.

Additional Information

The segmentation fault occurs when the following MOVAPS instruction (indicated by =>) is executed:

(gdb) disass
Dump of assembler code for function _ZN15prime_generator22process_next_k_numbersEy:
   ...
   0xf071eb5f <+479>:   mov    %ebx,-0x7c(%ebp)
=> 0xf071eb62 <+482>:   movaps %xmm0,-0x68(%ebp)
   0xf071eb66 <+486>:   movd   %xmm0,-0x70(%ebp)
   ...

This MOVAPS copies %xmm0 to an address on the stack given by -0x68(%ebp) and requires the address to be aligned on a 16-byte boundary. However we have

(gdb) print $ebp
$3 = (void *) 0xffffa900

which means that -0x68(%ebp) is not aligned on a 16-byte boundary. The next instruction causes the processor to generate a general-protection exception (#GP), which the OS traps:

(gdb) ni

Program received signal SIGSEGV, Segmentation fault.
0xf071eb62 in prime_generator::process_next_k_numbers(unsigned long long) () from /tmp/test_z3_ffi/z3-z3-4.8.12/build32/libz3.so

-0x68(%ebp) is misaligned here because esp on entry to the function Z3_mk_config from SML/NJ is not aligned to a 16-byte boundary. Restarting and breaking just inside Z3_mk_config we have:

Breakpoint 1, 0xf1b420d4 in Z3_mk_config () from /tmp/test_z3_ffi/z3-z3-4.8.12/build32/libz3.so
(gdb) disass
Dump of assembler code for function Z3_mk_config:
   0xf1b420d0 <+0>: push   %ebp
   0xf1b420d1 <+1>: mov    %esp,%ebp
   0xf1b420d3 <+3>: push   %edi
=> 0xf1b420d4 <+4>: call   0xf06b16d6 <__x86.get_pc_thunk.di>
   ...

(gdb) print $ebp
$1 = (void *) 0xffffa9d0
(gdb) print $esp
$2 = (void *) 0xffffa9cc

The value of esp before

So it appears that SML/NJ is calling Z3_mk_config with esp equal to 0xffffa9d8 which is not aligned on a 16-byte boundary. This is misaligned by 8 bytes, the same amount that -0x68(%ebp) was misaligned in the MOVAPS instruction.

GCC introduced a 16-byte alignment requirement for the x86 ABI some years ago presumably to allow faster SSE instructions that require alignment to a 16-byte boundary. See this useful summary for more background on this requirement.

Email address

phil.clayton@veonix.com

pclayton commented 5 months ago

I am able to work around this issue by specifying -march=i686 (in addition to -m32) when building Z3.

JohnReppy commented 5 months ago

I'll take a look at it. It may be easy to fix, since I believe that we already require 16-byte alignment for 32-bit macOS (see base/compiler/TopLevel/backend/x86-ccall.sml).

pclayton commented 2 weeks ago

Fixed in 110.99.5.