odin-lang / Odin

Odin Programming Language
https://odin-lang.org
BSD 3-Clause "New" or "Revised" License

os._alloc_command_line_arguments segfaults when mixing optimization flags. #3913

Open nobodyspecial1553 opened 4 months ago

nobodyspecial1553 commented 4 months ago

Context

Current Behavior

The program segfaults when it both uses os.args and calls a foreign procedure from a shared object compiled with different optimization flags than the executable.

Steps to Reproduce

  1. Write a module with a procedure:

    package hellope

    import "core:fmt"

    @(export)
    print_hellope :: proc() {
        fmt.println("Hellope!")
    }

  2. Write an application that interfaces with it, calls the procedure, and uses os.args in some way:

    package main

    import "core:os"

    foreign import hellope "hellope.so"

    foreign hellope {
        print_hellope :: proc() ---
    }

    main :: proc() {
        _ = len(os.args)
        print_hellope()
    }

  3. Compile hellope with `-build-mode:shared -opt:speed` and main with `-opt:none`, thereby mixing the optimization levels.
  4. Run the program to encounter the segfault.

Failure Logs

    Program received signal SIGSEGV, Segmentation fault.
    0x00000000004011d8 in os._alloc_command_line_arguments () at Odin/core/os/os_linux.odin:1025

Return code: 139

Kelimion commented 4 months ago

Trying to rule something out: Does this go away if you replace fmt.println with runtime.print* in the shared library, whether on default calling convention, c or contextless?
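
For reference, the variant being asked about might look something like the sketch below. It assumes `print_string` from `base:runtime` as one member of the `runtime.print*` family and picks the contextless calling convention; the thread does not say which procedure or convention was actually tried, so treat both choices as illustrative.

```odin
package hellope

import "base:runtime"

// Assumption: print_string stands in for "runtime.print*".
// A contextless procedure receives no implicit context, so it does not
// depend on the context pointer set up by the caller.
@(export)
print_hellope :: proc "contextless" () {
	runtime.print_string("Hellope!\n")
}
```

If the library is changed this way, the foreign declaration in the application would need the same calling convention, e.g. `print_hellope :: proc "contextless" () ---`.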

nobodyspecial1553 commented 4 months ago

Just to clarify, this was the original procedure I discovered this with:

@(export)
calculate_distance :: proc "contextless" (x0, y0, x1, y1: f64, radius: f64 = 6372.8) -> f64 {
    lat1, lat2: f64 = y0, y1
    lon1, lon2: f64 = x0, x1

    dlat: f64 = math.to_radians(lat2 - lat1)
    dlon: f64 = math.to_radians(lon2 - lon1)
    lat1 = math.to_radians(lat1)
    lat2 = math.to_radians(lat2)

    a: f64 = square(math.sin(dlat / 2.0)) + math.cos(lat1) * math.cos(lat2) * square(math.sin(dlon / 2.0))
    c: f64 = 2.0 * math.asin(math.sqrt(a))

    return radius * c
}

Segfaults on this too. Don't think it has anything to do with that. But I can still try if you want.

I just simplified it for the example.

Kelimion commented 4 months ago

No, that's fine. Interesting bug. Thanks for the find and report both.

jasonKercher commented 4 months ago

I stepped through this in gdb, and it looks like the context pointer gets clobbered. This appears to fix the issue:

diff --git a/base/runtime/entry_unix.odin b/base/runtime/entry_unix.odin
index 7d7252625..fa277150d 100644
--- a/base/runtime/entry_unix.odin
+++ b/base/runtime/entry_unix.odin
@@ -8,7 +8,7 @@ import "base:intrinsics"
 when ODIN_BUILD_MODE == .Dynamic {
        @(link_name="_odin_entry_point", linkage="strong", require/*, link_section=".init"*/)
        _odin_entry_point :: proc "c" () {
-               context = default_context()
+               context = #force_no_inline default_context()
                #force_no_inline _startup_runtime()
                intrinsics.__entry_point()
        }

laytan commented 4 months ago

Probably another amd64 SysV ABI issue, related to #3817, #3762, and likely others.