athenavm / athena

Athena monorepo
https://www.athenavm.org/
Apache License 2.0

symbol not found in flat namespace '__end' #215

Closed lrettig closed 1 week ago

lrettig commented 2 weeks ago

161 is back again, this time on the execution side (and only on mac). From https://github.com/spacemeshos/go-spacemesh/actions/runs/11674170614/job/32506374469?pr=6380:

=== FAIL: node TestSpacemeshApp_TransactionService (0.49s)
panic: loading Athena VM: loading library: dlopen(/Users/m1/actions-runner/_work/go-spacemesh/go-spacemesh/build/libathenavmwrapper.dylib, 0x000A): symbol not found in flat namespace '__end' [recovered]
    panic: loading Athena VM: loading library: dlopen(/Users/m1/actions-runner/_work/go-spacemesh/go-spacemesh/build/libathenavmwrapper.dylib, 0x000A): symbol not found in flat namespace '__end'

goroutine 1817 [running]:
testing.tRunner.func1.2({0x106fe5ce0, 0xc0010a4280})
    /Users/m1/actions-runner/_work/_tool/go/1.23.2/arm64/src/testing/testing.go:1632 +0x2c4
testing.tRunner.func1()
    /Users/m1/actions-runner/_work/_tool/go/1.23.2/arm64/src/testing/testing.go:1635 +0x47c
panic({0x106fe5ce0?, 0xc0010a4280?})
    /Users/m1/actions-runner/_work/_tool/go/1.23.2/arm64/src/runtime/panic.go:785 +0x124
github.com/spacemeshos/go-spacemesh/vm/sdk/wallet.Spawn({0xc00059ecc0, 0x40, 0x40}, 0x0, {0xc0006cfe70, 0x1, 0x0?})
    /Users/m1/actions-runner/_work/go-spacemesh/go-spacemesh/vm/sdk/wallet/tx.go:45 +0x5d0
github.com/spacemeshos/go-spacemesh/node.TestSpacemeshApp_TransactionService(0xc00029b380)
    /Users/m1/actions-runner/_work/go-spacemesh/go-spacemesh/node/node_test.go:554 +0xc98
testing.tRunner(0xc00029b380, 0x1072b89f8)
    /Users/m1/actions-runner/_work/_tool/go/1.23.2/arm64/src/testing/testing.go:1690 +0x188
created by testing.(*T).Run in goroutine 1
    /Users/m1/actions-runner/_work/_tool/go/1.23.2/arm64/src/testing/testing.go:1743 +0x5e4
lrettig commented 2 weeks ago

This issue is very odd. sp1 has precisely the same code, and I can compile it fine on my Mac. I've isolated the problem to this commit: it compiles before this point, but something here broke it: https://github.com/athenavm/athena/pull/145/commits/d540a820aacf7010992dae67ff3827064e6a777d

lrettig commented 2 weeks ago

I isolated it further, to this change: https://github.com/athenavm/athena/pull/145/commits/d540a820aacf7010992dae67ff3827064e6a777d#diff-1a9c17ef3b6a70254124f54ee4a8f2ceb17a3a7f8ed9570473a03e3246860ba4. Sort of.

If you use d540a820aacf7010992dae67ff3827064e6a777d and roll back this change, it compiles. If you apply this change, the error appears.

But rolling back this change on the main branch today doesn't fix the issue.

lrettig commented 2 weeks ago

Note that a macOS/Mach-O-specific solution is quite complex:

Here's how you can implement a macOS-specific solution:

Understanding Mach-O and Segments

Mach-O Format: macOS uses the Mach-O (Mach Object) file format for executables and libraries. This format organizes code and data into segments and sections.

Segments and Sections: Segments are large divisions of a binary (like __TEXT, __DATA), and each segment contains sections (like __text, __data).

Solution Overview

Use Mach-O Headers: Access the Mach-O headers to find the __DATA segment and its sections.

Find the End of the Data Segment: Calculate the end address of the data segment.

Modify Your Code to Use This Address: Use the calculated end address as the starting point for your heap allocator.

Implementation Steps

  1. Include Necessary Headers

In order to use the Mach-O APIs, you'll need the following declarations in scope:

    extern crate libc;

    use libc::{c_char, c_void};
    use std::ptr;

For FFI (Foreign Function Interface) with C functions, you'll need to use extern "C" blocks.

  2. Access the Mach-O Headers

You can access the Mach-O headers using functions provided by the dyld API. Since you're working within a dynamic library, you need to get the correct image header.

    // These dyld functions are exported by libSystem, which is linked by
    // default on macOS, so no #[link] attribute is needed.
    extern "C" {
        fn _dyld_get_image_header(index: u32) -> *const mach_header;
        fn _dyld_get_image_vmaddr_slide(index: u32) -> isize;
        fn _dyld_image_count() -> u32;
    }

    #[repr(C)]
    struct mach_header {
        magic: u32,
        cputype: i32,
        cpusubtype: i32,
        filetype: u32,
        ncmds: u32,
        sizeofcmds: u32,
        flags: u32,
        // 64-bit images use the mach_header_64 layout, which ends with this
        // extra field. Without it, size_of::<mach_header>() is 4 bytes short
        // and the load commands would be read from the wrong offset.
        reserved: u32,
    }

Note: The actual definitions of the Mach-O structures (mach_header, segment_command_64, etc.) are quite involved. You may need to import or define these structures according to the Mach-O specification.

  3. Find the Data Segment

Implement a function that iterates over the load commands in the Mach-O header to find the __DATA segment and determine its end address.

    unsafe fn get_data_segment_end() -> usize {
        use std::mem::size_of;

        // Get the number of images (libraries/executables) loaded
        let image_count = _dyld_image_count();

        // Iterate over the loaded images. NOTE: this returns the first __DATA
        // segment it finds, which normally belongs to image 0 (the main
        // executable); nothing here selects your library's image specifically.
        for i in 0..image_count {
            let header = _dyld_get_image_header(i);
            let slide = _dyld_get_image_vmaddr_slide(i);

            if header.is_null() {
                continue;
            }

            // The load commands start right after the header. On 64-bit
            // targets this requires mach_header to use the 32-byte
            // mach_header_64 layout (including the trailing reserved field).
            let mut cmd_ptr = (header as *const u8).add(size_of::<mach_header>());

            // Iterate over the load commands
            let header_ref = &*header;
            for _ in 0..header_ref.ncmds {
                let cmd = &*(cmd_ptr as *const load_command);

                if cmd.cmd == LC_SEGMENT_64 {
                    let seg_cmd = &*(cmd_ptr as *const segment_command_64);

                    // Check if this is the __DATA segment; the trailing NUL
                    // rules out names such as __DATA_CONST.
                    if seg_cmd.segname.starts_with(b"__DATA\0") {
                        // End address = unslid end of the segment + ASLR slide.
                        // The slide is signed, so add it with wrapping arithmetic.
                        let end_addr = seg_cmd
                            .vmaddr
                            .wrapping_add(seg_cmd.vmsize)
                            .wrapping_add(slide as u64);
                        return end_addr as usize;
                    }
                }

                // Each load command records its own size
                cmd_ptr = cmd_ptr.add(cmd.cmdsize as usize);
            }
        }

        panic!("__DATA segment not found");
    }

Definitions Needed:

You'll need to define the following constants and structs according to the Mach-O specification.

    const LC_SEGMENT_64: u32 = 0x19;

    #[repr(C)]
    struct load_command {
        cmd: u32,
        cmdsize: u32,
    }

    #[repr(C)]
    struct segment_command_64 {
        cmd: u32,
        cmdsize: u32,
        segname: [u8; 16],
        vmaddr: u64,
        vmsize: u64,
        fileoff: u64,
        filesize: u64,
        maxprot: i32,
        initprot: i32,
        nsects: u32,
        flags: u32,
    }

    // Implement a method to check the segment name
    impl segment_command_64 {
        fn segname_as_str(&self) -> &str {
            let cstr = &self.segname;
            let len = cstr.iter().position(|&c| c == 0).unwrap_or(16);
            std::str::from_utf8(&cstr[..len]).unwrap_or("")
        }
    }
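The load-command walk can be tried out without touching dyld at all. Below is a minimal sketch that runs the same search over a byte buffer laid out like a 64-bit Mach-O header followed by LC_SEGMENT_64 commands. The names `segment_end` and `fake_image` are hypothetical helpers made up for this illustration, the buffer is synthetic rather than a real image, and the offsets assume the little-endian 64-bit layout described above.

```rust
use std::convert::TryInto;

/// Load-command type of a 64-bit segment.
const LC_SEGMENT_64: u32 = 0x19;
/// Size of a 64-bit Mach-O header: eight 32-bit fields.
const HEADER_SIZE: usize = 32;
/// Size of a segment_command_64 without any trailing sections.
const SEGMENT_CMD_SIZE: u32 = 72;

/// Reads a little-endian u32 at `off`, or None if the buffer is too short.
fn read_u32(buf: &[u8], off: usize) -> Option<u32> {
    let bytes = buf.get(off..off.checked_add(4)?)?;
    Some(u32::from_le_bytes(bytes.try_into().ok()?))
}

/// Reads a little-endian u64 at `off`, or None if the buffer is too short.
fn read_u64(buf: &[u8], off: usize) -> Option<u64> {
    let bytes = buf.get(off..off.checked_add(8)?)?;
    Some(u64::from_le_bytes(bytes.try_into().ok()?))
}

/// Returns the unslid end address (vmaddr + vmsize) of the segment whose
/// name is exactly `name`, or None if it is missing or the buffer is malformed.
fn segment_end(image: &[u8], name: &str) -> Option<u64> {
    let ncmds = read_u32(image, 16)?; // ncmds is the fifth header field
    let mut off = HEADER_SIZE;
    for _ in 0..ncmds {
        let cmd = read_u32(image, off)?;
        let cmdsize = read_u32(image, off + 4)? as usize;
        if cmdsize < 8 {
            return None; // a load command is at least cmd + cmdsize
        }
        if cmd == LC_SEGMENT_64 {
            let raw = image.get(off + 8..off + 24)?; // segname: [u8; 16]
            let len = raw.iter().position(|&b| b == 0).unwrap_or(16);
            if &raw[..len] == name.as_bytes() {
                let vmaddr = read_u64(image, off + 24)?;
                let vmsize = read_u64(image, off + 32)?;
                return vmaddr.checked_add(vmsize);
            }
        }
        off = off.checked_add(cmdsize)?;
    }
    None
}

/// Hypothetical test helper: builds a header plus one LC_SEGMENT_64 command
/// per (name, vmaddr, vmsize) entry. Names must be at most 16 bytes long.
fn fake_image(segments: &[(&str, u64, u64)]) -> Vec<u8> {
    let mut img = vec![0u8; HEADER_SIZE];
    img[16..20].copy_from_slice(&(segments.len() as u32).to_le_bytes());
    for &(name, vmaddr, vmsize) in segments {
        let mut segname = [0u8; 16];
        segname[..name.len()].copy_from_slice(name.as_bytes());
        img.extend_from_slice(&LC_SEGMENT_64.to_le_bytes());
        img.extend_from_slice(&SEGMENT_CMD_SIZE.to_le_bytes());
        img.extend_from_slice(&segname);
        img.extend_from_slice(&vmaddr.to_le_bytes());
        img.extend_from_slice(&vmsize.to_le_bytes());
        // fileoff, filesize, maxprot, initprot, nsects, flags (all zero here)
        img.extend_from_slice(&[0u8; 32]);
    }
    img
}
```

Working on a bounds-checked slice instead of raw pointers means a truncated or corrupt header yields None rather than undefined behavior. In a real image the slide reported by dyld would still have to be added to the result.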

  4. Use the End Address in Your Allocator

Modify your allocator function to use the calculated end address:

    #[allow(clippy::missing_safety_doc)]
    #[no_mangle]
    pub unsafe extern "C" fn sys_alloc_aligned(bytes: usize, align: usize) -> *mut u8 {
        // Pointer to next heap address to use, or 0 if the heap has not yet been initialized.
        static mut HEAP_POS: usize = 0;

        // SAFETY: Single threaded, so nothing else can touch this while we're working.
        let mut heap_pos = HEAP_POS;

        if heap_pos == 0 {
            heap_pos = get_data_segment_end();
        }

        // Round heap_pos up to the next multiple of align (a power of two)
        let offset = heap_pos & (align - 1);
        if offset != 0 {
            heap_pos += align - offset;
        }

        let ptr = heap_pos as *mut u8;
        let (heap_pos_new, overflowed) = heap_pos.overflowing_add(bytes);

        // MAX_MEMORY is not defined in this snippet; it is expected to come
        // from the surrounding crate.
        if overflowed || MAX_MEMORY < heap_pos_new {
            panic!("Memory limit exceeded (0x78000000)");
        }

        HEAP_POS = heap_pos_new;
        ptr
    }
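The alignment and limit arithmetic in that allocator is easy to get subtly wrong, so here is the same step isolated as a pure function that can be tested on any platform. This is only a sketch: `align_up` and `bump` are hypothetical names, and the limit is passed in as a parameter instead of coming from MAX_MEMORY.

```rust
/// Rounds `addr` up to the next multiple of `align`.
/// Returns None if `align` is not a power of two or the result would overflow.
fn align_up(addr: usize, align: usize) -> Option<usize> {
    if !align.is_power_of_two() {
        return None;
    }
    let mask = align - 1;
    addr.checked_add(mask).map(|a| a & !mask)
}

/// One bump-allocation step. Given the current heap position, returns
/// (address to hand out, new heap position), or None if the request
/// overflows or would move the heap position past `limit`.
fn bump(heap_pos: usize, bytes: usize, align: usize, limit: usize) -> Option<(usize, usize)> {
    let start = align_up(heap_pos, align)?;
    let end = start.checked_add(bytes)?;
    if end > limit {
        None
    } else {
        Some((start, end))
    }
}
```

Compared with the in-place version, this also checks the alignment step for overflow and rejects a zero or non-power-of-two `align` instead of computing a bogus mask.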

  5. Handle Edge Cases and Test Thoroughly

Error Handling: Ensure you handle cases where the __DATA segment isn't found.

Thread Safety: If your application is multithreaded, you'll need to make HEAP_POS thread-safe.

Testing: Thoroughly test your allocator to ensure it behaves correctly on macOS.

Important Notes

Complexity: Parsing Mach-O headers is complex and can be error-prone. Ensure you understand the Mach-O file format and the structures involved.

Maintenance: This solution is macOS-specific and adds complexity to your codebase.

Alternatives: Consider whether you can use system memory allocation functions or other cross-platform libraries to handle memory allocation.

Additional Considerations

Use sbrk(0) on macOS

Although sbrk is deprecated on macOS, it can still be used to get the current program break. However, using deprecated functions is generally discouraged.

    extern "C" {
        fn sbrk(increment: isize) -> *mut c_void;
    }

    unsafe fn get_heap_start() -> usize {
        sbrk(0) as usize
    }

Caution: Using sbrk can lead to undefined behavior and is not recommended for new applications.
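On the Thread Safety point from step 5: one way to make the heap position safe to share is to keep it in an AtomicUsize and advance it with a compare-and-swap loop. The following is a minimal sketch under that assumption. `BumpHeap` is a hypothetical type invented for this example, it only hands out addresses, and it is not the allocator used in the repository.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical thread-safe bump pointer over the range [start, limit).
pub struct BumpHeap {
    pos: AtomicUsize,
    limit: usize,
}

impl BumpHeap {
    pub const fn new(start: usize, limit: usize) -> Self {
        BumpHeap { pos: AtomicUsize::new(start), limit }
    }

    /// Reserves `bytes` bytes aligned to `align` and returns the start
    /// address, or None if `align` is not a power of two or the range is
    /// exhausted. The position is left unchanged on failure.
    pub fn alloc(&self, bytes: usize, align: usize) -> Option<usize> {
        if !align.is_power_of_two() {
            return None;
        }
        let mask = align - 1;
        let limit = self.limit;
        // fetch_update retries the closure until the compare-and-swap
        // succeeds, so two threads can never reserve overlapping ranges.
        let prev = self
            .pos
            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |pos| {
                let start = pos.checked_add(mask)? & !mask;
                let end = start.checked_add(bytes)?;
                if end <= limit { Some(end) } else { None }
            })
            .ok()?;
        // Recompute the aligned start from the value that was swapped out;
        // this cannot overflow because the closure already checked it.
        Some((prev + mask) & !mask)
    }
}
```

Relaxed ordering is enough here because only the counter itself is shared. Code that publishes the allocated memory to another thread needs its own synchronization.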

Manually Define the _end Symbol

You can define the _end symbol in assembly and link it with your Rust code. Mach-O prefixes C-level symbol names with an underscore, so the Rust declaration of _end below is looked up as __end (the name in the dlopen error above), and that is the label the assembly file has to export.

Assembly File (end_symbol.s):

    .globl __end
    __end:

Compile the assembly file and link it with your Rust library.

Linking Steps:

    as -o end_symbol.o end_symbol.s
    ar rcs libend_symbol.a end_symbol.o

Modify your build script to link libend_symbol.a with your Rust code.

Usage in Rust:

    extern "C" {
        static _end: u8;
    }

    unsafe fn get_heap_start() -> usize {
        &_end as *const u8 as usize
    }

Caution: This approach requires careful handling to ensure that _end is placed correctly in memory.
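For comparison, the Alternatives note above (use the system allocator instead of locating the end of the data segment) needs no platform-specific code at all. A minimal sketch, assuming it is acceptable for the host build to take memory from the process heap; `alloc_from_system` is a hypothetical name, and unlike the original function it returns null instead of panicking.

```rust
use std::alloc::{alloc, Layout};
use std::ptr;

/// Hypothetical host-side replacement for a bump allocator: asks the global
/// allocator for `bytes` bytes aligned to `align`. The memory is never
/// freed, which matches bump-allocator semantics. Returns null if the
/// layout is invalid or the allocation fails.
pub fn alloc_from_system(bytes: usize, align: usize) -> *mut u8 {
    // Zero-sized allocations are not allowed by `alloc`, so request at
    // least one byte.
    let layout = match Layout::from_size_align(bytes.max(1), align) {
        Ok(layout) => layout,
        Err(_) => return ptr::null_mut(),
    };
    // SAFETY: `layout` has a non-zero size and a valid power-of-two alignment.
    unsafe { alloc(layout) }
}
```

This sidesteps both the missing symbol and the Mach-O parsing, at the cost of no longer controlling where the heap lives in the address space.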

lrettig commented 2 weeks ago

The best clue I can find is this: https://stackoverflow.com/questions/54368717/undefined-symbols-for-mh-execute-header