Closed lrettig closed 1 week ago
This issue is very odd. sp1 has precisely the same code, and I can compile it fine on my Mac. I've isolated the problem to this commit: it compiles before this point, but something here broke it: https://github.com/athenavm/athena/pull/145/commits/d540a820aacf7010992dae67ff3827064e6a777d
I isolated it further, to this change: https://github.com/athenavm/athena/pull/145/commits/d540a820aacf7010992dae67ff3827064e6a777d#diff-1a9c17ef3b6a70254124f54ee4a8f2ceb17a3a7f8ed9570473a03e3246860ba4. Sort of.
If you use d540a820aacf7010992dae67ff3827064e6a777d and roll back this change, it compiles. If you apply this change, the error appears.
But rolling back this change on the main branch today doesn't fix the issue.
Note that a macOS/Mach-O-specific solution is quite complex:

Here's how you can implement a macOS-specific solution:

Understanding Mach-O and Segments

- Mach-O Format: macOS uses the Mach-O (Mach Object) file format for executables and libraries. This format organizes code and data into segments and sections.
- Segments and Sections: Segments are large divisions of a binary (like __TEXT, __DATA), and segments contain sections (like __text, __data).

Solution Overview

1. Use Mach-O Headers: Access the Mach-O headers to find the __DATA segment and its sections.
2. Find the End of the Data Segment: Calculate the end address of the data segment.
3. Modify Your Code to Use This Address: Use the calculated end address as the starting point for your heap allocator.

Implementation Steps
```rust
extern crate libc;

use libc::{c_char, c_void};
use std::ptr;
```

For FFI (Foreign Function Interface) with C functions, you'll need to use extern "C" blocks.
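```rust
extern "C" {
    fn _dyld_get_image_header(index: u32) -> *const mach_header;
    fn _dyld_get_image_vmaddr_slide(index: u32) -> isize;
    fn _dyld_image_count() -> u32;
}

// Laid out like the 64-bit header (mach_header_64), which ends in a
// `reserved` field. Without it, size_of::<mach_header>() is 28 instead of 32
// and the load-command pointer starts 4 bytes early.
#[repr(C)]
struct mach_header {
    magic: u32,
    cputype: i32,
    cpusubtype: i32,
    filetype: u32,
    ncmds: u32,
    sizeofcmds: u32,
    flags: u32,
    reserved: u32,
}
```

Note: The actual definitions of the Mach-O structures (mach_header, segment_command_64, etc.) are quite involved. You may need to import or define these structures according to the Mach-O specification, and they must be marked #[repr(C)] so that their layout matches the C definitions.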
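The pointer arithmetic in this approach depends on the Rust structs having exactly the same size as their C counterparts. Below is a minimal sketch of a layout check, assuming a 64-bit target; the Rust type names (`MachHeader64`, `LoadCommand`) and the two helper functions are hypothetical and exist only for illustration.

```rust
use std::mem::size_of;

/// Mirror of the C `struct mach_header_64` (hypothetical Rust name).
#[repr(C)]
#[allow(dead_code)]
struct MachHeader64 {
    magic: u32,
    cputype: i32,
    cpusubtype: i32,
    filetype: u32,
    ncmds: u32,
    sizeofcmds: u32,
    flags: u32,
    reserved: u32, // present only in the 64-bit header
}

/// Mirror of the C `struct load_command` (hypothetical Rust name).
#[repr(C)]
#[allow(dead_code)]
struct LoadCommand {
    cmd: u32,
    cmdsize: u32,
}

/// Size of the 64-bit header, i.e. the offset of the first load command.
fn header_size() -> usize {
    size_of::<MachHeader64>()
}

/// Size of the common prefix shared by every load command.
fn load_command_size() -> usize {
    size_of::<LoadCommand>()
}
```

If `header_size()` returns anything other than 32, the first load command will be read from the wrong offset and every later `cmd`/`cmdsize` value will be garbage.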
```rust
unsafe fn get_data_segment_end() -> usize {
    use std::mem::size_of;

    // Get the number of images (libraries/executables) loaded
    let image_count = _dyld_image_count();

    // Iterate over the loaded images. Note that this returns the first
    // __DATA segment found in any image; it does not check which image
    // the segment belongs to.
    for i in 0..image_count {
        let header = _dyld_get_image_header(i);
        let slide = _dyld_get_image_vmaddr_slide(i);
        if header.is_null() {
            continue;
        }

        // The load commands start immediately after the header
        let mut cmd_ptr = (header as *const u8).offset(size_of::<mach_header>() as isize);

        // Iterate over the load commands
        let header_ref = &*header;
        for _ in 0..header_ref.ncmds {
            let cmd = &*(cmd_ptr as *const load_command);
            if cmd.cmd == LC_SEGMENT_64 {
                let seg_cmd = &*(cmd_ptr as *const segment_command_64);
                let segname = &seg_cmd.segname;

                // Check if this is the __DATA segment (the trailing NUL
                // keeps __DATA_CONST and similar names from matching)
                if segname.starts_with(b"__DATA\0") {
                    // Calculate the end address of the __DATA segment.
                    // The slide is signed, so add it with wrapping arithmetic
                    // rather than risking an overflow panic on a negative slide.
                    let end_addr = seg_cmd
                        .vmaddr
                        .wrapping_add(seg_cmd.vmsize)
                        .wrapping_add(slide as u64);
                    return end_addr as usize;
                }
            }
            cmd_ptr = cmd_ptr.offset(cmd.cmdsize as isize);
        }
    }

    panic!("__DATA segment not found");
}
```

Definitions Needed:
You'll need to define the following constants and structs according to the Mach-O specification.
```rust
const LC_SEGMENT_64: u32 = 0x19;

#[repr(C)]
struct load_command {
    cmd: u32,
    cmdsize: u32,
}

#[repr(C)]
struct segment_command_64 {
    cmd: u32,
    cmdsize: u32,
    segname: [u8; 16],
    vmaddr: u64,
    vmsize: u64,
    fileoff: u64,
    filesize: u64,
    maxprot: i32,
    initprot: i32,
    nsects: u32,
    flags: u32,
}

// Helper that returns the segment name as a &str, trimmed at the first NUL
impl segment_command_64 {
    fn segname_as_str(&self) -> &str {
        let cstr = &self.segname;
        let len = cstr.iter().position(|&c| c == 0).unwrap_or(16);
        std::str::from_utf8(&cstr[..len]).unwrap_or("")
    }
}
```
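The load-command walk can be tried out without dyld by running it over a byte buffer. The following is a minimal sketch under stated assumptions: a little-endian, 64-bit Mach-O layout, safe slice reads instead of raw pointers, and no slide applied. The names `segment_end`, `push_segment`, and `demo_image` are hypothetical, and `demo_image` builds a synthetic header, not a real binary.

```rust
const LC_SEGMENT_64: u32 = 0x19;
const MACH_HEADER_64_SIZE: usize = 32;
const NCMDS_OFFSET: usize = 16; // magic, cputype, cpusubtype, filetype come first

/// Reads a little-endian u32 at `off`, or None if out of bounds.
fn read_u32(buf: &[u8], off: usize) -> Option<u32> {
    let bytes = buf.get(off..off.checked_add(4)?)?;
    let mut raw = [0u8; 4];
    raw.copy_from_slice(bytes);
    Some(u32::from_le_bytes(raw))
}

/// Reads a little-endian u64 at `off`, or None if out of bounds.
fn read_u64(buf: &[u8], off: usize) -> Option<u64> {
    let bytes = buf.get(off..off.checked_add(8)?)?;
    let mut raw = [0u8; 8];
    raw.copy_from_slice(bytes);
    Some(u64::from_le_bytes(raw))
}

/// Returns vmaddr + vmsize of the first segment named exactly `name`.
fn segment_end(image: &[u8], name: &str) -> Option<u64> {
    let ncmds = read_u32(image, NCMDS_OFFSET)?;
    let mut off = MACH_HEADER_64_SIZE;
    for _ in 0..ncmds {
        let cmd = read_u32(image, off)?;
        let cmdsize = read_u32(image, off + 4)? as usize;
        if cmdsize < 8 {
            return None; // malformed: would never advance
        }
        if cmd == LC_SEGMENT_64 {
            // segname is a 16-byte, NUL-padded field after cmd and cmdsize
            let raw = image.get(off + 8..off + 24)?;
            let len = raw.iter().position(|&b| b == 0).unwrap_or(16);
            if &raw[..len] == name.as_bytes() {
                let vmaddr = read_u64(image, off + 24)?;
                let vmsize = read_u64(image, off + 32)?;
                return vmaddr.checked_add(vmsize);
            }
        }
        off = off.checked_add(cmdsize)?;
    }
    None
}

/// Appends a 72-byte LC_SEGMENT_64 command (names up to 16 bytes only).
fn push_segment(image: &mut Vec<u8>, name: &str, vmaddr: u64, vmsize: u64) {
    let mut segname = [0u8; 16];
    segname[..name.len()].copy_from_slice(name.as_bytes());
    image.extend_from_slice(&LC_SEGMENT_64.to_le_bytes());
    image.extend_from_slice(&72u32.to_le_bytes());
    image.extend_from_slice(&segname);
    image.extend_from_slice(&vmaddr.to_le_bytes());
    image.extend_from_slice(&vmsize.to_le_bytes());
    // fileoff, filesize, maxprot, initprot, nsects, flags: zeroed
    image.extend_from_slice(&[0u8; 32]);
}

/// Builds a synthetic image with __TEXT, __DATA_CONST, and __DATA segments.
fn demo_image() -> Vec<u8> {
    let mut image = vec![0u8; MACH_HEADER_64_SIZE];
    image[NCMDS_OFFSET..NCMDS_OFFSET + 4].copy_from_slice(&3u32.to_le_bytes());
    push_segment(&mut image, "__TEXT", 0x1_0000_0000, 0x4000);
    push_segment(&mut image, "__DATA_CONST", 0x1_0000_4000, 0x4000);
    push_segment(&mut image, "__DATA", 0x1_0000_8000, 0x4000);
    image
}
```

Matching the full NUL-trimmed name matters here: a plain prefix check for `__DATA` would stop at `__DATA_CONST`, which comes first in this synthetic image.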
```rust
pub unsafe extern "C" fn sys_alloc_aligned(bytes: usize, align: usize) -> *mut u8 {
    // Pointer to next heap address to use, or 0 if the heap has not yet been initialized.
    static mut HEAP_POS: usize = 0;

    // SAFETY: Single threaded, so nothing else can touch this while we're working.
    let mut heap_pos = HEAP_POS;

    if heap_pos == 0 {
        heap_pos = get_data_segment_end();
    }

    // Round up to the requested alignment (assumes align is a power of two)
    let offset = heap_pos & (align - 1);
    if offset != 0 {
        heap_pos += align - offset;
    }

    let ptr = heap_pos as *mut u8;
    let (heap_pos_new, overflowed) = heap_pos.overflowing_add(bytes);

    // MAX_MEMORY is expected to be defined elsewhere in the crate
    if overflowed || MAX_MEMORY < heap_pos_new {
        panic!("Memory limit exceeded (0x78000000)");
    }

    HEAP_POS = heap_pos_new;
    ptr
}
```
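The alignment and bounds arithmetic in the allocator can be checked in isolation. This is a minimal sketch that works on plain integers rather than real memory; `align_up` and `Bump` are hypothetical names, and unlike the function above it returns `None` instead of panicking and rejects alignments that are not powers of two.

```rust
/// Rounds `pos` up to the next multiple of `align`.
/// Returns None if `align` is not a power of two or the result overflows.
fn align_up(pos: usize, align: usize) -> Option<usize> {
    if !align.is_power_of_two() {
        return None;
    }
    let mask = align - 1;
    pos.checked_add(mask).map(|p| p & !mask)
}

/// Bump allocator over an abstract address range ending at `limit`.
struct Bump {
    pos: usize,   // next free address
    limit: usize, // highest address an allocation may end at
}

impl Bump {
    /// Returns the start address of the allocation, or None if it does not fit.
    fn alloc(&mut self, bytes: usize, align: usize) -> Option<usize> {
        let start = align_up(self.pos, align)?;
        let end = start.checked_add(bytes)?;
        if end > self.limit {
            return None;
        }
        self.pos = end;
        Some(start)
    }
}
```

Keeping the arithmetic separate from the `static mut` state makes it possible to test the rounding and limit checks on any platform, independent of where the heap actually starts.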
```rust
extern "C" {
    fn sbrk(increment: isize) -> *mut c_void;
}

unsafe fn get_heap_start() -> usize {
    sbrk(0) as usize
}
```

Caution: Using sbrk can lead to undefined behavior and is not recommended for new applications.
Manually Define the _end Symbol

You can define the _end symbol in assembly and link it with your Rust code.

Assembly File (end_symbol.s):

```assembly
.globl _end
_end:
```

Compile the assembly file and link it with your Rust library.

Linking Steps:

```bash
as -o end_symbol.o end_symbol.s
ar rcs libend_symbol.a end_symbol.o
```

Modify your build script to link libend_symbol.a with your Rust code.

Usage in Rust:

```rust
extern "C" {
    static _end: u8;
}

unsafe fn get_heap_start() -> usize {
    &(_end) as *const u8 as usize
}
```

Caution: This approach requires careful handling to ensure that _end is placed correctly in memory.
The best clue I can find is this: https://stackoverflow.com/questions/54368717/undefined-symbols-for-mh-execute-header
161 is back again, this time on the execution side (and only on mac). From https://github.com/spacemeshos/go-spacemesh/actions/runs/11674170614/job/32506374469?pr=6380: