mlua-rs / mlua

High level Lua 5.4/5.3/5.2/5.1 (including LuaJIT) and Roblox Luau bindings to Rust with async/await support
Other
1.65k stars 135 forks source link

Unexpected seg faults and killed 9 when loading module built for aarch64-apple-darwin #82

Closed chipsenkbeil closed 3 years ago

chipsenkbeil commented 3 years ago

I'm trying to diagnose this, but it seems like when I cross-build an aarch64-apple-darwin module from the Github MacOS 11.0 runner, which is x86_64-apple-darwin, the built module causes a seg fault when loaded on my M1 Mac.

Is there anything I can look into to figure out why this is? When I build it directly on my M1 Mac, it works just fine.

I've got automated github workflow that builds all of my assets including the Lua module for Windows, Linux, and both Intel and ARM macs, and didn't realize this was happening until I tried to download the module and load it.

image
khvzak commented 3 years ago

Could you try to cross-compile module using lua51 to aarch64 and then load on our mac? They must be binary compatible.

chipsenkbeil commented 3 years ago

That's actually what I was doing since they were binary compatible. I'm trying to load the lua51 module in luajit, although loading using lua 5.1.5 has the same problem.

Check out my MacOS job under release.yml: https://github.com/chipsenkbeil/distant/blob/master/.github/workflows/release.yml

I've tried using the non-fat aarch64 dylib and the unified version.

khvzak commented 3 years ago

Could you attach please glue.rs file geneated on aarch64 mac (not cross compilation). It's located in target/debug/build/mlua-*/out/glue.rs

chipsenkbeil commented 3 years ago

From (cd distant-lua && cargo build) on my M1 Mac with lua51 and vendored enabled along with other features.

/* this file was generated by glue.c; do not modify it by hand */
use std::os::raw::*;
/* luaconf.h */
pub const LUA_EXTRASPACE: c_int = 8;
pub const LUA_IDSIZE: c_int = 60;
pub type LUA_NUMBER = c_double;
pub type LUA_INTEGER = i64;
pub type LUA_UNSIGNED = u64;
/* lua.h */
pub const LUA_VERSION_NUM: c_int = 501;
pub const LUA_REGISTRYINDEX: c_int = -10000;
pub const LUA_ENVIRONINDEX: c_int = -10001;
pub const LUA_GLOBALSINDEX: c_int = -10002;
/* lauxlib.h */
/* lualib.h */
pub const LUA_COLIBNAME: &str = "coroutine";
pub const LUA_TABLIBNAME: &str = "table";
pub const LUA_IOLIBNAME: &str = "io";
pub const LUA_OSLIBNAME: &str = "os";
pub const LUA_STRLIBNAME: &str = "string";
pub const LUA_MATHLIBNAME: &str = "math";
pub const LUA_DBLIBNAME: &str = "debug";
pub const LUA_LOADLIBNAME: &str = "package";
chipsenkbeil commented 3 years ago

I managed to export glue.rs cross-built to aarch64-apple-darwin via the Github worker macos-11.0 for a sample project (https://github.com/chipsenkbeil/mlua-release-test) and it looks like this:

/* This file was generated by build/main.rs; do not modify by hand */
use std::os::raw::*;
/* luaconf.h */
pub const LUA_EXTRASPACE: c_int = 64 / 8;
pub const LUA_IDSIZE: c_int = 60;
pub type LUA_NUMBER = c_double;
pub type LUA_INTEGER = i64;
pub type LUA_UNSIGNED = u64;
/* lua.h */
pub const LUA_VERSION_NUM: c_int = 501;
pub const LUA_REGISTRYINDEX: c_int = -1000000 - 1000;
pub const LUA_ENVIRONINDEX: c_int = -10001;
pub const LUA_GLOBALSINDEX: c_int = -10002;
/* lauxlib.h */
pub const LUAL_NUMSIZES: c_int = std::mem::size_of::<LUA_INTEGER>() as c_int * 16 + std::mem::size_of::<LUA_NUMBER>() as c_int;
/* lualib.h */

#[cfg(feature = "luajit")]
pub const LUA_BITLIBNAME: &str = "bit";
#[cfg(not(feature = "luajit"))]
pub const LUA_BITLIBNAME: &str = "bit32";

pub const LUA_COLIBNAME: &str = "coroutine";
pub const LUA_DBLIBNAME: &str = "debug";
pub const LUA_IOLIBNAME: &str = "io";
pub const LUA_LOADLIBNAME: &str = "package";
pub const LUA_MATHLIBNAME: &str = "math";
pub const LUA_OSLIBNAME: &str = "os";
pub const LUA_STRLIBNAME: &str = "string";
pub const LUA_TABLIBNAME: &str = "table";
pub const LUA_UTF8LIBNAME: &str = "utf8";

pub const LUA_JITLIBNAME: &str = "jit";
pub const LUA_FFILIBNAME: &str = "ffi";
khvzak commented 3 years ago

Thanks! LUA_REGISTRYINDEX is definitely wrong for Lua 5.1/jit. I pushed a fix 458b06796c39d3c04e7c4c250f84ccf7f5958106 Could you try please?

chipsenkbeil commented 3 years ago
  cd distant-lua
  cargo build --release --no-default-features --features "lua51,vendored" --target aarch64-apple-darwin
  ls -l ../target/aarch64-apple-darwin/release
  cp ../target/aarch64-apple-darwin/release/libdistant_lua.dylib ../distant_lua-macos-arm.dylib
  shell: /bin/bash -e {0}
  env:
    LUA_VERSION: 5.1.5
    LUA_FEATURE: lua51
    UPLOAD_NAME: macos
    X86_ARCH: x86_64-apple-darwin
    ARM_ARCH: aarch64-apple-darwin
    X86_DIR: target/x86_64-apple-darwin/release
    ARM_DIR: target/aarch64-apple-darwin/release
    BUILD_BIN: distant
    BUILD_LIB: libdistant_lua.dylib
    UNIVERSAL_REL_BIN: distant-macos
    UNIVERSAL_REL_LIB: distant_lua-macos.dylib
    X86_REL_LIB: distant_lua-macos-intel.dylib
    ARM_REL_LIB: distant_lua-macos-arm.dylib
    CACHE_ON_FAILURE: false
    CARGO_INCREMENTAL: 0
   Compiling cfg-if v1.0.0
   Compiling pin-project-lite v0.2.7
   Compiling futures-io v0.3.17
   Compiling once_cell v1.8.0
   Compiling slab v0.4.4
   Compiling cache-padded v1.1.1
   Compiling fastrand v1.5.0
   Compiling parking v2.0.0
   Compiling waker-fn v1.1.0
   Compiling smallvec v1.7.0
   Compiling scopeguard v1.1.0
   Compiling event-listener v2.5.1
   Compiling futures-sink v0.3.17
   Compiling async-task v4.0.3
   Compiling pin-utils v0.1.0
   Compiling atomic-waker v1.0.0
   Compiling subtle v2.4.1
   Compiling bitflags v1.3.2
   Compiling foreign-types-shared v0.1.1
   Compiling same-file v1.0.6
   Compiling bytes v1.1.0
   Compiling ppv-lite86 v0.2.10
   Compiling regex-automata v0.1.10
   Compiling lazy_static v1.4.0
   Compiling regex-syntax v0.6.25
   Compiling zeroize v1.4.2
   Compiling opaque-debug v0.3.0
   Compiling half v1.7.1
   Compiling serial-unix v0.4.0
   Compiling dirs-next v2.0.0
   Compiling async-io v1.6.0
   Compiling cipher v0.3.0
   Compiling universal-hash v0.4.1
   Compiling aead v0.4.3
   Compiling chrono v0.4.19
   Compiling async-fs v1.5.0
   Compiling filenamegen v0.2.4
   Compiling rand_chacha v0.3.1
   Compiling serial v0.4.0
   Compiling async-process v1.2.0
   Compiling async-net v1.6.1
   Compiling chacha20 v0.8.1
   Compiling poly1305 v0.7.2
   Compiling serde v1.0.130
   Compiling futures-util v0.3.17
   Compiling tokio v1.12.0
   Compiling thiserror v1.0.29
   Compiling strum v0.21.0
   Compiling simplelog v0.10.2
   Compiling rand v0.8.4
   Compiling smol v1.2.5
   Compiling chacha20poly1305 v0.9.0
   Compiling serde_cbor v0.11.2
   Compiling erased-serde v0.3.16
   Compiling futures-executor v0.3.17
   Compiling filedescriptor v0.8.1 (https://github.com/chipsenkbeil/wezterm#f25fbb73)
   Compiling tokio-util v0.6.8
   Compiling async-compat v0.2.1
error[E0428]: the name `LUA_ENVIRONINDEX` is defined multiple times
  --> /Users/runner/work/distant/distant/target/aarch64-apple-darwin/release/build/mlua-7799cd0a28e696f4/out/glue.rs:12:1
   |
11 | pub const LUA_ENVIRONINDEX: c_int = -10000;
   | ------------------------------------------- previous definition of the value `LUA_ENVIRONINDEX` here
12 | pub const LUA_ENVIRONINDEX: c_int = -10001;
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `LUA_ENVIRONINDEX` redefined here
   |
   = note: `LUA_ENVIRONINDEX` must be defined only once in the value namespace of this module

error[E0432]: unresolved import `self::lua::LUA_REGISTRYINDEX`
   --> /Users/runner/.cargo/git/checkouts/mlua-4fd10c345875f70c/ebf296f/src/ffi/mod.rs:216:80
    |
216 |     LUA_OPLE, LUA_OPLT, LUA_OPMOD, LUA_OPMUL, LUA_OPPOW, LUA_OPSUB, LUA_OPUNM, LUA_REGISTRYINDEX,
    |                                                                                ^^^^^^^^^^^^^^^^^ no `LUA_REGISTRYINDEX` in `ffi::lua`

error[E0432]: unresolved import `super::glue::LUA_REGISTRYINDEX`
  --> /Users/runner/.cargo/git/checkouts/mlua-4fd10c345875f70c/ebf296f/src/ffi/lua.rs:37:23
   |
37 | pub use super::glue::{LUA_REGISTRYINDEX, LUA_VERSION_NUM};
   |                       ^^^^^^^^^^^^^^^^^ no `LUA_REGISTRYINDEX` in `ffi::glue`

   Compiling futures v0.3.17
Some errors have detailed explanations: E0428, E0432.
For more information about an error, try `rustc --explain E0428`.
error: could not compile `mlua` due to 3 previous errors
warning: build failed, waiting for other jobs to finish...
error: build failed
Error: Process completed with exit code 101.
khvzak commented 3 years ago

Sorry I pushed another fix after getting this error from CI

chipsenkbeil commented 3 years ago
image

This is the error I see with my test repo for 0.6.5 as well, so not sure if this is unrelated.

I'm also seeing errors trying to compile my separate test crate on x86_64 for MacOS and Linux now: https://github.com/chipsenkbeil/distant/runs/3842709684

khvzak commented 3 years ago

I see you added module feature in https://github.com/chipsenkbeil/distant/commit/a6393bae80a486567046a9e3eafa305bb05ff8f3 to distant-lua-tests. This is why you getting link error.

chipsenkbeil commented 3 years ago

I see you added module feature in chipsenkbeil/distant@a6393ba to distant-lua-tests. This is why you getting link error.

Oh, whoops! You're right πŸ˜† Copy-and-paste error when I was grabbing the commit from distant-lua/Cargo.toml

chipsenkbeil commented 3 years ago

@khvzak fixed the tests per your callout, but that still leaves the "file too short" error. My Internet has gone out at my house, so I'm using my phone. Earlier, before your latest commit, I had a friend airdrop me the library from their Intel mac to my M1 mac and got the seg fault; so, this isn't just tied to the Github action from what I can see.

And to recap, the test library I made was giving me the "file too short" error prior to your commit whereas my distant-lua project was giving a seg fault but was recognized by lipo -info as aarch64. With your latest commit, the cross-build is giving me the "file too short" error for distant-lua. Not sure how to pin this one down.

khvzak commented 3 years ago

Thanks, seems I need to find M1 for testing. By the way, your (binary) libs are unsigned and would not work on macos without manual approval via "system prefenreces" 😞 I just tried to load the latest "macos-intel" image.

chipsenkbeil commented 3 years ago

Ah! Yes, I ran into that. You can actually clear it via xattr -d com.apple.quarantine distant_lua.so. This also only happens when you manually download the lib through a browser or transfer over airdrop. For my neovim plugin, I provide a facility to download the lib using curl, which avoids the quarantine flag being set and it works just fine. :) Well, to avoid the approval. It was giving me a seg fault in neovim, luajit, and lua-5.1 once I actually tried to load it. :P

If you're feeling adventurous, I'd be happy to create a guest account on my M1 Mac Mini and let you ssh into it to test things.

khvzak commented 3 years ago

If you're feeling adventurous, I'd be happy to create a guest account on my M1 Mac Mini and let you ssh into it to test things.

Thanks, it would be helpful! I cannot find any public CI thats offers m1 :( AWS also does not offers ec2 arm macs. Could you write to me email zxteam[at]protonmail.com

it’s quite unexpected to get segfault when cross-compiling to aarach64 darwin. Modules do not have any C code linked. It should be generic issue.

I see that aarch64-apple-darwin has tier2 support currently (https://doc.rust-lang.org/nightly/rustc/platform-support.html)

chipsenkbeil commented 3 years ago

@khvzak sent you an email.

As for the tier2, I'm aware :D For me, as long as I can get it working in some way - aka building natively on M1 - then I'm okay. This just stops me from being able to automatically build and release the Lua module for M1 Macs. This does seem to be related to mlua in some way as the cli binary cross-built from x86_64 to aarch64 works just fine when I download it.

chipsenkbeil commented 3 years ago

@khvzak thanks for your help here! I tried pointing directly to the latest git version and this time it worked! So I must have had Cargo.lock getting in the way or something else going on. Anyway, once we get a new release out for the fix, we can close this task. :)

khvzak commented 3 years ago

v0.6.6 released