bitwalker / distillery

Simplify deployments in Elixir with OTP releases!
MIT License

on_load_function failed for NIF. Best way to start debugging? Things to try? #677

Closed jnatherley closed 1 year ago

jnatherley commented 5 years ago

Steps to reproduce

Sorry guys, I'm completely new to this, but I'm looking for a way to start debugging the release process to better understand what may be failing when launching this application with the NIF I've just included. I've tried the --verbose flag; what I'm looking for is a checklist of things to try or read up on to debug the actual issue.

```
7F19DBDD9FB8:t2:A8:shutdown,H7F19DBDDA028
7F19DBDDA028:t3:A15:failed_to_start_child,AF:kernel_safe_sup,H7F19DBDDA088
7F19DBDDA088:t2:A17:on_load_function_failed,A22:Elixir.Discord.SortedSet.NifBridge
```

Description of issue

bitwalker commented 5 years ago

Something doesn't look quite right with the log output there, so you may be missing context. In general, the most common issue is running the release on a platform that differs from the one where it was built: either the CPU architecture is different or the OS is different (typically building on macOS and deploying to Linux, but a different libc can cause it as well, e.g. building against glibc and deploying with musl).
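One rough way to compare the two environments is to print a short platform fingerprint on both the build machine and the deploy target and check that the lines match. This is only a sketch: the function name `platform_fingerprint` is made up for illustration, and the libc detection is a heuristic (it assumes musl's `ldd` identifies itself in its version/usage text), not an authoritative check.

```shell
#!/bin/sh
# Hypothetical helper: print "<OS> <CPU arch> <libc guess>" on one line.
# Run it on the build host and on the deploy target, then compare the output.
platform_fingerprint() {
  os=$(uname -s)      # e.g. Linux or Darwin
  arch=$(uname -m)    # e.g. x86_64 or aarch64

  # Heuristic only: look for "musl" in ldd's version/usage text.
  # Anything else is reported generically rather than guessed at.
  if ! command -v ldd >/dev/null 2>&1; then
    libc=unknown
  elif ldd --version 2>&1 | grep -qi musl; then
    libc=musl
  else
    libc=glibc-or-other
  fi

  printf '%s %s %s\n' "$os" "$arch" "$libc"
}

platform_fingerprint
```

If the two lines differ in any field, the NIF's shared object was most likely compiled for a platform the target cannot load, and the release needs to be rebuilt on (or cross-compiled for) a matching environment.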

So the first thing is to verify that your build and target match along those lines. If they do, make sure any dynamic libraries that the NIF requires are also available and can be found by the loader, e.g. via LD_LIBRARY_PATH. If you are still seeing an issue, the next step would be to set -loader_debug in your vm.args and examine the output for additional info about the failure.
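For the dynamic-library check, something along these lines may help on a Linux target. It is a minimal sketch: `missing_libs` is a hypothetical helper name, it relies on `ldd` (Linux only; macOS would need `otool -L` instead), and the path in the usage comment is a placeholder, since where the NIF's `.so` ends up inside your release depends on the project.

```shell
#!/bin/sh
# Hypothetical helper: list shared libraries that the dynamic linker
# cannot resolve for a given NIF shared object.
# Exit status: 0 = everything resolved, 1 = something missing, 2 = bad path.
missing_libs() {
  if [ ! -f "$1" ]; then
    echo "no such file: $1" >&2
    return 2
  fi

  # ldd prints "libfoo.so => not found" for each unresolved dependency.
  if ldd "$1" 2>/dev/null | grep 'not found'; then
    return 1
  fi
  return 0
}

# Example usage (placeholder path, adjust to wherever the NIF lives in your release):
#   missing_libs path/to/your_nif.so
```

Any line it prints names a library that has to be installed on the target or made discoverable, for example through LD_LIBRARY_PATH. Running `file` on the same `.so` also shows which architecture it was compiled for.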

Attaching a debugger to the BEAM process is an option as well, but off the top of my head I don't know the best place to break and step through.