Closed mbuczko closed 5 months ago
same problem:(
Breakpoint 2, ztest::main () at src/main.rs:5
5 let ctx = zmq::Context::new();
(gdb) step
zmq::Context::new () at /usr/local/cargo/registry/src/mirrors.sjtug.sjtu.edu.cn-7a04d2510079875b/zmq-0.9.2/src/lib.rs:434
434 ctx: unsafe { zmq_sys::zmq_ctx_new() },
(gdb) step
Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
Same problem here. Any interest in solving this problem? This is a huge issue for me right now.
Additionally, this seems to happen on any docker image, not just alpine. Tried this on ubuntu as well with no luck
@Jasper-Bekkers Any clue here? I'm trying to use this in production but I can't release it until this is fixed. This one's a pretty huge deal. Any help is appreciated.
This looks a bit like it's happening inside zmq directly. The current release binds against a very old release of zmq.
https://github.com/erickt/rust-zmq/pull/345 is the path forward - it upgrades some things (as well as the zmq library), so potentially it has some fixes that apply to you as well.
My first instinct was some kind of ffi error, but since there are no parameters to Context::new I don't think this is the case, so you'll have to step into zmq_ctx_new to really debug what's going on with your access violation. Another guess would be a libc mixup, potentially, if alpine is on musl?
I tried the zmq2 library and I get the same error with it, so I don't think it has to do with the zmq version.
Too bad! In that case I suspect you'll need to debug this a bit yourself, I'm happy to review a fix if you have one.
Alright, well i'm posting a bounty to bountysource then since this is important to me but i'm not quite sure how to solve it.
Hi. I would like to chase this bug and solve it ASAP. Can I reproduce it in docker?
I couldn't reproduce it myself.
It should be easy to reproduce in docker by just building and then running it. It won't happen at build time, it happens at runtime.
Edit: Feel free to email me at my email posted on my account and i can see about setting up a demo
I have created an example for this issue: https://github.com/daniel-brenot-apcapital/example-zmq
It seems that the dynamic linker is not operating correctly, and this is an issue with the rust compiler installed by rustup; it is not an issue with the rust compiler packaged by alpine linux.
This is where the segfault is happening:
(gdb) disas
Dump of assembler code for function _ZnwmRKSt9nothrow_t@plt:
=> 0x00007ffff7ee3050 <+0>: jmp QWORD PTR [rip+0x1184fa] # 0x7ffff7ffb550 <_ZnwmRKSt9nothrow_t@got.plt>
0x00007ffff7ee3056 <+6>: push 0x3
0x00007ffff7ee305b <+11>: jmp 0x7ffff7ee3010
(gdb) x/a 0x7ffff7ffb550
0x7ffff7ffb550 <_ZnwmRKSt9nothrow_t@got.plt>: 0x1c056
(gdb) i sharedlibrary
No shared libraries loaded at this time.
The segfault happens in the PLT stub for operator new, which is a dynamically linked function from libstdc++. The dynamic linker should have resolved this function and replaced the placeholder address 0x1c056 with the actual address of operator new from libstdc++. Also, gdb says "No shared libraries loaded at this time.", which suggests that the dynamic linker was not able to load any shared libraries.
In contrast, this is the same output from a correct binary (which I will explain how to build):
(gdb) disas
Dump of assembler code for function _ZnwmRKSt9nothrow_t@plt:
=> 0x000055555556e070 <+0>: jmp *0xf52ca(%rip) # 0x555555663340 <_ZnwmRKSt9nothrow_t@got.plt>
0x000055555556e076 <+6>: push $0x5
0x000055555556e07b <+11>: jmp 0x55555556e010
(gdb) x/a 0x555555663340
0x555555663340 <_ZnwmRKSt9nothrow_t@got.plt>: 0x7ffff7e4f6ed <_ZnwmRKSt9nothrow_t>
(gdb) i sharedlibrary
From To Syms Read Shared Object Library
0x00007ffff7f7c070 0x00007ffff7fc3877 Yes (*) /lib/ld-musl-x86_64.so.1
0x00007ffff7e4bae0 0x00007ffff7ee54a1 Yes (*) /usr/lib/libstdc++.so.6
If this problem is blocking you in production, use these instructions to solve it now:
Install and use the rust compiler provided by the rust package from the official alpine linux repository, and do not use rustup.
This is a modified dockerfile based on @daniel-brenot-apcapital's test sample which will not produce a segfault:
FROM alpine AS builder
RUN apk add g++ rust cargo
ENV CARGO_HOME=/usr/local/cargo
WORKDIR /home/user/src
COPY . .
RUN CXXFLAGS=-DZMQ_HAVE_STRLCPY cargo install --path .
FROM alpine AS deploy
RUN apk add libstdc++
COPY --from=builder /usr/local/cargo/bin/ .
This is a similar dockerfile which will produce a segfault. The only difference is that the rust compiler is installed by rustup:
FROM alpine AS builder
RUN apk add g++ rustup
ENV CARGO_HOME=/usr/local/cargo \
PATH=/usr/local/cargo/bin:$PATH
RUN rustup-init -y --no-modify-path --profile minimal
WORKDIR /home/user/src
COPY . .
RUN CXXFLAGS=-DZMQ_HAVE_STRLCPY cargo install --path .
FROM alpine AS deploy
RUN apk add libstdc++
COPY --from=builder /usr/local/cargo/bin/ .
I will look further to see why the dynamic linker does not work when rustc is installed by rustup.
The reason why the compiler packaged by alpine works is that it does not statically link by default, while vanilla rust does statically link by default for musl targets.
The issue with static linking is that cc, which compiles the zeromq C++ code, links libstdc++ dynamically. That is a reasonable choice because the cross compilation toolchain provided by rustup does not have a cross compiled libstdc++. But if you really want to statically link libstdc++, you can set up .cargo/config to tell cargo where to look for a cross compiled libstdc++.a for your target.
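One way to tell which kind of binary you ended up with, without gdb, is to look for a PT_INTERP program header: a dynamically linked executable names its loader (such as /lib/ld-musl-x86_64.so.1) there, while a fully static one has none. The following is only a rough sketch under the assumption that the binary is a 64-bit little-endian ELF file; the names has_interp and read_le are made up for this example and are not part of any crate.

```rust
use std::env;
use std::fs;
use std::path::PathBuf;

/// Program header type that holds the path of the dynamic loader (ELF spec).
const PT_INTERP: u64 = 3;

/// Reads `len` (at most 8) little-endian bytes at `off`, or None if out of range.
/// Hypothetical helper for this sketch.
fn read_le(image: &[u8], off: usize, len: usize) -> Option<u64> {
    let bytes = image.get(off..off.checked_add(len)?)?;
    Some(bytes.iter().rev().fold(0u64, |acc, &b| (acc << 8) | u64::from(b)))
}

/// Some(true) if a PT_INTERP header exists, Some(false) if not,
/// None if the input is not a parsable 64-bit little-endian ELF image.
fn has_interp(image: &[u8]) -> Option<bool> {
    // e_ident: magic bytes, EI_CLASS == 2 (64-bit), EI_DATA == 1 (little endian)
    if !image.starts_with(b"\x7fELF") || image.get(4) != Some(&2) || image.get(5) != Some(&1) {
        return None;
    }
    let phoff = read_le(image, 32, 8)? as usize; // e_phoff
    let phentsize = read_le(image, 54, 2)? as usize; // e_phentsize
    let phnum = read_le(image, 56, 2)? as usize; // e_phnum
    for i in 0..phnum {
        let start = phoff.checked_add(i.checked_mul(phentsize)?)?;
        // p_type is the first 4 bytes of every program header
        if read_le(image, start, 4)? == PT_INTERP {
            return Some(true);
        }
    }
    Some(false)
}

fn main() {
    // Inspect the file given as first argument, or this executable itself.
    let path = match env::args().nth(1) {
        Some(arg) => PathBuf::from(arg),
        None => env::current_exe().expect("cannot locate current executable"),
    };
    let image = fs::read(&path).expect("cannot read file");
    match has_interp(&image) {
        Some(true) => println!("{}: PT_INTERP present, a dynamic loader is requested", path.display()),
        Some(false) => println!("{}: no PT_INTERP, nothing will load shared libraries", path.display()),
        None => println!("{}: not a 64-bit little-endian ELF file", path.display()),
    }
}
```

If the segfaulting binary reports no PT_INTERP while still containing PLT calls into libstdc++, that would match the gdb output above where no shared libraries are loaded.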
The solution to this issue is to create a cargo config file .cargo/config in your package directory with the following content:
[target.x86_64-unknown-linux-musl]
rustflags = ["-C", "target-feature=-crt-static"]
This fixes the segfault induced by dynamic linking.
Otherwise, if you need to use a static libstdc++, you can change the content of the cargo config to this:
[target.x86_64-unknown-linux-musl]
rustflags = ["-L", "native=/usr/lib", "-l", "static=stdc++"]
This will statically link libstdc++, which should be located in /usr/lib. Note that this static library must be (cross) compiled for your target. If you are building on an alpine host then /usr/lib/libstdc++.a is OK.
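To confirm which of the two configurations a build actually picked up, the program can report whether it was compiled with the crt-static target feature. This is a small sketch, not part of the fix itself; the function name built_with_crt_static is made up for illustration, while cfg!(target_feature = "crt-static") is the standard compile-time check.

```rust
/// True when this crate was compiled with the `crt-static` target feature.
/// Hypothetical helper name, used only in this sketch.
fn built_with_crt_static() -> bool {
    cfg!(target_feature = "crt-static")
}

fn main() {
    if built_with_crt_static() {
        // Expected with vanilla rustup toolchains on musl targets by default.
        println!("crt-static is ON: every C/C++ dependency must be linked statically");
    } else {
        // Expected after setting target-feature=-crt-static in the cargo config.
        println!("crt-static is OFF: shared libraries such as libstdc++ are loaded at runtime");
    }
}
```

With the first config above, this should print the "OFF" line on a musl target; with the default rustup toolchain and no config, it should print "ON".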
This is great. It's a holiday where i'm from right now, but on Monday i'll create a PR with an update to the readme to let people who run into this problem know about this solution, or you may do so yourself. Once that's done i'm sure the issue can be closed and the bounty will be sent. Thank you for all your hard work!
Can we close this so this guy gets the bounty? I don't have that email account anymore and this guy solved the problem for me back then
Sure thing!
Hey, I'm trying to run the simplest possible code in an alpine:latest container:
Cargo.toml:
and it painfully dies with a core dump:
There is something weird happening during context creation which leads to the core dump. I tried to strace it a bit, but still no idea what's the root cause:
Rustc details:
(fails with alpine packaged rust too). libzmq installed from alpine package (apk add libzmq), though I tried with a self-compiled version too - same result.
Any hint what might be the root cause of this problem?