CallMeSH closed this issue 2 months ago
I haven't seen this before. My first comment would be to mention +JPperf, but you already tackled that. You also said that this fails when building the Docker image directly on the Linux/amd64 machine, right?
So the only next step I can think of is to run the application and build a release on a Linux/amd64 machine without using Docker at all. If the issue persists, then it is an ADBC issue (either here or the parent one). If it works without Docker, then it is something Docker or QEMU related.
Thanks for your reply @josevalim,
Following your message:
So it seems very QEMU or Docker related...
If you are building for the same architecture you are on, it should not use emulation. Assuming you are invoking the same commands, I can’t really explain why you are seeing different behaviors.
Hum, so your comment made me think about this whole QEMU situation.
Since I was on the cloud instance and not on my mac, QEMU was not involved.
To make sure it was not, I removed the +JPperf flag, ENV ERL_FLAGS="+JPperf true", from the Dockerfile.
It built without +JPperf, so it definitely didn't run in a QEMU context.
For good measure I opted out of BuildKit as well, with the same result: it hangs.
Sooooo that must be something related to adbc in a docker context 🤔
EDIT: I noticed I can enter BREAK mode by pressing Ctrl-C when the container runs in Docker's -it mode. So the BEAM is alive, I just don't know how to gather useful info from there.
EDIT 2: Found this promising bit in the proc info
y(1) [standard_io,[<<"Failed to load nif: {:load_failed, ~c\"Failed to load NIF library: '/usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /app/_build/prod/rel/adbc_docker_sample_project/lib/adbc-0.6.0/priv/adbc_nif.so)'\"}">>,10]]
OK, so after much investigation, here is what I found: the archived dependencies built by this repo's CI are produced on GitHub's ubuntu:latest container, which is Ubuntu 24.04. Ubuntu comes bundled with GLIBCXX_3.4.29, but the Debian image used in the Dockerfile generated by Phoenix does not. By switching the base image from debian-bullseye to ubuntu-noble, I managed to run the project!
Should we update the documentation of adbc to mention somewhere that a certain version of the distributions is required to boot the app? (The issue itself can be considered closed, but I'd like to help in making the information more available 👍 )
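For anyone hitting the same error, one way to confirm it is to list the GLIBCXX versions that the image's libstdc++ actually provides. Below is a minimal sketch, assuming GNU grep and sort are available in the image. The function names glibcxx_versions and has_glibcxx are made up for illustration; the library path and the version string come from the error message above.

```shell
#!/bin/sh
# List the GLIBCXX symbol versions embedded in a shared library.
# grep -a treats the binary as text, so binutils' strings is not needed.
glibcxx_versions() {
  grep -a -o 'GLIBCXX_[0-9][0-9.]*' "$1" | sort -u -V
}

# Succeed only if the library provides exactly the requested version.
has_glibcxx() {
  glibcxx_versions "$1" | grep -q -x -F "$2"
}

# Path and version taken from the NIF load error; adjust for other images.
lib=/usr/lib/x86_64-linux-gnu/libstdc++.so.6
if [ -f "$lib" ]; then
  if has_glibcxx "$lib" GLIBCXX_3.4.29; then
    echo "GLIBCXX_3.4.29 is available"
  else
    echo "GLIBCXX_3.4.29 is missing; newest is $(glibcxx_versions "$lib" | tail -n 1)"
  fi
fi
```

Running this inside the runtime image (not the builder) should show whether the base image is too old for the adbc_nif.so that was shipped.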
Great finding. Can you please try using Debian "bookworm" and let us know if it works? Then please send a PR saying that it requires either ubuntu-noble or debian-bookworm. We will see if we can lower the requirement here.
We should also try to update it in Phoenix.
I tried building on bookworm, but the mix local.rebar --force step crashes on the image hexpm/elixir:1.17.2-erlang-26.0.2-debian-bookworm-20240812-slim
> [builder 4/17] RUN mix local.hex --force && mix local.rebar --force:
0.496 =CRASH REPORT==== 19-Aug-2024::17:38:36.151062 ===
0.496 crasher:
0.496 initial call: application_master:init/4
0.496 pid: <0.86.0>
0.496 registered_name: []
0.496 exception exit: {bad_return,
0.496 {{elixir,start,[normal,[]]},
0.496 {'EXIT',
0.496 {undef,
0.496 [{erlang,nif_error,
0.496 [undef],
0.496 [{error_info,#{module => erl_erts_errors}}]},
0.496 {prim_tty,isatty,1,
0.496 [{file,"prim_tty.erl"},{line,1249}]},
0.496 {elixir,start,2,[{file,"src/elixir.erl"},{line,63}]},
0.496 {application_master,start_it_old,4,
0.496 [{file,"application_master.erl"},{line,293}]}]}}}}
0.496 in function application_master:init/4 (application_master.erl, line 142)
0.496 ancestors: [<0.85.0>]
0.496 message_queue_len: 1
0.496 messages: [{'EXIT',<0.87.0>,normal}]
0.496 links: [<0.85.0>,<0.44.0>]
0.496 dictionary: []
0.496 trap_exit: true
0.496 status: running
0.496 heap_size: 987
0.496 stack_size: 28
0.496 reductions: 220
0.496 neighbours:
0.496
0.496 =INFO REPORT==== 19-Aug-2024::17:38:36.160739 ===
0.496 application: elixir
0.496 exited: {bad_return,
0.496 {{elixir,start,[normal,[]]},
0.496 {'EXIT',
0.496 {undef,
0.496 [{erlang,nif_error,
0.496 [undef],
0.496 [{error_info,#{module => erl_erts_errors}}]},
0.496 {prim_tty,isatty,1,
0.496 [{file,"prim_tty.erl"},{line,1249}]},
0.496 {elixir,start,2,[{file,"src/elixir.erl"},{line,63}]},
0.496 {application_master,start_it_old,4,
0.496 [{file,"application_master.erl"},
0.496 {line,293}]}]}}}}
0.496 type: temporary
0.496
0.496 =INFO REPORT==== 19-Aug-2024::17:38:36.161959 ===
0.496 application: compiler
0.496 exited: stopped
0.496 type: temporary
0.496
0.497 Runtime terminating during boot ({{badmatch,{error,{elixir,{_}}}},[{elixir,start_cli,0,[{_},{_}]},{init,start_em,1,[]},{init,do_boot,3,[]}]})
0.498
0.498 Crash dump is being written to: erl_crash.dump...done
------
Dockerfile:31
--------------------
30 | # install hex + rebar
31 | >>> RUN mix local.hex --force && \
32 | >>> mix local.rebar --force
33 |
--------------------
After looking into GitHub's runner-images, ubuntu:latest is in fact jammy. I just tried it and it works as well.
Here is the recap of what works or does not:
I think Alpine should be fine since it uses musl libc rather than glibc, but my apk-fu is not good enough to try it right now.
As I'm writing this, @cocoa-xu has already prepared a draft PR that will fix this issue I think: https://github.com/elixir-explorer/adbc/pull/100
I will investigate why Elixir is not compiling on Debian but I think #100 will indeed address this. Thank you!
Hello everyone! I've recently encountered an issue when using ADBC inside a Linux amd64 Docker container.
While the application builds successfully, running the container results in an indefinite wait with no logs. I've attempted to request "more logs" from the Phoenix project, but without success. Interestingly, the container runs fine on an Apple computer with arm64 architecture. I've tried both building from an amd64 Linux machine in the cloud and cross-compiling from my Mac, but the issue persists. After exhausting various options, I'm turning to the issues to ask for your help.
I've prepared a sample project that reproduces the problem: https://github.com/CallMeSH/ADBC-Docker-sample-project
Here is how to get the problem, starting from a new Phoenix project:
mix phx.new
mix phx.gen.release --docker
docker build . --platform=linux/amd64
apt-get phase
ENV ERL_FLAGS="+JPperf true" in the Dockerfile fixes the issue.
If anyone has ideas on how to investigate further, I'd be glad to pursue them. I apologize if I've overlooked something obvious. Thank you