I'll look into it and get back to you asap.
Could you try building it on the production machine and seeing what happens?
The production machine doesn't have a compiler installed. When deploying an app, the code gets compiled on a build image (Ubuntu with build-essentials) and then copied to a run image (Ubuntu without build-essentials). When I create a run image locally I can log in and hash a password from the command line, but once that same box is uploaded to the server, the process crashes as soon as I call Argon2.hash_pwd_salt/2.
I'm not really familiar with Docker, which is why I'm using nanobox.io.
I've tried the bcrypt_elixir package and that one works.
Thanks for the info.
Could you try running nanobox deploy with the -v and / or --debug option and send me any of the output that you think might be useful? I really need more information.
Same problem here, on the production machine only (a virtual machine running Arch Linux, kernel 4.13.8-1-ARCH):
[37300.680648] traps: 1_dirty_cpu_sch[15355] trap invalid opcode ip:7f619895b6d4 sp:7f61dabfac30 error:0 in argon2_nif.so[7f6198956000+9000]
[37353.177130] traps: 2_dirty_cpu_sch[15901] trap invalid opcode ip:7fa54e6dd6d4 sp:7fa57cc28c30 error:0 in argon2_nif.so[7fa54e6d8000+9000]
[37594.931293] traps: 2_dirty_cpu_sch[16516] trap invalid opcode ip:7f763cb1b6d4 sp:7f7676b28c30 error:0 in argon2_nif.so[7f763cb16000+9000]
[38458.801572] traps: 2_dirty_cpu_sch[18403] trap invalid opcode ip:7fdb2c4af6d4 sp:7fdb46ce8c30 error:0 in argon2_nif.so[7fdb2c4aa000+9000]
[43766.189557] traps: 2_dirty_cpu_sch[21169] trap invalid opcode ip:7fa347b8d6d4 sp:7fa36dee8c30 error:0 in argon2_nif.so[7fa347b88000+9000]
[44029.687350] traps: 1_dirty_cpu_sch[22674] trap invalid opcode ip:7f334a99d6d4 sp:7f338cbbac30 error:0 in argon2_nif.so[7f334a998000+9000]
[50469.829968] traps: 2_dirty_cpu_sch[5771] trap invalid opcode ip:7f41890186d4 sp:7f41bb3bec30 error:0 in argon2_nif.so[7f4189013000+9000]
[50539.200284] traps: 2_dirty_cpu_sch[6597] trap invalid opcode ip:7f88818c36d4 sp:7f88bb42cc30 error:0 in argon2_nif.so[7f88818be000+9000]
[50985.231467] traps: 1_dirty_cpu_sch[7010] trap invalid opcode ip:7f1f21c886d4 sp:7f1f640edc30 error:0 in argon2_nif.so[7f1f21c83000+9000]
I also saw a core dump when running journalctl -xe:
systemd-coredump[6630]: Process 6338 (beam.smp) of user 1000 dumped core.
Stack trace of thread 6597:
#0 0x00007f88818c36d4 n/a (/home/user1/phoenix_apps/my_app/lib/argon2_elixir-1.2.8/priv/argon2_nif.so)
Deployment was done with Distillery using the following configuration:
environment :prod do
set include_erts: true
set include_src: false
..
end
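The kernel trap lines above carry enough information to locate the faulting instruction inside argon2_nif.so: subtracting the mapping base (the first address in the trailing brackets) from ip gives an offset into the library. A minimal sketch of that arithmetic in shell; trap_offset is a hypothetical helper name, and it assumes the log line has exactly the format quoted above.

```shell
# Hypothetical helper: given one "traps: ... invalid opcode" kernel log line,
# print the offset of the faulting instruction relative to the mapping base.
trap_offset() {
  line=$1
  # Address of the faulting instruction (the "ip:" field).
  ip=$(printf '%s\n' "$line" | sed -n 's/.* ip:\([0-9a-f]*\) .*/\1/p')
  # Base address of the mapping, taken from the trailing "[base+size]".
  base=$(printf '%s\n' "$line" | sed -n 's/.*\[\([0-9a-f]*\)+[0-9a-f]*]$/\1/p')
  printf '0x%x\n' $(( 0x$ip - 0x$base ))
}

trap_offset '[37300.680648] traps: 1_dirty_cpu_sch[15355] trap invalid opcode ip:7f619895b6d4 sp:7f61dabfac30 error:0 in argon2_nif.so[7f6198956000+9000]'
# prints 0x56d4
```

Every trap line quoted above yields the same offset, which suggests that one and the same instruction fails each time. Assuming the mapping starts at the beginning of the file, that offset could then be inspected with a disassembler such as objdump -d on priv/argon2_nif.so.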
What version of Erlang are you using?
Using here Erlang/OTP 20 [erts-9.0.1] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false] and Elixir 1.5.2.
It seems to me that the error occurred when calling
Comeonin.Argon2.add_hash(password)
There was no hint in the online logs, nor was there an erl_crash.dump file, so it's somewhat difficult to track down.
Erlang/OTP 20 [erts-9.1] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false]
Interactive Elixir (1.5.2) - press Ctrl+C to exit (type h() ENTER for help)
I'm using this pkg if it helps
http://pkgsrc.nanobox.io/nanobox/base/Linux/argon2-20161029nb1.tgz
I'm confused as to why you need a separate argon2 package. The argon2_elixir hex package contains all the code you need.
I've got the same problem too.
Server: Ubuntu, 512MB RAM (the cheapest DigitalOcean droplet).
Here is my Erlang/OTP version:
Erlang/OTP 20 [erts-9.1] [source] [64-bit] [smp:1:1] [ds:1:1:10] [async-threads:10] [hipe] [kernel-poll:false]
I've already included argon2_elixir in my mix file and compile it with Distillery inside Docker. The build image is bitwalker/alpine-elixir:1.5.2 and the release image is alpine:3.6 (bitwalker/alpine-elixir:1.5.2 is also based on alpine:3.6).
Since Argon2 is a memory-intensive hashing algorithm, could it be that the server cannot process Argon2 hashing because of a lack of memory?
@riverrun my bad, should have added a little bit of context. I've installed that package as well and am able to run it on the command line without any problems.
I'm also running a "512 MB Memory / 20 GB Disk / AMS2 - Ubuntu 16.04.3 x64" Digital Ocean box
Version 1.2.9 has a small update to the C code.
Let me know if that works.
@riverrun thanks for the hint! I am now using Argon2.hash_pwd_salt(password) directly with version 1.2.9 but still get the same error on the production machine.
What I have also tried:
MIX_ENV=prod mix phx.server => works, but isn't really a solution.
MIX_ENV=prod mix release, then moving the built tar.gz file to the desired directory (as described here: https://hexdocs.pm/distillery/walkthrough.html), extracting it there and starting the server with ./bin/my_app foreground => works only until Argon2.hash_pwd_salt(password) is called.
Memory on my production machine should be sufficient: about 2.8GB is free (out of 6GB, mostly because of some other Rails apps).
@riverrun Thanks for the quick fix! But it's still not working for me.
I also tried various parameters such as t_cost: 1, m_cost: 1, argon2_type: 0, argon2_type: 2, parallelism: 0 (not sure how parallelism works) and hashlen: 16, but none of them work.
One thing that differs from the other reports: when I run Argon2.hash_pwd_salt("qweqwe", format: :raw_hash) there is no termination message at all; the container just stops. Before that, this error message appears:
*** ERROR: Shell process terminated! (^G to start new job) ***
I just tried calling encodedlen_nif manually with the default params and it crashes; this might help narrow down the problem.
> Argon2.Base.encodedlen_nif(6, 16, 1, 16, 32, 1)
*** ERROR: Shell process terminated! (^G to start new job) ***
Do you all have erlang-crypto installed?
If you are all on Ubuntu, try running dpkg --list | grep erlang and see if you have erlang-crypto and erlang-dev (I'm not sure if this one is needed, but ...) installed.
I'm on Arch Linux (development and production machines). There is no erlang-crypto or erlang-dev package; see: https://www.archlinux.org/packages/?q=erlang
I got some more information using the command coredumpctl gdb <id>:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `/home/user1/phoenix_apps/my_app/erts-9.0.1/bin/beam.smp -- -root /home/user1/phoenix_apps/my_app -progname home/user1/phoenix_apps/my_app/releases/0.0.1/my_app.sh -- -home /home/user1 -- -boot /home/user1/phoenix_apps/my_app/releases/0.0.1/my_app -boot_var ERTS_LIB_DIR /home/user1/phoenix_apps/my_app/erts-9.0.1/../lib -pa /home/user1/phoenix_apps/my_app/lib/my_app-0.0.1/consolidated -name my_app@127.0.0.1 -setcookie YOs9oKm:_S$.T^K$dyTZ:ZPm=SVkdX=EY~_sgR}c;AeP2F!OD(c(W0rdmdeDD2Q2 -smp auto -config /home/user1/phoenix_apps/my_app/var/sys.config -mode embedded -user Elixir.IEx.CLI -extra --no-halt +iex -- console'.
Program terminated with signal SIGILL, Illegal instruction.
#0 argon2_hash_nif (env=0x7f6832ffad80, argc=<optimized out>, argv=<optimized out>) at c_src/argon2_nif.c:90
90 c_src/argon2_nif.c: No such file or directory.
[Current thread is 1 (Thread 0x7f6832ffb700 (LWP 24628))]
(gdb)
I made some small changes to the master branch. Could you try it out?
Same again:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `/home/user1/phoenix_apps/my_app/erts-9.0.1/bin/beam.smp -- -root /home/user1 ....'.
Program terminated with signal SIGILL, Illegal instruction.
#0 argon2_hash_nif (env=0x7f03080aad80, argc=<optimized out>, argv=<optimized out>) at c_src/argon2_nif.c:90
90 c_src/argon2_nif.c: No such file or directory.
[Current thread is 1 (Thread 0x7f03080ab700 (LWP 29004))]
backtrace with gdb:
(gdb) bt
#0 argon2_hash_nif (env=0x7f03080aad80, argc=<optimized out>, argv=<optimized out>) at c_src/argon2_nif.c:90
#1 0x00000000005e1aff in erts_call_dirty_nif ()
#2 0x00000000004485d3 in erts_dirty_process_main ()
#3 0x00000000004fa4d3 in ?? ()
#4 0x000000000069a061 in ?? ()
#5 0x00007f034adb308a in start_thread () from /usr/lib/libpthread.so.0
#6 0x00007f034a8e324f in clone () from /usr/lib/libc.so.6
On the development machine the file argon2_nif.c is here: ./deps/argon2_elixir/c_src/argon2_nif.c
So why is the production version looking for argon2_nif.c? Shouldn't it look for lib/argon2_elixir-1.2.9/priv/argon2_nif.so instead?
I tried the master branch and it's still not working. I think the problem is that the NIF cannot be loaded for some reason.
Here are my debugging logs.
TL;DR: calling Argon2.Base.init fails, and so does running the code in the load_nif function by hand, which returns {:error, :bad_lib, ...}
/home/app # ls
bin erts-9.1 lib releases var
/home/app # ls lib/argon2_elixir-1.2.9/priv/
argon2_nif.so
/home/app # bin/myapp remote_console
Erlang/OTP 20 [erts-9.1] [source] [64-bit] [smp:1:1] [ds:1:1:10] [async-threads:10] [hipe] [kernel-poll:false]
Interactive Elixir (1.5.2) - press Ctrl+C to exit (type h() ENTER for help)
iex(myapp@127.0.0.1)1> Argon2.Base.init
** (RuntimeError) An error occurred when loading Argon2.
Make sure you have a C compiler and Erlang 20 installed.
If you are not using Erlang 20, either upgrade to Erlang 20 or
use bcrypt_elixir (version 0.12) or pbkdf2_elixir.
See the Comeonin wiki for more information.
(argon2_elixir) lib/argon2/base.ex:16: Argon2.Base.init/0
iex(myapp@127.0.0.1)1> path = :filename.join(:code.priv_dir(:argon2_elixir), 'argon2_nif')
'/home/app/lib/argon2_elixir-1.2.9/priv/argon2_nif'
iex(myapp@127.0.0.1)2> :erlang.load_nif(path, 0)
{:error,
{:bad_lib,
'Library module name \'Elixir.Argon2.Base\' does not match calling module \'erl_eval\''}}
@zentetsukenz do you have erlang-crypto and erlang-dev installed?
@riverrun, I just tried installing them in my release image and it's still not working. Here is my Dockerfile:
#
# Builder image
#
FROM bitwalker/alpine-elixir:1.5.2 AS builder
MAINTAINER "Wiwatta Mongkhonchit" <zentetsukenz@gmail.com>
RUN apk add --no-cache nodejs nodejs-npm
RUN apk --no-cache --update upgrade musl
WORKDIR /home
COPY . app
WORKDIR /home/app
ENV MIX_ENV=prod
RUN apk add --no-cache alpine-sdk \
&& mix local.hex --force \
&& mix local.rebar --force \
&& mix deps.get \
&& cd assets && npm install > /dev/null && node_modules/brunch/bin/brunch b -p > /dev/null && cd .. \
&& mix phx.digest \
&& mix release
#
# Release image
#
FROM alpine:3.6 AS release
MAINTAINER "Wiwatta Mongkhonchit" <zentetsukenz@gmail.com>
RUN apk --no-cache add ca-certificates bash openssl erlang-crypto erlang-dev
ENV MIX_ENV=prod
ARG VERSION
COPY --from=builder /home/app/_build/prod/rel/myapp/releases/$VERSION/myapp.tar.gz /home/myapp.tar.gz
WORKDIR /home
RUN mkdir app
RUN tar -C app -zxvf myapp.tar.gz > /dev/null \
&& rm myapp.tar.gz > /dev/null
WORKDIR app
EXPOSE 4000
ENTRYPOINT ["bin/myapp"]
CMD ["foreground"]
Actually, I think this is the NIF cross-compilation problem mentioned in the Distillery guide. I thought it could be avoided by doing the Docker build and release with the same base image, but it seems I was wrong.
I'll try to prove that and come back when I have something new.
PS.
To confirm my theory above, I copied the source code to the prod server and ran docker build there. Argon2 is working now.
removed@the-cheapest-ubuntu-on-digital-ocean:~$ docker exec -it myapp /bin/ash
/home/app # bin/myapp remote_console
Erlang/OTP 20 [erts-9.1.3] [source] [64-bit] [smp:1:1] [ds:1:1:10] [async-threads:10] [hipe] [kernel-poll:false]
Interactive Elixir (1.5.2) - press Ctrl+C to exit (type h() ENTER for help)
iex(myapp@127.0.0.1)1> Argon2.hash_pwd_salt("qweqwe")
"$argon2i$v=19$m=65536,t=6,p=1$cj3/gWL8fK5GuAcDP9qOAQ$byScVSCYBj3LU2hJj0HVoJrfG5gD9uLN5IgJLsYWNWg"
iex(myapp@127.0.0.1)2>
I also noticed a slight increase in image size, so there might be some difference between building on macOS and on Ubuntu.
removed@the-cheapest-ubuntu-on-digital-ocean:~$ docker images
REPOSITORY     TAG          IMAGE ID       CREATED              SIZE
ubuntu/myapp   0.0.3        a89bb7a9d1d5   About a minute ago   62.9MB
<none>         <none>       46dce4ee9207   About a minute ago   405MB
elixir         1.5-alpine   86592d78af10   12 hours ago         80.1MB
macos/myapp    0.0.3        27cda4fd3cbd   13 hours ago         59.5MB
I'm going to try a few changes in the Makefile. I'll let you know when they are ready, and you can try them out.
I've made a few changes to the Makefile in the develop branch.
In your case, it should print out a message like this (when compiling): **** CROSSCOMPILING ***
Let me know how that works.
@riverrun Many thanks for the prompt response. I'm trying it now.
@riverrun Sorry for the late reply (I just figured out how to make mix deps work with a git repo + submodule).
The result: it's still not working.
This is the log from compiling Argon2:
==> argon2_elixir
mkdir -p priv
cc -std=c89 -pthread -O3 -Wall -g -Iargon2/include -Iargon2/src -Ic_src -I/usr/local/lib/erlang/erts-9.1.2/include -march=native -shared -fPIC -Wl,-soname,libargon2.so.0 argon2/src/argon2.c argon2/src/core.c argon2/src/blake2/blake2b.c argon2/src/thread.c argon2/src/encoding.c c_src/argon2_nif.c argon2/src/opt.c -o priv/argon2_nif.so
Compiling 3 files (.ex)
Generated argon2_elixir app
This was compiled inside an Alpine Linux 3.6 container on a macOS host, so the cross-compiling message that you added in this commit never appears.
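The compile line logged above includes -march=native, which asks the compiler to target the CPU of the machine doing the build. If the production CPU lacks an instruction-set extension that the build CPU has, the resulting argon2_nif.so can die with an illegal-instruction signal, which would be consistent with the SIGILL reports earlier in this thread. This is one possibility to check, not a confirmed diagnosis. A rough sketch for comparing the two machines, assuming both are Linux and expose a flags line in /proc/cpuinfo (missing_cpu_flags is a made-up helper name):

```shell
# Hypothetical helper: print every CPU flag that appears in the first list
# (build machine) but not in the second list (production machine).
# Both arguments are space-separated flag lists.
missing_cpu_flags() {
  build_flags=$1
  prod_flags=$2
  for flag in $build_flags; do
    case " $prod_flags " in
      *" $flag "*) ;;                 # flag is present on production
      *) printf '%s\n' "$flag" ;;     # flag is missing on production
    esac
  done
}

# Collecting the lists (run the first command on each machine):
#   grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2
#   missing_cpu_flags "$build_list" "$prod_list"
```

An empty result would mean the production CPU advertises everything the build CPU does; any flag printed is a candidate for the instruction that production cannot execute.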
Could you send me the error message from running Argon2.hash_pwd_salt("")?
@riverrun Same as others, the container just crashed. Looks like a segfault to me.
removed@the-cheapest-ubuntu-on-digital-ocean:~$ docker exec -it myapp /bin/ash
/home/app # bin/myapp remote_console
Erlang/OTP 20 [erts-9.1.2] [source] [64-bit] [smp:1:1] [ds:1:1:10] [async-threads:10] [hipe] [kernel-poll:false]
Interactive Elixir (1.5.2) - press Ctrl+C to exit (type h() ENTER for help)
iex(myapp@127.0.0.1)1> Argon2.hash_pwd_salt("")
*** ERROR: Shell process terminated! (^G to start new job) ***
Let me try the suggestion.
Tried the suggestion, still not working.
Hi @riverrun,
This is affecting our production services as well. I've tried reverting to 1.2.8, but the segfault is still there.
Has anyone got a fix for this yet?
Downgrading to Elixir 1.4.5 and Erlang 19 has solved my issue for now.
I really need more information to help solve this issue.
Is it just production - or in development as well?
Are there any useful error messages, or does it just segfault?
There is no output; it just crashes with nothing.
I'm using Distillery, which seems to be a common theme with the other reports.
I build and run in the same image, so I do not believe this to be cross compiling related.
I'm not sure why it's OK with Erlang 19
How / what other information can I get you?
@riverrun you pointed out that you made a few changes to the Makefile in the develop branch. However, there seems to be no branch "develop" (or "development"), only "master". How do I get this branch? In addition, is the backtrace (produced with gdb) of the crash that I posted 8 days ago insufficient? What exactly do you need?
Are you all using multiple cores when deploying the app? See #5 for more info, but basically the dirty schedulers need more than one core provided to work properly.
@mihya the develop branch was deleted several days ago. Are you having the same problem, where it is not working in production but working in development?
I'm using 4 cores on the development machine and 2 cores on the production machine. uname -r is identical on both machines: 4.13.11-1-ARCH (running Arch Linux).
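The Erlang banners quoted throughout this thread already report scheduler counts. As far as I know, smp:T:O is the total and online number of normal schedulers, and ds:T:O:I gives total and online dirty CPU schedulers followed by the number of dirty IO schedulers. A small sketch that pulls the online counts out of a banner line so they can be compared across machines; the function names are illustrative and not part of any tool.

```shell
# Illustrative helpers: extract "online" scheduler counts from an
# Erlang/OTP banner line such as "... [smp:4:4] [ds:4:4:10] ...".
schedulers_online() {
  printf '%s\n' "$1" | sed -n 's/.*\[smp:[0-9]*:\([0-9]*\)].*/\1/p'
}

dirty_cpu_schedulers_online() {
  printf '%s\n' "$1" | sed -n 's/.*\[ds:[0-9]*:\([0-9]*\):[0-9]*].*/\1/p'
}

banner='Erlang/OTP 20 [erts-9.1] [source] [64-bit] [smp:1:1] [ds:1:1:10] [async-threads:10] [hipe] [kernel-poll:false]'
schedulers_online "$banner"            # prints 1
dirty_cpu_schedulers_online "$banner"  # prints 1
```

Running this over the banners in the thread shows that the crash is reported both on single-scheduler machines (smp:1:1) and on multi-scheduler ones (smp:4:4).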
@riverrun yes, same problem here. It works fine on the development machine and crashes on the production machine. Here is a writeup of what I ran into:
Building the release with Distillery:
MIX_ENV=prod mix release --verbose
gives me:
==> Building release myapp:0.0.1 using environment prod
==> One or more direct or transitive dependencies are missing from
:applications or :included_applications, they will not be included
in the release:
:elixir_make
(safe to ignore, I assume) and
...
argon2_elixir-1.2.9
from: _build/prod/lib/argon2_elixir
applications:
:kernel
:stdlib
:elixir
:logger
includes: none
...
and
... ==> Including ERTS 9.0.1 from /usr/lib/erlang/erts-9.0.1 ...
testing out the release on development machine with:
_build/prod/rel/myappt/bin/myapp console
and at the console:
Elixir.Argon2.hash_pwd_salt("asdfasdf")
gives me:
"$argon2i$v=19$m=65536,t=6,p=1$EaLRERI66Cax1+KxsiSubw$hOXgWjjXjlhK2glne2Ewenvl2tJ2m1/Yw2qbA3gSfA4"
(so it's working fine here)
Now, after moving the release to the production machine (with scp, unpacking it with tar), starting the console again:
./bin/myapp console
and testing argon2_elixir with:
Elixir.Argon2.hash_pwd_salt("asdfasdf")
gives me:
Illegal instruction (core dumped)
when reading out the memory dump with coredumpctl info:
Signal: 4 (ILL)
Timestamp: Sat 2017-11-04 11:04:59 CET (5min ago)
Command Line: /home/user1/phoenix_apps/myapp/erts-9.0.1/bin/beam.smp -- -root /home/user1/phoenix_apps/myapp -progname home/user1/phoenix_apps/myapp/releases/0.0.1/myapp.sh -- -home /home/user1 -- -boot /home/user1/phoenix_apps/myapp/releases/0.0.1/
Executable: /home/user1/phoenix_apps/myapp/erts-9.0.1/bin/beam.smp
...
Message: Process 2642 (beam.smp) of user 1000 dumped core.
Stack trace of thread 2902:
#0 0x00007f49c779f6d4 n/a (/home/user1/phoenix_apps/myapp/lib/argon2_elixir-1.2.9/priv/argon2_nif.so)
Getting the backtrace of that dump (with coredumpctl gdb):
Reading symbols from /home/user1/phoenix_apps/myapp/erts-9.0.1/bin/beam.smp...(no debugging symbols found)...done.
...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `/home/user1/phoenix_apps/myapp/erts-9.0.1/bin/beam.smp -- -root /home/user1/phoenix_apps/myapp -progname home/user1/phoenix_apps/myapp/releases/0.0.1/myapp.sh -- -home /home/user1 -- -boot /home/user1/phoenix_apps/myapp/releases/0.0.1/'.
Program terminated with signal SIGILL, Illegal instruction.
#0 argon2_hash_nif (env=0x7f4a0533ed80, argc=<optimized out>, argv=<optimized out>) at c_src/argon2_nif.c:90
90 c_src/argon2_nif.c: No such file or directory.
Thanks for the detailed report.
Could you try the following two things and let me know if there's any difference?
1. Run hash_pwd_salt with lower memory - Argon2.hash_pwd_salt("asdfasdf", m_cost: 8)
2. Disable dirty scheduling (for testing purposes) - in the argon2_nif.c file, replace:
{"hash_nif", 10, argon2_hash_nif, ERL_NIF_DIRTY_JOB_CPU_BOUND},
{"verify_nif", 3, argon2_verify_nif, ERL_NIF_DIRTY_JOB_CPU_BOUND},
with:
{"hash_nif", 10, argon2_hash_nif},
{"verify_nif", 3, argon2_verify_nif},
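The replacement described above can also be done with sed rather than by hand. A sketch, assuming the two entries appear in c_src/argon2_nif.c exactly as quoted; strip_dirty_flag is a hypothetical helper that prints the edited file to standard output and leaves the original untouched.

```shell
# Hypothetical helper: remove the ERL_NIF_DIRTY_JOB_CPU_BOUND flag from
# NIF table entries of the form {"name", arity, argon2_xxx_nif, FLAG}.
# Prints the result to stdout; the input file is not modified.
strip_dirty_flag() {
  sed 's/\(argon2_[a-z]*_nif\), *ERL_NIF_DIRTY_JOB_CPU_BOUND}/\1}/g' "$1"
}

# Example use (path as in a standard mix project, adjust as needed):
#   strip_dirty_flag deps/argon2_elixir/c_src/argon2_nif.c > /tmp/argon2_nif.c
#   cp /tmp/argon2_nif.c deps/argon2_elixir/c_src/argon2_nif.c
```

After copying the edited file back, the dependency and the release have to be rebuilt for the change to take effect.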
Thanks for your reply. I tried the following:
1. hash_pwd_salt with lower memory - Argon2.hash_pwd_salt("asdfasdf", m_cost: 8)
gives me:
Reading symbols from /home/user1/phoenix_apps/myapp/erts-9.0.1/bin/beam.smp...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `/home/user1/phoenix_apps/myapp/erts-9.0.1/bin/beam.smp -- -root /home/...'.
Program terminated with signal SIGILL, Illegal instruction.
#0 argon2_hash_nif (env=0x7faddcfbad80, argc=<optimized out>, argv=<optimized out>) at c_src/argon2_nif.c:90
90 c_src/argon2_nif.c: No such file or directory.
2a. Disabling dirty scheduling: I changed the code in deps/argon2_elixir/c_src/argon2_nif.c as advised, ran MIX_ENV=prod mix release and copied the release to the production machine. After starting ./bin/myapp console, running Elixir.Argon2.hash_pwd_salt("asdfasdf") gives me the same error messages as above.
2b. With dirty scheduling disabled, after starting ./bin/myapp console, running Argon2.hash_pwd_salt("asdfasdf", m_cost: 8) again gives me the same error message as above.
Thanks. It looks like it has nothing to do with the dirty schedulers or memory use. I'll get back to you again soon.
I'm trying to get Argon2 to work on my production server (via Nanobox.io) but have run into a problem. On my local machine, with nanobox, everything works, but as soon as I deploy it to the server I end up with the error below:
all the files