elixir-nx / bumblebee

Pre-trained Neural Network models in Axon (+ 🤗 Models integration)
Apache License 2.0

Weird behaviour with progress status. #324

Closed: hickscorp closed this issue 5 months ago

hickscorp commented 5 months ago

I'm trying to use a model with Bumblebee, and while it downloads I get this kind of log:

{:ok, #PID<0.769.0>}
|===================================================                                                 |  51% (25.76/50.78 KB)Failed to write log message to stdout, trying stderr
[error] GenServer #PID<0.65.0> terminating
** (FunctionClauseError) no function clause matching in :prim_tty.cols/2
    (kernel 9.0.1) prim_tty.erl:935: :prim_tty.cols([ansi: "\e[2K"], true)
    (kernel 9.0.1) prim_tty.erl:933: :prim_tty."-cols_multiline/4-lc$^0/1-0-"/3
    (kernel 9.0.1) prim_tty.erl:933: :prim_tty.cols_multiline/4
    (kernel 9.0.1) prim_tty.erl:628: :prim_tty.handle_request/2
    (kernel 9.0.1) prim_tty.erl:551: :prim_tty.handle_request/2
    (kernel 9.0.1) user_drv.erl:781: :user_drv.io_request/2
    (kernel 9.0.1) user_drv.erl:834: :user_drv.io_requests/2
    (kernel 9.0.1) user_drv.erl:777: :user_drv.io_request/2
    (kernel 9.0.1) user_drv.erl:887: :user_drv.handle_req/3
    (kernel 9.0.1) user_drv.erl:491: :user_drv.server/3
    (stdlib 5.0.1) gen_statem.erl:1377: :gen_statem.loop_state_callback/11
    (stdlib 5.0.1) proc_lib.erl:241: :proc_lib.init_p_do_apply/3
Last message: {:EXIT, #PID<0.70.0>, {:function_clause, [{:prim_tty, :cols, [[ansi: "\e[2K"], true], [file: ~c"prim_tty.erl", line: 935]}, {:prim_tty, :"-cols_multiline/4-lc$^0/1-0-", 3, [file: ~c"prim_tty.erl", line: 933]}, {:prim_tty, :cols_multiline, 4, [file: ~c"prim_tty.erl", line: 933]}, {:prim_tty, :handle_request, 2, [file: ~c"prim_tty.erl", line: 628]}, {:prim_tty, :handle_request, 2, [file: ~c"prim_tty.erl", line: 551]}, {:user_drv, :io_request, 2, [file: ~c"user_drv.erl", line: 781]}, {:user_drv, :io_requests, 2, [file: ~c"user_drv.erl", line: 834]}, {:user_drv, :io_request, 2, [file: ~c"user_drv.erl", line: 777]}, {:user_drv, :handle_req, 3, [file: ~c"user_drv.erl", line: 887]}, {:user_drv, :server, 3, [file: ~c"user_drv.erl", line: 491]}, {:gen_statem, :loop_state_callback, 11, [file: ~c"gen_statem.erl", line: 1377]}, {:proc_lib, :init_p_do_apply, 3, [file: ~c"proc_lib.erl", line: 241]}]}}
State: {:state, :user_sup, :undefined, #PID<0.70.0>, {#PID<0.65.0>, :user_sup}}
Failed to write log message to stdout, trying stderr
[error] Task #PID<0.771.0> started from EAIML.Bumblebee.FlanT5Test terminating
** (stop) :terminated
    (stdlib 5.0.1) io.erl:94: :io.put_chars(:standard_io, ["\e[2K\r", "|", ["=", "                                                         "], "|", ["   1%", " (0.02/2.42 MB)", ""]])
    (bumblebee 0.4.2) lib/bumblebee/utils/http.ex:100: Bumblebee.Utils.HTTP.download_receive/2
    (bumblebee 0.4.2) lib/bumblebee/utils/http.ex:52: Bumblebee.Utils.HTTP.download/3
    (bumblebee 0.4.2) lib/bumblebee/huggingface/hub.ex:96: Bumblebee.HuggingFace.Hub.cached_download/2
    (bumblebee 0.4.2) lib/bumblebee.ex:784: Bumblebee.load_tokenizer/2
    (eai_ml 0.1.0) lib/eai_ml/bumblebee/flant5_test.ex:54: anonymous fn/0 in EAIML.Bumblebee.FlanT5Test.init/1
    (elixir 1.16.0) lib/task/supervised.ex:101: Task.Supervised.invoke_mfa/2
    (elixir 1.16.0) lib/task/supervised.ex:36: Task.Supervised.reply/4
Function: #Function<1.36664021/0 in EAIML.Bumblebee.FlanT5Test.init/1>
    Args: []

I'm using Erlang/OTP 26 [erts-14.0.1] with Elixir 1.16.0, probably precompiled, installed through asdf with:

elixir 1.16
erlang 26.0.1

Update: this problem goes away with Erlang 26.2.1.

josevalim commented 5 months ago

Can you please try a more recent Erlang version? Either 26.0.2 or 26.2? It looks like an Erlang bug to me.

hickscorp commented 5 months ago

Yep, thanks Jose! The problem goes away with 26.2.1; I was trying other versions as you were answering. Closing the issue!
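
For anyone hitting the same `:prim_tty.cols/2` crash, the resolution in this thread amounts to bumping the pinned Erlang version. A minimal sketch, assuming the versions are pinned in an asdf `.tool-versions` file (the file name is asdf's convention and is not stated in the report). The Elixir line is unchanged from the original report, and 26.2.1 is the Erlang version confirmed to work above:

```
# .tool-versions (assumed file name, asdf convention)
elixir 1.16
# was 26.0.1, which crashed while printing the download progress bar
erlang 26.2.1
```

After editing the file, running `asdf install` in the project directory should fetch the pinned versions.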