livebook-dev / livebook

Automate code & data workflows with interactive Elixir notebooks
https://livebook.dev
Apache License 2.0

Cannot set dependencies and environment vars when launching standalone runtime #1598

Closed lukegalea closed 1 year ago

lukegalea commented 1 year ago

The standalone runtime does not include key dependencies that are required to support Bumblebee/CUDA.

To get Bumblebee running on older Ubuntu, I've had to add Rustler to the dependencies and set a few env vars (XLA_BUILD, XLA_TARGET, XLA_ARCHIVE_URL, TOKENIZERS_BUILD), but the standalone runtime doesn't inherit any of these and I can't find an easy way to configure them.

The solution proposed at https://github.com/livebook-dev/livebook/issues/1579 to pass an env variable to force recompilation isn't likely to address this concern as we don't want to manually recompile EXLA every time.

josevalim commented 1 year ago

Hi @lukegalea!

  1. You can set env vars at the top of a notebook by calling System.put_env("ENV_VAR", "value") before the Mix.install call in the top cell
  2. You can also set env vars in the Settings page that apply to all notebooks
  3. If you are starting Livebook via the CLI, any variable given there should be available in the notebook too :) I just double checked it here!
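
A minimal sketch of option 1, assuming a notebook that needs TOKENIZERS_BUILD set. The variable name comes from this issue; the value and the dependency list are illustrative assumptions, not a recommended configuration:

```elixir
# Setup cell: set the variable before Mix.install so that dependency
# compilation can see it.
System.put_env("TOKENIZERS_BUILD", "true")

Mix.install([
  # Illustrative dependency list (an assumption), adjust to what the
  # notebook actually needs.
  {:rustler, ">= 0.0.0"},
  {:bumblebee, ">= 0.0.0"}
])
```

The order matters here: a System.put_env call placed after Mix.install would only take effect once the dependencies had already been fetched and compiled.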

josevalim commented 1 year ago

Ah, you can also pass system_env to Mix.install, like this:

Mix.install(
  [
    ...,
  ],
  system_env: %{"TOKENIZERS_BUILD" => "true"}
)

The trade-off with this approach is that each distinct configuration leads to a separate build.

lukegalea commented 1 year ago

@josevalim Thanks for the response! I'm hoping to make it super easy to get Livebook going on Google Colab and I'm doing what I can to keep it transparent to the user by not requiring any special config in each notebook to "dance around colab weirdness".

As you suggested, I was able to get the env vars into the standalone runtime. Thanks!

Now we just need to add Rustler to each notebook's setup:

Mix.install(
  [
    {:rustler, ">= 0.0.0"}
  ]
)

Works well enough for now, but maybe the "Neural Network Task" smart cell should add Rustler alongside kino_bumblebee and exla, knowing that without it no forced rebuilds will work?

josevalim commented 1 year ago

@lukegalea we can also start doing precompilation for the Google Colab environment. What is the error you got? Do you know what architecture they use there? What does :erlang.system_info(:system_architecture) return in the Livebook?
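
For anyone running that check in a notebook cell, here is a small sketch. The variable names `arch` and `parts` are just illustrative:

```elixir
# :erlang.system_info/1 returns a charlist, so convert it to a string
# for display or string matching.
arch = :erlang.system_info(:system_architecture) |> List.to_string()

# Split the target triple into its dash-separated components.
parts = String.split(arch, "-")

IO.puts("architecture: #{arch}")
IO.inspect(parts, label: "components")
```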

lukegalea commented 1 year ago

Colab runs on Ubuntu 18.04, so you'll get errors for both EXLA and Tokenizers having been compiled against a GLIBC version that's too new.

The system architecture is "'x86_64-pc-linux-gnu'", so I think the real root cause of all this is that rustler_precompiled isn't aware of the GLIBC version, just the architecture, so we can't add native builds of Tokenizers and EXLA as it's not smart enough to pick the correct version.

I've been getting around it by hard-coding XLA_ARCHIVE_URL to my specific build, but I think the better fix might be to make rustler_precompiled GLIBC target aware.
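
To illustrate what "GLIBC target aware" could mean, here is a rough sketch of a host-side check. `GlibcProbe` is a hypothetical helper, not part of rustler_precompiled or any other library, and it assumes a `getconf` binary that understands `GNU_LIBC_VERSION`, as on glibc-based Linux:

```elixir
defmodule GlibcProbe do
  # Hypothetical helper for illustration only.

  # Extracts {major, minor} from text such as "glibc 2.27".
  def parse_version(text) do
    case Regex.run(~r/(\d+)\.(\d+)/, text) do
      [_, major, minor] ->
        {:ok, {String.to_integer(major), String.to_integer(minor)}}

      nil ->
        :error
    end
  end

  # Asks the host for its glibc version via `getconf GNU_LIBC_VERSION`.
  # Returns :error when getconf is missing or does not report glibc.
  def detect do
    case System.find_executable("getconf") do
      nil ->
        :error

      getconf ->
        case System.cmd(getconf, ["GNU_LIBC_VERSION"], stderr_to_stdout: true) do
          {output, 0} -> parse_version(output)
          _ -> :error
        end
    end
  end

  # True when the host glibc is at least the version a binary was
  # built against. Tuples compare element by element.
  def compatible?(host, required), do: host >= required
end
```

A precompiled-artifact picker could then call `GlibcProbe.detect/0` and fall back to a source build when `compatible?/2` is false for the version the artifact was built against.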

josevalim commented 1 year ago

I see. The troubling part is that Ubuntu 18.04 is going EOL in early 2023.

Could you update GCC and point Elixir to use it?

https://stackoverflow.com/questions/72513993/how-to-install-glibc-2-29-or-higher-in-ubuntu-18-04

josevalim commented 1 year ago

Btw, note that you can also run Livebook on HuggingFace right now, since they added Docker support. We are looking into making it even smoother soon!

lukegalea commented 1 year ago

I'll give downloading new glibc a try and report back. It would be nice if that worked!

Re Ubuntu 18.04, ya.. that's not great. Looks like the devs have been aware for a LONG TIME and haven't taken any action -> https://github.com/googlecolab/colabtools/issues/1880

lukegalea commented 1 year ago

Thanks for the suggestion but no luck. Trying to run multiple GLIBCs just ended up breaking colab entirely.

Good discussion on the problem here https://stackoverflow.com/questions/847179/multiple-glibc-libraries-on-a-single-host/851229#851229

But I think it's a reasonable solution for me to just add some documentation to the Colab notebook that warns users to add Rustler to the setup block as a workaround.

jonatanklosko commented 1 year ago

@lukegalea we used to precompile XLA on Ubuntu 18, but we ran into a number of segfaults when we updated to a newer XLA, and it turned out that those are resolved by using a newer GLIBC. So while building with an older GLIBC works, users may run into such segfaults.