Closed lukegalea closed 1 year ago
Hi @lukegalea! You can call System.put_env("ENV_VAR", "value") before the Mix.install call in the top cell.
Ah, you can also pass system_env to Mix.install, like this:
Mix.install(
  [
    ...,
  ],
  system_env: %{"TOKENIZERS_BUILD" => "true"}
)
The trade-off with this approach is that each distinct configuration leads to a separate build.
@josevalim Thanks for the response! I'm hoping to make it super easy to get Livebook going on Google Colab and I'm doing what I can to keep it transparent to the user by not requiring any special config in each notebook to "dance around colab weirdness".
As you suggested, I was able to get the env vars into the standalone runtime. Thanks!
Now we just need to add this to each notebook's setup:
Mix.install(
  [
    {:rustler, ">= 0.0.0"}
  ]
)
Works well enough for now, but maybe the "Neural Network Task" smart cell should add Rustler alongside kino_bumblebee and exla, since forced rebuilds won't work without it?
@lukegalea we can also start doing precompilation for the Google Colab environment. What is the error you got? Do you know which architecture they use there? What does :erlang.system_info(:system_architecture) return in Livebook?
Colab runs in Ubuntu 18.04, so you'll get errors for both EXLA and Tokenizers having been compiled against GLIBC that's too new.
The system architecture is "'x86_64-pc-linux-gnu'", so I think the real root cause of all this is that rustler_precompiled is only aware of the architecture, not the GLIBC version. It isn't smart enough to pick the correct build, so we can't add native builds of Tokenizers and EXLA for this environment.
I've been getting around it by hard-coding XLA_URL to my specific build, but I think the better fix might be to make rustler_precompiled GLIBC-target aware.
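To illustrate the point about architecture versus GLIBC version, here is a minimal sketch for inspecting both from a Livebook cell. The GlibcProbe module and its functions are hypothetical helpers written for this example, not part of Livebook or rustler_precompiled, and the parsing assumes the glibc-style `ldd --version` banner, where the version is the last token on the first line.

```elixir
defmodule GlibcProbe do
  # Hypothetical helper for illustration only.

  # Extracts a version such as "2.27" from the first line of
  # `ldd --version` output. Returns :error if no version is found.
  def parse_version(output) do
    first_line = output |> String.split("\n", parts: 2) |> hd()

    case Regex.run(~r/(\d+\.\d+)\s*$/, first_line) do
      [_, version] -> {:ok, version}
      _ -> :error
    end
  end

  # Runs `ldd --version` when ldd is available on the PATH.
  def detect do
    case System.find_executable("ldd") do
      nil ->
        :error

      ldd ->
        {output, _status} = System.cmd(ldd, ["--version"], stderr_to_stdout: true)
        parse_version(output)
    end
  end
end

# The architecture string alone does not reveal the libc version.
IO.inspect(:erlang.system_info(:system_architecture), label: "architecture")
IO.inspect(GlibcProbe.detect(), label: "glibc")
```

Two hosts can report the same architecture here while returning different GLIBC versions, which is the distinction the comment above says the precompiled-build selection is missing.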
I see. The troubling part is that Ubuntu 18 is going EOL early 2023.
Could you update GCC and point Elixir to use it?
https://stackoverflow.com/questions/72513993/how-to-install-glibc-2-29-or-higher-in-ubuntu-18-04
Btw, note that you can also run Livebook on HuggingFace right now, since they added Docker support. We are looking into making it even smoother soon!
I'll give downloading new glibc a try and report back. It would be nice if that worked!
Re Ubuntu 18.04, ya.. that's not great. Looks like the devs have been aware for a LONG TIME and haven't taken any action -> https://github.com/googlecolab/colabtools/issues/1880
Thanks for the suggestion but no luck. Trying to run multiple GLIBCs just ended up breaking colab entirely.
There's a good discussion of the problem here: https://stackoverflow.com/questions/847179/multiple-glibc-libraries-on-a-single-host/851229#851229
But I think it's a reasonable solution for me to just add some documentation to the Colab notebook that warns users to add rustler to the setup block as a workaround.
@lukegalea we used to precompile XLA on Ubuntu 18, but we ran into a number of segfaults when we updated to a newer XLA, and it turned out that those are resolved by using a newer GLIBC. So while building with an older GLIBC works, users may run into such segfaults.
The standalone runtime does not include key dependencies that are required to support Bumblebee/CUDA.
To get Bumblebee running on older Ubuntu, I've had to add Rustler to the dependencies and set a few env vars (XLA_BUILD, XLA_TARGET, XLA_ARCHIVE_URL, TOKENIZERS_BUILD), but the standalone runtime doesn't inherit any of these and I can't find an easy way to configure them.
The solution proposed at https://github.com/livebook-dev/livebook/issues/1579 to pass an env variable to force recompilation isn't likely to address this concern as we don't want to manually recompile EXLA every time.
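For reference, the per-notebook setup described above might look roughly like the sketch below, using the system_env option of Mix.install. All values are assumptions for illustration: the version requirements are wide open, the XLA_TARGET value is a guess, and the archive URL is a placeholder rather than a real build. XLA_BUILD is left out here on the assumption that setting it would force a from-source build, which is what this issue is trying to avoid.

```elixir
# Sketch only: dependency versions and env var values are assumptions.
Mix.install(
  [
    {:bumblebee, ">= 0.0.0"},
    {:exla, ">= 0.0.0"},
    # Needed so the Tokenizers native code can be rebuilt locally.
    {:rustler, ">= 0.0.0"}
  ],
  system_env: %{
    # Hypothetical target name; use whatever matches your build.
    "XLA_TARGET" => "cuda",
    # Placeholder URL; point this at an archive built against the older GLIBC.
    "XLA_ARCHIVE_URL" => "https://example.com/path/to/xla-archive.tar.gz",
    # Build Tokenizers from source instead of using the precompiled binary.
    "TOKENIZERS_BUILD" => "true"
  }
)
```

Because the variables are passed to Mix.install directly, this does not depend on the standalone runtime inheriting the environment, at the cost of repeating the block in every notebook.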