elixir-nx / xla

Pre-compiled XLA extension
Apache License 2.0

Play nicely with Elixir's releases. #71

Open hickscorp opened 5 months ago

hickscorp commented 5 months ago

I've noticed that running XLA_TARGET=xxx mix release compute always produces a library named libexla_extension.so, regardless of xxx. This is a bit problematic, because it means defining one release per architecture we want to target. For example, our mix.exs file looks like this:

  defp releases(),
    do:
      [
        web: [
          version: "0.0.1",
          applications: [...]
        ],
        worker: [
          version: "0.0.1",
          applications: [...]
        ]
      ]
      |> add_compute_releases([:cuda120, :rocm])

  defp add_compute_releases(acc, xla_targets),
    do:
      xla_targets
      |> Enum.reduce(acc, fn xla_target, acc ->
        acc ++
          [
            {:"compute_#{xla_target}",
             [
               version: "0.0.1",
               applications: [...]
             ]}
          ]
      end)

This makes the following releases available: web, worker, compute_cuda120, compute_rocm. This is a bit cumbersome, given that the only thing that differs between the two compute_xxx releases seems to be libexla_extension.so...
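To make the resulting release names concrete, here is a minimal, self-contained sketch of the same idea that can be run as a script. The module name ReleaseSketch is hypothetical, and the applications lists are left empty because the snippet above elides them; it also uses Enum.map instead of Enum.reduce, which should yield the same list.

```elixir
defmodule ReleaseSketch do
  # Hypothetical stand-in for the releases/0 helper in mix.exs.
  # The applications lists are empty placeholders, since the original elides them.
  def releases do
    base = [
      web: [version: "0.0.1", applications: []],
      worker: [version: "0.0.1", applications: []]
    ]

    add_compute_releases(base, [:cuda120, :rocm])
  end

  # Appends one compute_<target> release per XLA target.
  def add_compute_releases(acc, xla_targets) do
    acc ++
      Enum.map(xla_targets, fn xla_target ->
        {:"compute_#{xla_target}", [version: "0.0.1", applications: []]}
      end)
  end
end

# Prints [:web, :worker, :compute_cuda120, :compute_rocm]
IO.inspect(Keyword.keys(ReleaseSketch.releases()))
```

Each extra target adds a whole release entry, even though the release options themselves are identical.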

A few ideas here:

Let me know if I misunderstood anything!

jonatanklosko commented 5 months ago

As far as I understand, the OpenXLA goal is to eventually have each platform abstracted away into a self-contained plugin that can be compiled separately and loaded dynamically. Once that's stable, we may have a single base binary and several plugin binaries that can be downloaded, which should make it easy to support multiple targets as well. This is ongoing work though. From a brief look, JAX did it for CUDA, but they reverted it from the suggested setup instructions because there were issues, so it still seems pretty experimental.

We could work out a solution with the current binaries, but honestly I think it's worth waiting for the plugins. FWIW it's a rare use case to build for both CUDA and ROCm (note that we don't even have precompiled archives for ROCm), and using CPU and CUDA is possible with the single CUDA binary.
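For illustration, the CPU-plus-CUDA setup might look like the config sketch below. It assumes the application uses EXLA on top of this package and was built with a CUDA XLA_TARGET; the client names host and cuda are chosen for the example and are not taken from this thread.

```elixir
# config/config.exs (sketch, assuming EXLA is a dependency)
import Config

# Declare one client per platform; both are served by the same CUDA build.
config :exla, :clients,
  host: [platform: :host],
  cuda: [platform: :cuda]

# Pick the default client; switch to client: :host to run on the CPU.
config :nx, :default_backend, {EXLA.Backend, client: :cuda}
```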