elixir-nx / ortex

ONNX Runtime bindings for Elixir
MIT License

Unable to use ortex from main #26

Open RudolfVonKrugstein opened 6 months ago

RudolfVonKrugstein commented 6 months ago

Hey, I am trying to use silero_vad, and for that it seems I need an up-to-date version of ort, and therefore the current version of "ortex" from main.

I am doing this on Windows 11.

I tried the following. In my dependencies I added ortex:

      {:ortex, github: "elixir-nx/ortex"}

And then I did:

mix deps.get
mix deps.compile ortex

Resulting in:

==> ortex
Compiling 6 files (.ex)
Compiling crate ortex in release mode (native/ortex)
...
Compiling lib/ortex/native.ex (it's taking more than 10s)
   Compiling ortex v0.1.0 (<project_dir>\deps\ortex\native\ortex)
   = note: `#[warn(unused_imports)]` on by default

warning: `ortex` (lib) generated 1 warning (run `cargo fix --lib -p ortex` to apply 1 suggestion)
    Finished release [optimized] target(s) in 19.43s

== Compilation error in file lib/ortex/native.ex ==
** (File.CopyError) could not copy from "<project_dir>/_build/dev/lib/ortex/priv/native/libonnxruntime.so" to "<project_dir>/_build/dev/lib/ortex/priv/native/libonnxruntime.so.1.17.0": no such file or directory
    (elixir 1.16.1) lib/file.ex:864: File.cp!/3
    lib/ortex/native.ex:9: (module)
could not compile dependency :ortex, "mix compile" failed. Errors may have been logged above. You can recompile this dependency with "mix deps.compile ortex --force", update it with "mix deps.update ortex" or clean it with "mix deps.clean ortex"

Now, when I look into <project_dir>/_build/dev/lib/ortex/priv/native/, I find a file libortex.dll but not the expected libonnxruntime.so.

I will look further into this, but if anyone has a hint ...

Thanks, Nathan

RudolfVonKrugstein commented 6 months ago

I get the same error under WSL2, only in that case the existing file is called libortex.so.

gregszumel commented 6 months ago

Hi Nathan, thanks for the detailed description.

I think the issue is coming from lib/ortex/util.ex, specifically this chunk, which assumes Linux:

    case "libonnxruntime.so.1.17.0" in onnx_runtime_filenames do
      true ->
        nil

      false ->
        File.cp!(
          Path.join([destination_dir, "libonnxruntime.so"]),
          Path.join([destination_dir, "libonnxruntime.so.1.17.0"])
        )
    end

Does manually removing this chunk solve the issue? Unfortunately, we do not have Windows machines readily available to check.
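As a less drastic alternative to deleting the chunk, the copy could be guarded so that it only runs when the unversioned library is actually present. The following is a minimal sketch under stated assumptions, not the code in ortex: the module name `OrtexCopySketch` and the function `maybe_copy_versioned/1` are hypothetical, and only the two file names are taken from the snippet above.

```elixir
defmodule OrtexCopySketch do
  # Hypothetical helper, not part of ortex. The file names come from the
  # chunk quoted above; everything else is an assumption for illustration.
  @unversioned "libonnxruntime.so"
  @versioned "libonnxruntime.so.1.17.0"

  # Copies the unversioned library to its versioned name only when the
  # source exists and the target does not. Returns an atom describing
  # what happened instead of raising File.CopyError.
  def maybe_copy_versioned(destination_dir) do
    source = Path.join(destination_dir, @unversioned)
    target = Path.join(destination_dir, @versioned)

    cond do
      File.exists?(target) ->
        :already_present

      File.exists?(source) ->
        File.cp!(source, target)
        :copied

      true ->
        # Nothing to copy, e.g. when only libortex.dll or libortex.so exists.
        :skipped
    end
  end
end
```

With a guard like this, a priv/native directory that contains only libortex.dll or libortex.so would be skipped rather than fail the compilation.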

RudolfVonKrugstein commented 6 months ago

It makes it compile!

I had worked around it before by using ortex 0.1.9 and copying onnxruntime.dll and onnxruntime_providers_shared.dll into my project folder.

This is the code that is working with ortex 0.1.9:

    {output, h, c} = Ortex.run(model, {input, sr, h, c})
    result = output |> Nx.backend_transfer()
    [result] = Nx.to_flat_list(result)

That worked, because the result was indeed a one element tensor with a value between 0 and 1.

But with ortex from main, I get a strange pattern matching error on the last line, which is caused by the order of the outputs having changed.

The shapes of the output are now:

    Nx.shape(output) = {2,1,64}
    Nx.shape(h) = {1,1}
    Nx.shape(c) = {2,1,64}

This suggests that output is now returned in the middle. I find that very odd, but maybe there is an explanation? Thanks!
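Until the ordering is understood, one possible stopgap is to select the probability by its shape rather than by its position in the returned tuple. This is only a sketch: `OutputByShape.pick/3` is a hypothetical helper, and the shape function is passed in so the snippet runs without Nx loaded. It cannot tell apart the two state tensors, since both have shape {2,1,64}.

```elixir
defmodule OutputByShape do
  # Hypothetical helper: returns the first element of the output tuple
  # whose shape equals `wanted`, or nil if no element matches.
  # `shape_fun` maps an element to its shape, e.g. &Nx.shape/1.
  def pick(outputs, wanted, shape_fun) when is_tuple(outputs) do
    outputs
    |> Tuple.to_list()
    |> Enum.find(fn tensor -> shape_fun.(tensor) == wanted end)
  end
end
```

Assuming Nx is available, it could be called as `OutputByShape.pick(Ortex.run(model, {input, sr, h, c}), {1, 1}, &Nx.shape/1)`.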

mortont commented 6 months ago

Good find, that's interesting... I validated that it does indeed switch the order of the output. @gregszumel do you think this could be related to the ort 2.0 changes in main?

gregszumel commented 6 months ago

Sorry for the late reply! It definitely may have something to do with the ort 2.0 changes. I'll dig into this today.

gregszumel commented 6 months ago

I could confirm that there's an output ordering discrepancy in vanilla ort 2.0, so I think it's safe to assume that ortex is just propagating this issue. Specifically, ort moved to using hash maps for outputs, which don't preserve insertion order (which I assume is the main issue here). I'll open an issue with them and ask for guidance on how they recommend we proceed.
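To illustrate the general idea (this is not what ort or ortex currently does): if outputs arrive keyed by name in an unordered map, a stable order can be restored by looking them up against an explicit, ordered list of output names. The module `OrderedOutputs` and the names "output", "hn" and "cn" below are assumptions made for the example.

```elixir
defmodule OrderedOutputs do
  # Hypothetical helper: rebuilds an ordered tuple from a name-keyed map
  # by following an explicit list of output names. Raises KeyError if a
  # name is missing from the map.
  def to_tuple(outputs_by_name, ordered_names) do
    ordered_names
    |> Enum.map(&Map.fetch!(outputs_by_name, &1))
    |> List.to_tuple()
  end
end

# Stand-in values; the output names are assumed for illustration only.
outputs = %{"cn" => :c_state, "output" => :probability, "hn" => :h_state}

{:probability, :h_state, :c_state} =
  OrderedOutputs.to_tuple(outputs, ["output", "hn", "cn"])
```

The order then comes from the name list, not from the order in which the map happens to iterate.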

ndrean commented 4 months ago

FYI, I have the same issue on OSX. Commenting out the suggested lines in lib/ortex/util.ex makes this compile.

johns10 commented 2 months ago

I also had the same issue on OSX, and commenting out those lines made it work for me.