tazz4843 / whisper-rs

Rust bindings to https://github.com/ggerganov/whisper.cpp
The Unlicense
607 stars, 105 forks

Unable to use Metal feature on Mac M1 Max (32 GB) #108

Open valiksb opened 5 months ago

valiksb commented 5 months ago

I'm fairly new to Rust and wanted to start with a project that lets me learn it while working on something I like, which is Whisper. whisper-rs seems like the ideal solution for this, but I ran into some problems, see below.

  1. I created a brand new Rust project and copied https://github.com/tazz4843/whisper-rs/blob/master/examples/audio_transcription.rs and renamed it main.rs
  2. Made the suggested changes to my Cargo.toml:
       [dependencies]
       hound = "3"
       whisper-rs = { version = "0.10.0" }
  3. Then I did cargo run
  4. It worked fine, but it took too long compared to whisper.cpp

I added the line println!("[{}]", print_system_info()); to see what was being used, and this is the output:

[AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 0 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 | ]
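For anyone who wants to check the backend flags programmatically rather than by eye, here is a minimal sketch that parses a system-info string of the shape shown above. The helper name parse_system_info is hypothetical (it is not part of whisper-rs), and the sample string is a shortened copy of the output in this comment; in a real program the string would come from print_system_info().

```rust
use std::collections::HashMap;

/// Parse a system-info string such as "[AVX = 0 | NEON = 1 | METAL = 0 | ]"
/// into a map from flag name to whether the flag is reported as 1.
/// (Hypothetical helper for illustration, not a whisper-rs API.)
fn parse_system_info(info: &str) -> HashMap<String, bool> {
    info.trim_matches(|c: char| c == '[' || c == ']' || c.is_whitespace())
        .split('|')
        .filter_map(|entry| {
            // Entries without '=' (for example the empty tail) are skipped.
            let (name, value) = entry.split_once('=')?;
            Some((name.trim().to_string(), value.trim() == "1"))
        })
        .collect()
}

fn main() {
    // Shortened sample; a real program would pass print_system_info() here.
    let info = "[AVX = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 0 | BLAS = 1 | COREML = 0 | ]";
    let flags = parse_system_info(info);
    if !flags.get("METAL").copied().unwrap_or(false) {
        eprintln!("warning: the Metal backend is not compiled in");
    }
}
```

A check like this only tells you whether Metal was compiled in. As the rest of the thread shows, METAL = 1 does not guarantee that the shader source can be found at runtime.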

Then I enabled Metal with features = ["metal"] and ran again. I got this sysinfo:

[AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 | ]

followed by this assertion failure:

ggml_metal_graph_compute: command buffer 0 failed with status 5
GGML_ASSERT: /path_to/my_whisper_sample/target/release/build/whisper-rs-sys-2c6c0c9736fdf6b6/out/whisper.cpp/ggml-metal.m:1611: false

I don't believe this is a bug, but I was not sure where to post this question/issue.

Thanks in advance

tazz4843 commented 5 months ago

I've not used Metal much so am unable to help much here. Perhaps modifying build.rs to unconditionally build Metal might help?

valiksb commented 5 months ago

thanks. Let me try that

magnusanderson-wk commented 3 months ago

Did you figure anything out? I learned that if you explicitly copy the files ggml-metal.h, ggml-metal.m, and ggml-metal.metal into the current working directory, it (whisper.cpp, I think) will try a few paths and eventually load them. But inference doesn't work (it produces garbage output). This is frustrating, as I would rather use Rust than C++ for my programs.

(I have since switched to using coreml)

dev-msp commented 2 months ago

(I'm using an M1 Pro on 13.6.2)

I'm new to debugging Rust lib build steps, so I'm sure there's terminology I'm not using that would make this clearer!

I tried cloning this project and adding a missing file (ggml-common.h) to the include list in the sys directory's Cargo.toml and now I find that the main issue is the inflexible way that whisper.cpp tries to source the Metal files.

At runtime, it either looks in an environment variable or falls back to the current directory, neither of which corresponds to the location of the Metal files, which end up in target/.
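To make the lookup order described here concrete, the following is a rough Rust model of it: use the directory from GGML_METAL_PATH_RESOURCES if it is set, otherwise fall back to the current working directory. This is only a sketch of the behavior as described in this thread, not a port of ggml-metal.m (which also tries a precompiled default.metallib and the bundle path first, per the logs further down). The function name metal_source_path is hypothetical.

```rust
use std::path::{Path, PathBuf};

/// Model of where ggml-metal.metal is looked up, as described in this thread:
/// the environment variable wins, otherwise the current working directory.
/// (Hypothetical helper; a simplification of what whisper.cpp does.)
fn metal_source_path(resources_env: Option<&str>, cwd: &Path) -> PathBuf {
    let dir = resources_env
        .map(PathBuf::from)
        .unwrap_or_else(|| cwd.to_path_buf());
    dir.join("ggml-metal.metal")
}

fn main() {
    let env_value = std::env::var("GGML_METAL_PATH_RESOURCES").ok();
    let cwd = std::env::current_dir().expect("cwd should be readable");
    let path = metal_source_path(env_value.as_deref(), &cwd);
    println!("would load {} (exists: {})", path.display(), path.exists());
}
```

Running this from the project root without the variable set shows why loading fails: the file is under target/, not in the directory cargo run starts from.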

I'm not sure if whisper.cpp needs patching to fix this, or if there is some magic we can do in the whisper-rs build process to make runtime behavior work cleanly and out of the box.

But I can report that when I set the GGML_METAL_PATH_RESOURCES environment variable at the right target/ subfolder (specifically the out directory under whisper-rs-sys), inference works properly and with the expected speed.
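As a workaround sketch for the above, a program could locate the build output itself and set the variable before creating the Whisper context. The directory layout (build/whisper-rs-sys-*/out/whisper.cpp under the profile directory) is taken from the paths quoted in this thread; the helper find_metal_resources is hypothetical, and hard-coding target/release is an assumption that only holds when running from the project root after a release build.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Search `<profile_dir>/build/whisper-rs-sys-*/out/whisper.cpp` for a
/// directory that contains ggml-metal.metal. (Hypothetical helper.)
fn find_metal_resources(profile_dir: &Path) -> Option<PathBuf> {
    for entry in fs::read_dir(profile_dir.join("build")).ok()? {
        let entry = entry.ok()?;
        // Only look inside the whisper-rs-sys build directories.
        if !entry.file_name().to_string_lossy().starts_with("whisper-rs-sys-") {
            continue;
        }
        let candidate = entry.path().join("out").join("whisper.cpp");
        if candidate.join("ggml-metal.metal").is_file() {
            return Some(candidate);
        }
    }
    None
}

fn main() {
    // Assumption: run from the project root after `cargo build --release`.
    match find_metal_resources(Path::new("target/release")) {
        Some(dir) => {
            // Must happen before the Whisper context is created.
            // set_var is an unsafe fn in the 2024 edition; on older editions
            // this block only triggers an "unused unsafe" warning.
            unsafe { std::env::set_var("GGML_METAL_PATH_RESOURCES", &dir) };
            println!("GGML_METAL_PATH_RESOURCES = {}", dir.display());
        }
        None => eprintln!("no whisper-rs-sys build output with ggml-metal.metal found"),
    }
}
```

This is only useful during development, because the path points into the local target/ directory and would not exist on another machine.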

Also - the log trampoline doesn't seem to capture the GGML_METAL_LOG_{LEVEL} calls.

tazz4843 commented 2 months ago

> But I can report that when I set the GGML_METAL_PATH_RESOURCES environment variable at the right target/ subfolder (specifically the out directory under whisper-rs-sys), inference works properly and with the expected speed.

If this is all it takes then this might be an easy fix. I can draft a PR.

dev-msp commented 2 months ago

Oh awesome - I wasn't sure if we'd really need to set this environment variable at runtime to make it work, or if we do, whether that's acceptable practice.

tazz4843 commented 2 months ago

If we need to set it at runtime I think it would be best to raise an upstream issue, but I think we'll be able to do it in build.rs without issues based off what you've said.
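For illustration, one way a build script could pass the resource directory along is the cargo:rustc-env directive, which bakes a value into the crate being compiled so it can be read with env!() and then exported at runtime. This is a sketch under assumptions, not what the commits mentioned in this thread do: the variable name WHISPER_RS_METAL_RESOURCES and the helper function are hypothetical, and the whisper.cpp subdirectory of OUT_DIR is taken from the paths quoted above.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Build the cargo directive that exposes the Metal resource directory to the
/// crate being compiled as a compile-time environment variable.
/// (WHISPER_RS_METAL_RESOURCES is a made-up name for this sketch.)
fn metal_resources_directive(out_dir: &Path) -> String {
    let resources = out_dir.join("whisper.cpp");
    format!(
        "cargo:rustc-env=WHISPER_RS_METAL_RESOURCES={}",
        resources.display()
    )
}

fn main() {
    // Cargo sets OUT_DIR for build scripts; fall back to "." so the sketch
    // also runs outside of cargo.
    let out_dir = env::var_os("OUT_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from("."));
    println!("{}", metal_resources_directive(&out_dir));
}
```

Note that cargo:rustc-env only affects the compilation of the crate whose build script prints it. It does not set anything in the environment of the final running process, so the library would still have to export GGML_METAL_PATH_RESOURCES itself before initializing Metal, and the baked-in path is only valid on the machine that built it.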

tazz4843 commented 2 months ago

See 39042a8a15d1b49fc6fed4dbf318d0bb8984a6f8, try branch try-fix-metal-build to see if it works.

dev-msp commented 2 months ago

Awesome. If I understand correctly now, my original description was slightly off: I think GGML_METAL_PATH_RESOURCES should point to the whisper.cpp dir, not to its parent, the build dir for whisper-rs-sys.

tazz4843 commented 2 months ago

Sorry that took me a bit to get back to: maybe 1e3adf1c490682a3b1f82a045a4232f116629868

eftychis commented 2 months ago

@tazz4843: I am personally getting the following on 1e3adf1.

ggml_metal_init: allocating
ggml_metal_init: found device: Apple M1 Pro
ggml_metal_init: picking default device: Apple M1 Pro
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: error: could not use bundle path to find ggml-metal.metal, falling back to trying cwd
ggml_metal_init: loading 'ggml-metal.metal'
ggml_metal_init: error: Error Domain=NSCocoaErrorDomain Code=260 "The file “ggml-metal.metal” couldn’t be opened because there is no such file." UserInfo={NSFilePath=ggml-metal.metal, NSUnderlyingError=0x600000f1d980 {Error Domain=NSPOSIXErrorDomain Code=2 "No such file or directory"}}
whisper_backend_init: ggml_backend_metal_init() failed

So it doesn't seem to be picking it up with the new change.

I can also pass it manually when running the final binary, e.g. by pointing the environment variable at target/release/build/whisper-rs-sys-f18f04168e33a9e5/out/whisper.cpp/ when running cargo run. Then I get the expected:

ggml_metal_init: GGML_METAL_PATH_RESOURCES = target/release/build/whisper-rs-sys-f18f04168e33a9e5/out/whisper.cpp/

Everything works fine and seems performant. I used a trivial example, so I did not really measure the performance.

I think the variable needs to be set at runtime. From the code @dev-msp linked https://github.com/ggerganov/whisper.cpp/blob/ac283dbce7d42735e3ed985329037bf23fe180aa/ggml-metal.m#L333, I am not sure there is a way to properly fix this in build.rs -- although I am just really quickly skimming this codebase, and can't offer any informed ideas yet.

It seems this is more something that should be part of the WhisperContext. (For context, I am trying to figure out how one can generate a statically linked, shippable binary from the example @valiksb shared.)

P.S. I am on 14.0 on M1 Pro. @dev-msp -- I am guessing you are getting the same thing, but I am curious if there are any discrepancies.

P.S.2: https://github.com/ggerganov/llama.cpp/issues/5376 Perhaps related? And the build script should embed metallib? (i.e. via https://github.com/ggerganov/whisper.cpp/blob/ac283dbce7d42735e3ed985329037bf23fe180aa/ggml-metal.m#L322)

dev-msp commented 2 months ago

@eftychis Yes, getting same thing on same hardware, though I'm on 13.6. Also runs fine when the variable is set at runtime.

hlhr202 commented 1 month ago

My solution is to just copy the "ggml-metal.metal" file from the out dir, where the whisper.cpp folder lives, to the CWD. But again, GGML_METAL_LOG is not captured by whisper_log_trampoline on the Rust side.
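The copy workaround can be scripted. Below is a minimal sketch, assuming the caller passes the whisper.cpp directory from the build output as the first command-line argument; the function name copy_metal_source and the usage string are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Copy ggml-metal.metal from `resources_dir` into `dest_dir` and return the
/// path of the copy. (Hypothetical helper for illustration.)
fn copy_metal_source(resources_dir: &Path, dest_dir: &Path) -> io::Result<PathBuf> {
    let file_name = "ggml-metal.metal";
    let dest = dest_dir.join(file_name);
    fs::copy(resources_dir.join(file_name), &dest)?;
    Ok(dest)
}

fn main() {
    // Assumption: the first argument is the out/whisper.cpp directory.
    let resources_dir = match std::env::args().nth(1) {
        Some(dir) => PathBuf::from(dir),
        None => {
            eprintln!("usage: copy-metal <path to out/whisper.cpp>");
            return;
        }
    };
    let cwd = std::env::current_dir().expect("cwd should be readable");
    match copy_metal_source(&resources_dir, &cwd) {
        Ok(dest) => println!("copied to {}", dest.display()),
        Err(err) => eprintln!("copy failed: {}", err),
    }
}
```

The copy has to be repeated whenever the whisper-rs-sys build output changes, since the hash in the build directory name is not stable.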

hlhr202 commented 1 month ago

https://github.com/tazz4843/whisper-rs/pull/148 I have just created a naive fix. Can anyone give it a try?