Closed — Pox-here closed this 6 months ago
Could you provide more details about what you ran? The following works well for me:
cargo run --profile=release-with-debug --features cuda --example mistral -- --quantized --prompt "Hello "
The main thing that could trigger the error you're seeing is the candle-nn crate not having the cuda feature enabled, but that should be the case if you're enabling the cuda feature flag for anything in candle-examples.
I am not running the example; it's modified code using a Mistral-based model for inference.
However, I located the solution based on your feedback. In my Cargo.toml:
- candle-nn = { git = "https://github.com/huggingface/candle.git" }
+ candle-nn = { git = "https://github.com/huggingface/candle.git", features = ["cuda"] }
I didn't need to add any flag, and the quantized Mistral-based model now loads and runs successfully using candle. Thanks,
Closing
I fixed this by ensuring the cuda feature was enabled for both candle-nn and candle-core, like:
candle-core = { git = "https://github.com/huggingface/candle.git", version = "0.6.0", features = ["cuda"] }
candle-nn = { git = "https://github.com/huggingface/candle.git", version = "0.6.0", features = ["cuda"] }
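Why the feature has to be enabled on each dependency: Cargo features are per-crate and are not forwarded to a git dependency unless you request them there. A minimal sketch of the pattern (illustrative only, not candle's actual source) — a crate gates its CUDA code path behind a `cuda` feature, so if that feature is missing for that specific crate, the CUDA branch simply does not exist at compile time:

```rust
// Hypothetical illustration of feature-gated backends, in the style a
// crate like candle-nn might use. Without `--features cuda`, only the
// fallback function is compiled in.

#[cfg(feature = "cuda")]
fn rms_norm_backend() -> &'static str {
    "cuda" // compiled only when the crate itself has the cuda feature
}

#[cfg(not(feature = "cuda"))]
fn rms_norm_backend() -> &'static str {
    "cpu-fallback" // what you silently get if the feature was not enabled
}

fn main() {
    // Built without the cuda feature, this reports the fallback path.
    println!("rms-norm backend: {}", rms_norm_backend());
}
```

This is why enabling cuda on candle-examples alone is not enough: each crate in the dependency list needs the feature listed in its own entry.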
@sidharthrajaram
Yes, that's correct. I stated this in my previous comment too, though in a slightly unclear manner; fair criticism on my part. Thanks
Running quantized mistral:
avx: false, neon: false, simd128: false, f16c: false
temp: 0.80 repeat-penalty: 1.10 repeat-last-n: 64
loaded 291 tensors (3.08GB) in 0.03s
Current device: Cuda(CudaDevice(DeviceId(1)))
model built successfully
When attempting inference, I hit the following issue:
Error: Cuda("no cuda implementation for rms-norm")
Is this expected and something that will be introduced later, or is there an issue here? I pulled the latest main, including the fix for the "not a f64 F32(1e-5)" error I first encountered: https://github.com/huggingface/candle/pull/1913
Any suggestions or information regarding the missing rms-norm implementation would be appreciated, thanks.
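For context on what the missing CUDA kernel computes: rms-norm normalizes each vector by its root-mean-square and scales by a learned weight, y_i = x_i / sqrt(mean(x^2) + eps) * w_i. A plain-Rust CPU reference (a hypothetical standalone helper, not candle's API) looks like this:

```rust
// Illustrative CPU reference for rms-norm; `rms_norm` is a hypothetical
// helper written for this sketch, not a candle function.
fn rms_norm(x: &[f32], weight: &[f32], eps: f32) -> Vec<f32> {
    // Mean of squared elements over the normalized dimension.
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    // Reciprocal root-mean-square, with eps for numerical stability.
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    // Normalize and apply the per-element learned scale.
    x.iter().zip(weight).map(|(v, w)| v * inv_rms * w).collect()
}

fn main() {
    let x = [1.0f32, 2.0, 3.0, 4.0];
    let w = [1.0f32; 4];
    let y = rms_norm(&x, &w, 1e-5);
    println!("{y:?}");
}
```

Errors like `Cuda("no cuda implementation for ...")` generally mean the op has a CPU path but no CUDA kernel wired up for that dtype/backend yet, so a CPU reference like the above is the semantics the kernel would need to match.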