[X] I have done my due diligence in trying to find the answer myself.
Topic
The Rust implementation
Question
I am getting a CUDA out-of-memory error. I am running the q8 version on WSL (Ubuntu) with an RTX 4060 that has 8 GB of VRAM. I thought this hardware could run the quantized version. Am I doing something wrong? Please help. (I also tried lower values for CUDA_COMPUTE_CAP and got the same error.)
CUDA_COMPUTE_CAP=86 cargo run --features cuda --bin moshi-backend -r -- --config moshi-backend/config-q8.json standalone
Finished release profile [optimized + debuginfo] target(s) in 1m 15s
Running target/release/moshi-backend --config moshi-backend/config-q8.json standalone
2024-09-29T20:03:12.168129Z INFO moshi_backend: build_info=BuildInfo { build_timestamp: "2024-09-22T23:05:21.856959080Z", build_date: "2024-09-22", git_branch: "main", git_timestamp: "2024-09-21T17:30:23.000000000+02:00", git_date: "2024-09-21", git_hash: "3e3e573b28a1d1d6be084185e1a2e6e550c1ddcf", git_describe: "3e3e573", rustc_host_triple: "x86_64-unknown-linux-gnu", rustc_version: "1.81.0", cargo_target_triple: "x86_64-unknown-linux-gnu" }
2024-09-29T20:03:12.168212Z INFO moshi_backend: starting process with pid 30709
Error: DriverError(CUDA_ERROR_OUT_OF_MEMORY, "out of memory")