Closed by Kuinox 2 months ago
There is also a different(?) issue when trying to run with OpenCL.
```
./llama.cpp/ggml-opencl.cpp(10): fatal error C1083: Cannot open include file: 'clblast.h': No such file or directory
```
When enabling the cuda feature, I get the following error on windows:
```
[...] running: "nvcc" "-O0" "-ffunction-sections" "-fdata-sections" "-g" "-fno-omit-frame-pointer" "-m64" "-I" "./llama.cpp/ggml-cuda.h" "-Wall" "-Wextra" "--forward-unknown-to-host-compiler" "-arch=native" "/W4" "/Wall" "/wd4820" "/wd4710" "/wd4711" "/wd4820" "/wd4514" "-DGGML_USE_CUBLAS" "-DGGML_CUDA_DMMV_X=32" "-DGGML_CUDA_DMMV_Y=1" "-DK_QUANTS_PER_ITERATION=2" "-Wno-pedantic" "-o" "C:\\dev\\ai_kuinox\\target\\debug\\build\\llama_cpp_rs-dbbb5a5dac5f7f5e\\out\\./llama.cpp/ggml-cuda.o" "-c" "./llama.cpp/ggml-cuda.cu"
nvcc fatal : A single input file is required for a non-link phase when an outputfile is specified
exit code: 1
```
Are you running this in Docker? If so, are you using the provided Dockerfile?
No, I'm running this directly on Windows.
Can you try running it with the Docker image? I have never tested running directly, so I have no idea whether it works or not.
I've been trying to get this working with Tauri, and it seems to work CPU-only, but fails with CUDA on Windows. I attempted to compile it manually, and that seemed more successful. I think the issue is with the compiler flags passed to nvcc, but I can't really confirm whether that is the case.
I will try to look into it when I get some time and see how to compile directly on Windows. Can you share your steps for how you compiled CUDA directly?
I did some more testing, and adding/removing flags actually helped with compiling on Windows. I modified `compile_cuda` as follows:
```rust
fn compile_cuda(cxx_flags: &str) {
    if cfg!(target_os = "linux") || cfg!(target_os = "macos") {
        println!("cargo:rustc-link-search=native=/usr/local/cuda/lib64");
        println!("cargo:rustc-link-search=native=/opt/cuda/lib64");
        if let Ok(cuda_path) = std::env::var("CUDA_PATH") {
            println!(
                "cargo:rustc-link-search=native={}/targets/x86_64-linux/lib",
                cuda_path
            );
        }
    } else if cfg!(target_os = "windows") {
        if let Ok(cuda_path) = std::env::var("CUDA_PATH") {
            println!("cargo:rustc-link-search=native={}/lib/x64", cuda_path);
        }
    }

    let mut libs = String::from("cublas cudart cublasLt");
    if cfg!(target_os = "linux") || cfg!(target_os = "macos") {
        libs.push_str(" culibos pthread dl rt")
    }
    for lib in libs.split_whitespace() {
        println!("cargo:rustc-link-lib={}", lib);
    }

    let mut nvcc = cc::Build::new();

    let env_flags = vec![
        ("LLAMA_CUDA_DMMV_X=32", "-DGGML_CUDA_DMMV_X"),
        ("LLAMA_CUDA_DMMV_Y=1", "-DGGML_CUDA_DMMV_Y"),
        ("LLAMA_CUDA_KQUANTS_ITER=2", "-DK_QUANTS_PER_ITERATION"),
    ];

    let nvcc_flags = "--forward-unknown-to-host-compiler -arch=native";
    for nvcc_flag in nvcc_flags.split_whitespace() {
        nvcc.flag(nvcc_flag);
    }

    // Flags beginning with a backslash are passed to nvcc directly;
    // the remaining flags are handed to the archiver instead.
    for cxx_flag in cxx_flags.split_whitespace() {
        if cxx_flag.starts_with('\\') {
            nvcc.flag(cxx_flag);
        } else {
            nvcc.ar_flag(cxx_flag);
        }
    }

    // Each entry pairs an env var (with its default) and a -D define;
    // the env value wins when the variable is set.
    for env_flag in env_flags {
        let mut flag_split = env_flag.0.split('=');
        if let Ok(val) = std::env::var(flag_split.next().unwrap()) {
            nvcc.flag(&format!("{}={}", env_flag.1, val));
        } else {
            nvcc.flag(&format!("{}={}", env_flag.1, flag_split.next().unwrap()));
        }
    }

    nvcc.compiler("nvcc")
        .file("./llama.cpp/ggml-cuda.cu")
        .no_default_flags(true)
        .extra_warnings(false)
        //.flag("-Wno-pedantic")
        .ar_flag("/NODEFAULTLIB:libcmt.lib")
        .include("./llama.cpp/ggml-cuda.h")
        .compile("ggml-cuda");
}
```
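The env-override pattern in the loop above (each entry pairs a `NAME=default` string with a `-D` define, and the environment value wins when set) can be sketched as a small standalone function. This is an illustration, not code from the repo; `resolve_define` is a hypothetical name.

```rust
use std::env;

// Sketch of the env-override pattern used in compile_cuda above:
// split "NAME=default", look NAME up in the environment, and fall
// back to the default when the variable is unset.
fn resolve_define(entry: &str, define: &str) -> String {
    let mut parts = entry.splitn(2, '=');
    let name = parts.next().unwrap();
    let default = parts.next().unwrap_or("");
    let value = env::var(name).unwrap_or_else(|_| default.to_string());
    format!("{}={}", define, value)
}

fn main() {
    // With LLAMA_CUDA_DMMV_X unset, the default from the entry is used.
    env::remove_var("LLAMA_CUDA_DMMV_X");
    assert_eq!(
        resolve_define("LLAMA_CUDA_DMMV_X=32", "-DGGML_CUDA_DMMV_X"),
        "-DGGML_CUDA_DMMV_X=32"
    );

    // When the variable is set, its value overrides the default.
    env::set_var("LLAMA_CUDA_DMMV_X", "64");
    assert_eq!(
        resolve_define("LLAMA_CUDA_DMMV_X=32", "-DGGML_CUDA_DMMV_X"),
        "-DGGML_CUDA_DMMV_X=64"
    );
}
```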
Plus, these flags are added with `ar_flag` instead of `flag`, though I am not sure whether this actually achieves anything:
```rust
cx_flags.push_str(" /W4 /Wall /wd4820 /wd4710 /wd4711 /wd4820 /wd4514 /MT");
cxx_flags.push_str(" /W4 /Wall /wd4820 /wd4710 /wd4711 /wd4820 /wd4514 /MT");
```
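The core idea seems to be keeping MSVC-style options (the ones starting with `/`) off nvcc's compile line, since nvcc can misparse them as input files. A minimal pure-std sketch of that routing, written as an illustration rather than the repo's actual code (`partition_flags` is a hypothetical name; the real code uses `cc::Build::flag` and `cc::Build::ar_flag`):

```rust
// Partition a space-separated flag string: MSVC-style options (leading
// '/') are kept for the archiver step, everything else for the compiler.
fn partition_flags(flags: &str) -> (Vec<&str>, Vec<&str>) {
    let (mut compile, mut archive) = (Vec::new(), Vec::new());
    for f in flags.split_whitespace() {
        if f.starts_with('/') {
            archive.push(f); // MSVC-style: keep it away from nvcc
        } else {
            compile.push(f); // GCC-style: safe to forward to nvcc
        }
    }
    (compile, archive)
}

fn main() {
    let (compile, archive) = partition_flags("-O2 /W4 -m64 /wd4820");
    assert_eq!(compile, vec!["-O2", "-m64"]);
    assert_eq!(archive, vec!["/W4", "/wd4820"]);
}
```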
Disclaimer: I have close to 0 experience in Rust and building C / C++, I was essentially just knocking rocks together and seeing what works.
Also, as an added bonus, OpenBLAS is also completely broken on Windows! I gave up on trying to get that to work. Good luck!
Any update on this? I'm getting the same error. I tried writing my own Dockerfile using the same base image as the CUDA example, and it works fine since that image uses Ubuntu; Windows support in general appears to be missing. I haven't tested it, but a quick patch like the one the person above me suggested would be helpful. Having to compile and run my Cargo project from a Docker container is a bit painful in my current configuration.
Hey, I wasn't getting any time due to my job; I will try to look into this ASAP.
> Windows support in general appears to be missing.
My fork has an updated `build.rs` that enables CUDA on Windows and Metal on Mac. You might be able to figure out what you need for Windows from my commit, @FlooferLand. It's not the cleanest thing I've ever written, but it seems to work.
https://github.com/mdrokz/rust-llama.cpp/commit/ae49e02ff667b329271f1975d78298571f7f8b76
@tbogdala Tried out your patch! Left me very confused xD I'll just leave the log here: build.txt
I received a somewhat related issue, and it seems the commit tries to run a .sh file on Windows, which of course does not work as expected. I was able to bypass the error by removing the references to the .sh script and writing a function that does what the script tries to do. However, this just caused further issues when using llama_cpp_rs as a dependency in a different project, so it is not a good solution. I'll have to take a deeper look into it.
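One way to avoid invoking a .sh script from `build.rs` is to run the equivalent command through the platform's own shell. The sketch below is a hedged illustration of that idea, not code from the repo; `platform_shell` and `run_build_step` are hypothetical names, and the real script's contents would still need to be ported command by command.

```rust
use std::process::Command;

// Pick the shell invocation for the current platform: cmd.exe on
// Windows, sh everywhere else.
fn platform_shell() -> (&'static str, &'static str) {
    if cfg!(target_os = "windows") {
        ("cmd", "/C")
    } else {
        ("sh", "-c")
    }
}

// Run one build step through the platform shell and report success.
fn run_build_step(command: &str) -> bool {
    let (shell, flag) = platform_shell();
    Command::new(shell)
        .arg(flag)
        .arg(command)
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}

fn main() {
    // `echo` exists in both cmd.exe and sh, so this runs everywhere.
    assert!(run_build_step("echo hello"));
}
```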