utilityai / llama-cpp-rs


vulkan build failed #454

Open hjiayz opened 1 month ago

hjiayz commented 1 month ago

The gen_vulkan_shaders step failed at:

    Command::new(vulkan_shaders_gen_bin)
        .args([
            "--glslc".as_ref(), "glslc".as_ref(),
            "--input-dir".as_ref(), vulkan_shaders_src.as_os_str(),
            "--output-dir".as_ref(), out_path.as_ref().join("vulkan-shaders.spv").as_os_str(),
            "--target-hpp".as_ref(), header.as_os_str(),
            "--target-cpp".as_ref(), source.as_os_str(),
            "--no-clean".as_ref()
        ])
        .output().expect("Could not run Vulkan shader generator");

The error happens at:

    cxx.to_command()
        .args([
            vulkan_shaders_src.join("vulkan-shaders-gen.cpp").as_os_str(),
            "-o".as_ref(), vulkan_shaders_gen_bin.as_os_str()
        ])
        .output().expect("Could not compile Vulkan shader generator");

This outputs the error:

cc1plus.exe: fatal error: ./llama.cpp\ggml\src\vulkan-shaders\vulkan-shaders-gen.cpp: No such file or directory

The directory ./llama.cpp\ggml\src\vulkan-shaders does not exist, while ./llama.cpp\ggml\src\ggml-cuda does exist.
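For reference, a minimal sketch of an early check that would surface this failure with a clearer message (it reuses the vulkan_shaders_src variable from the build.rs excerpts above and is not the current implementation):

    // Sketch only: fail early with a readable message if the vendored
    // Vulkan shader sources were not packaged with the crate.
    let gen_cpp = vulkan_shaders_src.join("vulkan-shaders-gen.cpp");
    if !gen_cpp.exists() {
        panic!("vulkan-shaders sources not found at {}", gen_cpp.display());
    }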

hjiayz commented 1 month ago

https://github.com/utilityai/llama-cpp-rs/blob/8c1430d82e1df0e1cff25f1fd822f3d723592685/llama-cpp-sys-2/Cargo.toml#L10 appears to be missing the vulkan-shaders sources.

MarcusDunn commented 3 weeks ago

A PR would be welcome! Vulkan support is maintained almost entirely by contributors other than me. If you want to ensure it stays working as llama.cpp updates, feel free to add a GitHub workflow that tests it.

thewh1teagle commented 1 week ago

Why does the project use cxx to compile llama.cpp? That way you need to handle and maintain a lot of things manually, for instance https://github.com/utilityai/llama-cpp-rs/blob/main/llama-cpp-sys-2/build.rs#L654. This is handled smoothly by CMake. Is there any reason this project does not use CMake?

When compiling llama.cpp with Vulkan there is no need to compile the shaders manually; all that is needed is to enable the flag.
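As an illustration only (not this project's current build.rs), a CMake-driven build could look roughly like this. It assumes the cmake crate as a build dependency, llama.cpp vendored at ./llama.cpp, and the GGML_VULKAN / LLAMA_BUILD_* option names from recent llama.cpp, which may differ for older revisions:

    // Sketch of a CMake-based build.rs step.
    let dst = cmake::Config::new("llama.cpp")
        // Recent llama.cpp exposes the Vulkan backend behind this option;
        // the exact flag name may differ across llama.cpp versions.
        .define("GGML_VULKAN", "ON")
        .define("LLAMA_BUILD_EXAMPLES", "OFF")
        .define("LLAMA_BUILD_TESTS", "OFF")
        .build();

    // Point cargo at the installed libraries.
    println!("cargo:rustc-link-search=native={}", dst.join("lib").display());

With this approach the Vulkan shader generation is handled inside llama.cpp's own CMake build rather than being reimplemented in build.rs.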

MarcusDunn commented 1 week ago

There was an attempt, which was decided against, in https://github.com/utilityai/llama-cpp-rs/pull/221. If I recall correctly, I could not get static linking working, which made building Docker images (which is how we deployed this) tricky.

If someone is willing to bring that home, I'm fine accepting a PR. We (Dial AI) do not currently use this in our inference solution, so it's no longer a hard requirement.

thewh1teagle commented 1 week ago

> There was an attempt, which was decided against, in #221. If I recall correctly, I could not get static linking working, which made building Docker images (which is how we deployed this) tricky.
>
> If someone is willing to bring that home, I'm fine accepting a PR. We (Dial AI) do not currently use this in our inference solution, so it's no longer a hard requirement.

I'll try to create a new PR using CMake. Do you have any special requirements for the build, e.g. some non-standard llama.cpp compile options / hacks? Can we remove the Docker image builds (the things related to ARM) and simplify the tests to a matrix over (macOS ARM, macOS Intel, Windows x86, Linux x86) that runs inference with a usage example?

MarcusDunn commented 1 week ago

CUDA on Linux x86 is the only hard requirement (I currently test this manually, as it's impossibly slow on the CI runners; see also #398).

There's a best effort to maintain CPU on Linux; everything else I'm unable to test outside of CI, so if you can get something clean working CI-wise, that's great!

Static linking I can go without, but I would prefer it if it's supported on the llama.cpp side of things.
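For reference, a hedged sketch of how static linking might be requested through the same CMake path; the BUILD_SHARED_LIBS option is standard CMake, but the library names and install layout below are assumptions and may need adjusting per platform and llama.cpp version:

    // Sketch: ask llama.cpp's CMake build for static libraries and link
    // them statically. Library names and the "lib" subdirectory are
    // assumptions, not verified against every llama.cpp version.
    let dst = cmake::Config::new("llama.cpp")
        .define("BUILD_SHARED_LIBS", "OFF")
        .define("GGML_VULKAN", "ON")
        .build();

    println!("cargo:rustc-link-search=native={}", dst.join("lib").display());
    println!("cargo:rustc-link-lib=static=llama");
    println!("cargo:rustc-link-lib=static=ggml");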