SciSharp / LLamaSharp

A C#/.NET library to run LLM (πŸ¦™LLaMA/LLaVA) on your local device efficiently.
https://scisharp.github.io/LLamaSharp
MIT License

[BUG]: Not loading cuda backend on laptop #990

Open wased89 opened 6 days ago

wased89 commented 6 days ago

Description

On my laptop with a GTX 1060 in it, I have the CPU backend and the CUDA backend installed. However, the native load log never shows it even attempting to load CUDA; instead it tries to load Vulkan (which is not installed) and eventually falls back to AVX2, running on the CPU only. It also lets me offload GPU layers, but nothing ever goes into the GPU buffer.

Reproduction Steps

Full native loading log: https://pastebin.com/TmsUg95V
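
A log like this can be captured by enabling the native log callback before the first native call; a minimal sketch, using the same NativeLibraryConfig API shown later in this thread:

using LLama.Native;

// Print llama.cpp / native loader log messages to the console.
NativeLibraryConfig.All
    .WithLogCallback((level, message) => Console.Write($"[llama {level}]: {message}"));

// Force native library loading to occur now; this emits the loader log.
NativeApi.llama_empty_call();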

Environment & Configuration

GTX 1060 6GB

Known Workarounds

No response

dschilling commented 4 days ago

I am experiencing the same issue on Ubuntu. When running the example application, even though the CUDA native library is present, LLamaSharp never checks whether it exists and instead looks only for the Vulkan library.

daniel@reteep:~/code/LLamaSharp/LLama.Examples$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 24.04.1 LTS
Release:        24.04
Codename:       noble
daniel@reteep:~/code/LLamaSharp/LLama.Examples$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Jan__6_16:45:21_PST_2023
Cuda compilation tools, release 12.0, V12.0.140
Build cuda_12.0.r12.0/compiler.32267302_0
daniel@reteep:~/code/LLamaSharp/LLama.Examples$ dotnet --version
8.0.110
daniel@reteep:~/code/LLamaSharp/LLama.Examples$ nvidia-smi -L
GPU 0: NVIDIA GeForce RTX 4090 (UUID: GPU-bfdfe6b5-e9c2-ff48-bf26-22e2f02b9757)
daniel@reteep:~/code/LLamaSharp/LLama.Examples$ git status
HEAD detached at v0.19.0
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   Program.cs

no changes added to commit (use "git add" and/or "git commit -a")
daniel@reteep:~/code/LLamaSharp/LLama.Examples$ git diff
diff --git a/LLama.Examples/Program.cs b/LLama.Examples/Program.cs
index 63114120..58ebeb6e 100644
--- a/LLama.Examples/Program.cs
+++ b/LLama.Examples/Program.cs
@@ -18,7 +18,7 @@ AnsiConsole.MarkupLineInterpolated(
     """);

 // Configure logging. Change this to `true` to see log messages from llama.cpp
-var showLLamaCppLogs = false;
+var showLLamaCppLogs = true;
 NativeLibraryConfig
    .All
    .WithLogCallback((level, message) =>
@@ -37,4 +37,4 @@ NativeLibraryConfig
 // Calling this method forces loading to occur now.
 NativeApi.llama_empty_call();

-await ExampleRunner.Run();
\ No newline at end of file
+//await ExampleRunner.Run();
\ No newline at end of file
daniel@reteep:~/code/LLamaSharp/LLama.Examples$ dotnet run --framework net8.0
/home/daniel/code/LLamaSharp/LLama.Examples/Examples/QuantizeModel.cs(5,34): warning CS1998: This async method lacks 'await' operators and will run synchronously. Consider using the 'await' operator to await non-blocking API calls, or 'await Task.Run(...)' to do CPU-bound work on a background thread. [/home/daniel/code/LLamaSharp/LLama.Examples/LLama.Examples.csproj::TargetFramework=net8.0]
/home/daniel/code/LLamaSharp/LLama.Examples/Examples/SemanticKernelMemory.cs(18,17): warning CS0219: The variable 'seed' is assigned but its value is never used [/home/daniel/code/LLamaSharp/LLama.Examples/LLama.Examples.csproj::TargetFramework=net8.0]
/home/daniel/code/LLamaSharp/LLama.Examples/Examples/SemanticKernelHomeAutomation.cs(126,21): warning CS8602: Dereference of a possibly null reference. [/home/daniel/code/LLamaSharp/LLama.Examples/LLama.Examples.csproj::TargetFramework=net8.0]
/home/daniel/code/LLamaSharp/LLama.Examples/Examples/LlavaInteractiveModeExecute.cs(96,50): warning CS8602: Dereference of a possibly null reference. [/home/daniel/code/LLamaSharp/LLama.Examples/LLama.Examples.csproj::TargetFramework=net8.0]
======================================================================================================
 __       __                                       ____     __
/\ \     /\ \                                     /\  _`\  /\ \
\ \ \    \ \ \         __       ___ ___       __  \ \,\L\_\\ \ \___       __     _ __   _____
 \ \ \  __\ \ \  __  /'__`\   /' __` __`\   /'__`\ \/_\__ \ \ \  _ `\   /'__`\  /\` __\/\  __`\
  \ \ \L\ \\ \ \L\ \/\ \L\.\_ /\ \/\ \/\ \ /\ \L\.\_ /\ \L\ \\ \ \ \ \ /\ \L\.\_\ \ \/ \ \ \L\ \
   \ \____/ \ \____/\ \__/.\_\\ \_\ \_\ \_\\ \__/.\_\\ `\____\\ \_\ \_\\ \__/.\_\\ \_\  \ \ ,__/
    \/___/   \/___/  \/__/\/_/ \/_/\/_/\/_/ \/__/\/_/ \/_____/ \/_/\/_/ \/__/\/_/ \/_/   \ \ \/
========================================================================================= \ \_\ ======
                                                                                           \/_/

[llama Debug]: Beginning dry run for llama...
[llama Debug]: Loading library: 'llama'
[llama Info]: Detected OS Platform: 'LINUX'
[llama Debug]: Detected OS string: 'linux-x64'
[llama Debug]: Detected extension string: '.so'
[llama Debug]: Detected prefix string: 'lib'
[llama Info]: NativeLibraryConfig Description:
- LibraryName: LLama
- Path: ''
- PreferCuda: True
- PreferVulkan: True
- PreferredAvxLevel: AVX2
- AllowFallback: True
- SkipCheck: False
- SearchDirectories and Priorities: { ./ }
[llama Info]: NativeLibraryConfig Description:
- LibraryName: LLama
- Path: ''
- PreferCuda: True
- PreferVulkan: True
- PreferredAvxLevel: AVX2
- AllowFallback: True
- SkipCheck: False
- SearchDirectories and Priorities: { ./ }
[llama Debug]: Got relative library path 'runtimes/linux-x64/native/vulkan/libllama.so' from local with (NativeLibraryName: LLama, UseCuda: False, UseVulkan: True, AvxLevel: None), trying to load it...
[llama Debug]: Found full path file '/home/daniel/code/LLamaSharp/LLama.Examples/bin/Debug/net8.0/runtimes/linux-x64/native/vulkan/libggml.so' for relative path 'runtimes/linux-x64/native/vulkan/libggml.so'
[llama Info]: Successfully loaded '/home/daniel/code/LLamaSharp/LLama.Examples/bin/Debug/net8.0/runtimes/linux-x64/native/vulkan/libggml.so'
[llama Info]: Successfully loaded dependency 'runtimes/linux-x64/native/vulkan/libggml.so'
[llama Debug]: Found full path file '/home/daniel/code/LLamaSharp/LLama.Examples/bin/Debug/net8.0/runtimes/linux-x64/native/vulkan/libllama.so' for relative path 'runtimes/linux-x64/native/vulkan/libllama.so'
[llama Info]: Successfully loaded '/home/daniel/code/LLamaSharp/LLama.Examples/bin/Debug/net8.0/runtimes/linux-x64/native/vulkan/libllama.so'
[llama Debug]: Beginning dry run for llava_shared...
[llama Debug]: Loading library: 'llava_shared'
[llama Info]: Detected OS Platform: 'LINUX'
[llama Debug]: Detected OS string: 'linux-x64'
[llama Debug]: Detected extension string: '.so'
[llama Debug]: Detected prefix string: 'lib'
[llama Info]: NativeLibraryConfig Description:
- LibraryName: LLava
- Path: ''
- PreferCuda: True
- PreferVulkan: True
- PreferredAvxLevel: AVX2
- AllowFallback: True
- SkipCheck: False
- SearchDirectories and Priorities: { ./ }
[llama Info]: NativeLibraryConfig Description:
- LibraryName: LLava
- Path: ''
- PreferCuda: True
- PreferVulkan: True
- PreferredAvxLevel: AVX2
- AllowFallback: True
- SkipCheck: False
- SearchDirectories and Priorities: { ./ }
[llama Debug]: Got relative library path 'runtimes/linux-x64/native/vulkan/libllava_shared.so' from local with (NativeLibraryName: LLava, UseCuda: False, UseVulkan: True, AvxLevel: None), trying to load it...
[llama Debug]: Found full path file '/home/daniel/code/LLamaSharp/LLama.Examples/bin/Debug/net8.0/runtimes/linux-x64/native/vulkan/libggml.so' for relative path 'runtimes/linux-x64/native/vulkan/libggml.so'
[llama Info]: Successfully loaded '/home/daniel/code/LLamaSharp/LLama.Examples/bin/Debug/net8.0/runtimes/linux-x64/native/vulkan/libggml.so'
[llama Info]: Successfully loaded dependency 'runtimes/linux-x64/native/vulkan/libggml.so'
[llama Debug]: Found full path file '/home/daniel/code/LLamaSharp/LLama.Examples/bin/Debug/net8.0/runtimes/linux-x64/native/vulkan/libllava_shared.so' for relative path 'runtimes/linux-x64/native/vulkan/libllava_shared.so'
[llama Info]: Successfully loaded '/home/daniel/code/LLamaSharp/LLama.Examples/bin/Debug/net8.0/runtimes/linux-x64/native/vulkan/libllava_shared.so'
[llama Debug]: Loading library: 'llama'
[llama Info]: Detected OS Platform: 'LINUX'
[llama Debug]: Detected OS string: 'linux-x64'
[llama Debug]: Detected extension string: '.so'
[llama Debug]: Detected prefix string: 'lib'
[llama Info]: NativeLibraryConfig Description:
- LibraryName: LLama
- Path: ''
- PreferCuda: True
- PreferVulkan: True
- PreferredAvxLevel: AVX2
- AllowFallback: True
- SkipCheck: False
- SearchDirectories and Priorities: { ./ }
[llama Info]: NativeLibraryConfig Description:
- LibraryName: LLama
- Path: ''
- PreferCuda: True
- PreferVulkan: True
- PreferredAvxLevel: AVX2
- AllowFallback: True
- SkipCheck: False
- SearchDirectories and Priorities: { ./ }
[llama Debug]: Got relative library path 'runtimes/linux-x64/native/vulkan/libllama.so' from local with (NativeLibraryName: LLama, UseCuda: False, UseVulkan: True, AvxLevel: None), trying to load it...
[llama Debug]: Found full path file '/home/daniel/code/LLamaSharp/LLama.Examples/bin/Debug/net8.0/runtimes/linux-x64/native/vulkan/libggml.so' for relative path 'runtimes/linux-x64/native/vulkan/libggml.so'
[llama Info]: Successfully loaded '/home/daniel/code/LLamaSharp/LLama.Examples/bin/Debug/net8.0/runtimes/linux-x64/native/vulkan/libggml.so'
[llama Info]: Successfully loaded dependency 'runtimes/linux-x64/native/vulkan/libggml.so'
[llama Debug]: Found full path file '/home/daniel/code/LLamaSharp/LLama.Examples/bin/Debug/net8.0/runtimes/linux-x64/native/vulkan/libllama.so' for relative path 'runtimes/linux-x64/native/vulkan/libllama.so'
[llama Info]: Successfully loaded '/home/daniel/code/LLamaSharp/LLama.Examples/bin/Debug/net8.0/runtimes/linux-x64/native/vulkan/libllama.so'
dschilling commented 4 days ago

My previous comment showed the issue when running the example program from the LLamaSharp source tree. Below is another demonstration of the issue, this time using the NuGet packages, in the same environment as before (Ubuntu with CUDA 12).

LLamaSharpIssue990.csproj

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="LLamaSharp" Version="0.19.0" />
    <PackageReference Include="LLamaSharp.Backend.Cuda12" Version="0.19.0" />
  </ItemGroup>

</Project>

Program.cs

using LLama.Native;
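// Prefer the CUDA backend and print the native loader's log messages.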
NativeLibraryConfig.All
    .WithCuda()
    .WithLogCallback((level, message) => Console.Write($"{level}: {message}"));
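// Calling this forces native library loading to occur now.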
NativeApi.llama_empty_call();

output

daniel@reteep:~/code/LLamaSharpIssue990$ dotnet run
Debug: Loading library: 'llama'
Info: Detected OS Platform: 'LINUX'
Debug: Detected OS string: 'linux-x64'
Debug: Detected extension string: '.so'
Debug: Detected prefix string: 'lib'
Info: NativeLibraryConfig Description:
- LibraryName: LLama
- Path: ''
- PreferCuda: True
- PreferVulkan: True
- PreferredAvxLevel: AVX2
- AllowFallback: True
- SkipCheck: False
- SearchDirectories and Priorities: { ./ }
Info: NativeLibraryConfig Description:
- LibraryName: LLama
- Path: ''
- PreferCuda: True
- PreferVulkan: True
- PreferredAvxLevel: AVX2
- AllowFallback: True
- SkipCheck: False
- SearchDirectories and Priorities: { ./ }
Debug: Got relative library path 'runtimes/linux-x64/native/vulkan/libllama.so' from local with (NativeLibraryName: LLama, UseCuda: False, UseVulkan: True, AvxLevel: None), trying to load it...
Debug: Found full path file 'runtimes/linux-x64/native/vulkan/libggml.so' for relative path 'runtimes/linux-x64/native/vulkan/libggml.so'
Info: Failed Loading 'runtimes/linux-x64/native/vulkan/libggml.so'
Info: Failed loading dependency 'runtimes/linux-x64/native/vulkan/libggml.so'
Debug: Found full path file 'runtimes/linux-x64/native/vulkan/libllama.so' for relative path 'runtimes/linux-x64/native/vulkan/libllama.so'
Info: Failed Loading 'runtimes/linux-x64/native/vulkan/libllama.so'
Debug: Got relative library path 'runtimes/linux-x64/native/avx2/libllama.so' from local with (NativeLibraryName: LLama, UseCuda: False, UseVulkan: False, AvxLevel: Avx2), trying to load it...
Debug: Found full path file 'runtimes/linux-x64/native/avx2/libggml.so' for relative path 'runtimes/linux-x64/native/avx2/libggml.so'
Info: Failed Loading 'runtimes/linux-x64/native/avx2/libggml.so'
Info: Failed loading dependency 'runtimes/linux-x64/native/avx2/libggml.so'
Debug: Found full path file 'runtimes/linux-x64/native/avx2/libllama.so' for relative path 'runtimes/linux-x64/native/avx2/libllama.so'
Info: Failed Loading 'runtimes/linux-x64/native/avx2/libllama.so'
Debug: Got relative library path 'runtimes/linux-x64/native/avx/libllama.so' from local with (NativeLibraryName: LLama, UseCuda: False, UseVulkan: False, AvxLevel: Avx), trying to load it...
Debug: Found full path file 'runtimes/linux-x64/native/avx/libggml.so' for relative path 'runtimes/linux-x64/native/avx/libggml.so'
Info: Failed Loading 'runtimes/linux-x64/native/avx/libggml.so'
Info: Failed loading dependency 'runtimes/linux-x64/native/avx/libggml.so'
Debug: Found full path file 'runtimes/linux-x64/native/avx/libllama.so' for relative path 'runtimes/linux-x64/native/avx/libllama.so'
Info: Failed Loading 'runtimes/linux-x64/native/avx/libllama.so'
Debug: Got relative library path 'runtimes/linux-x64/native/libllama.so' from local with (NativeLibraryName: LLama, UseCuda: False, UseVulkan: False, AvxLevel: None), trying to load it...
Debug: Found full path file 'runtimes/linux-x64/native/libggml.so' for relative path 'runtimes/linux-x64/native/libggml.so'
Info: Failed Loading 'runtimes/linux-x64/native/libggml.so'
Info: Failed loading dependency 'runtimes/linux-x64/native/libggml.so'
Debug: Found full path file 'runtimes/linux-x64/native/libllama.so' for relative path 'runtimes/linux-x64/native/libllama.so'
Info: Failed Loading 'runtimes/linux-x64/native/libllama.so'
Debug: Got relative library path 'runtimes/linux-x64/native/libllama.so' from local with (NativeLibraryName: LLama, UseCuda: False, UseVulkan: False, AvxLevel: None), trying to load it...
Debug: Found full path file 'runtimes/linux-x64/native/libggml.so' for relative path 'runtimes/linux-x64/native/libggml.so'
Info: Failed Loading 'runtimes/linux-x64/native/libggml.so'
Info: Failed loading dependency 'runtimes/linux-x64/native/libggml.so'
Debug: Found full path file 'runtimes/linux-x64/native/libllama.so' for relative path 'runtimes/linux-x64/native/libllama.so'
Info: Failed Loading 'runtimes/linux-x64/native/libllama.so'
Warning: No library was loaded before calling native apis. This is not an error under netstandard2.0 but needs attention with net6 or higher.
Unhandled exception. System.TypeInitializationException: The type initializer for 'LLama.Native.NativeApi' threw an exception.
 ---> LLama.Exceptions.RuntimeError: The native library cannot be correctly loaded. It could be one of the following reasons: 
1. No LLamaSharp backend was installed. Please search LLamaSharp.Backend and install one of them. 
2. You are using a device with only CPU but installed cuda backend. Please install cpu backend instead. 
3. One of the dependency of the native library is missed. Please use `ldd` on linux, `dumpbin` on windows and `otool`to check if all the dependency of the native library is satisfied. Generally you could find the libraries under your output folder.
4. Try to compile llama.cpp yourself to generate a libllama library, then use `LLama.Native.NativeLibraryConfig.WithLibrary` to specify it at the very beginning of your code. For more information about compilation, please refer to LLamaSharp repo on github.

   at LLama.Native.NativeApi..cctor()
   --- End of inner exception stack trace ---
   at LLama.Native.NativeApi.llama_empty_call()
   at Program.<Main>$(String[] args) in /home/daniel/code/LLamaSharpIssue990/Program.cs:line 5

Here are the resulting libraries from installing the LLamaSharp.Backend.Cuda12 package:

daniel@reteep:~/code/LLamaSharpIssue990/bin/Debug/net8.0/runtimes/linux-x64$ tree .
.
└── native
    └── cuda12
        β”œβ”€β”€ libggml.so
        β”œβ”€β”€ libllama.so
        └── libllava_shared.so

3 directories, 3 files

Note in the program output how LLamaSharp looked for the Vulkan libraries (and did not find them because they are not there) but never looked for the CUDA 12 libraries.
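
Another way to sidestep the search is the one the error message itself suggests: specify the library path explicitly with NativeLibraryConfig.WithLibrary. A rough sketch, with the overload and paths as assumptions rather than verified API:

using LLama.Native;

// Assumed usage: point directly at the CUDA 12 libraries installed by
// LLamaSharp.Backend.Cuda12, bypassing the backend search order entirely.
// Verify the exact WithLibrary overload and relative paths against the
// installed package before relying on this.
NativeLibraryConfig.All
    .WithLibrary(
        "runtimes/linux-x64/native/cuda12/libllama.so",
        "runtimes/linux-x64/native/cuda12/libllava_shared.so");

NativeApi.llama_empty_call();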

dschilling commented 4 days ago

Adding .SkipCheck(true).WithAutoFallback(false) allows the CUDA library to load on my machine.

using LLama.Native;
NativeLibraryConfig.All
    .WithCuda()
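    // SkipCheck(true) bypasses the pre-load feasibility check, which failed here
    // because the CUDA version could not be detected; WithAutoFallback(false)
    // prevents falling back to the Vulkan/AVX backends.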
    .SkipCheck(true)
    .WithAutoFallback(false)
    .WithLogCallback((level, message) => Console.Write($"{level}: {message}"));
NativeApi.llama_empty_call();

On my system, detection failed because I installed nvidia-cuda-toolkit from the default Ubuntu repositories. That package does not create a /usr/local/bin/cuda or /usr/local/cuda folder, nor a version.json file, so SystemInfo.GetCudaMajorVersion was unable to determine which version of CUDA was installed.

Maybe LLamaSharp could attempt to run nvcc --version when this version file cannot be found; a rough sketch of that fallback is below.
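
Hypothetical code, not part of LLamaSharp: it shells out to nvcc and parses the "release X.Y" line visible in the nvcc output earlier in this thread.

using System.Diagnostics;
using System.Text.RegularExpressions;

// Hypothetical helper: determine the CUDA major version by running
// `nvcc --version` when /usr/local/cuda/version.json is not present.
// Returns -1 on failure.
static int GetCudaMajorVersionFromNvcc()
{
    try
    {
        var psi = new ProcessStartInfo("nvcc", "--version")
        {
            RedirectStandardOutput = true,
            UseShellExecute = false,
        };
        using var process = Process.Start(psi);
        if (process is null)
            return -1;

        string output = process.StandardOutput.ReadToEnd();
        process.WaitForExit();

        // e.g. "Cuda compilation tools, release 12.0, V12.0.140"
        var match = Regex.Match(output, @"release (\d+)\.");
        return match.Success ? int.Parse(match.Groups[1].Value) : -1;
    }
    catch
    {
        // nvcc is not installed or not on PATH.
        return -1;
    }
}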

@wased89 - have you installed the NVIDIA CUDA Toolkit? It appears that LLamaSharp attempts to determine the CUDA version by looking at those files, and if it can't determine the version, it will try to fall back to a different library: Vulkan, etc.