Closed trrahul closed 1 year ago
Please install LLamaSharp.Backend.Cuda11
or LLamaSharp.Backend.Cuda12
at first, then set n_gpu_layers
to a positive number.
Generally, if nothing goes wrong, there will be an output like this "offload 20 layers to gpu"
I installed the cuda package on to the Llama.Examples project and set the n_gpu_layers to 100. Still it is not using GPU.
Could you please provide the outputs when you run the program? (the output when loading the model) Besides, it'e better if your system info is available here.
Sure.
0: Run a chat session.
1: Run a LLamaModel to chat.
2: Quantize a model.
3: Get the embeddings of a message.
4: Run a LLamaModel with instruct mode.
5: Load and save state of LLamaModel.
Your choice: 0
Please input your model path: C:\Users\rahul\Downloads\wizardLM-7B.ggmlv3.q4_0.bin
llama.cpp: loading model from C:\Users\rahul\Downloads\wizardLM-7B.ggmlv3.q4_0.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32001
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.07 MB
llama_model_load_internal: mem required = 5407.72 MB (+ 1026.00 MB per state)
.
llama_init_from_file: kv self size = 256.00 MB
---------------
Display Devices
---------------
Card name: NVIDIA GeForce GTX 1650
Manufacturer: NVIDIA
Chip type: GeForce GTX 1650
DAC type: Integrated RAMDAC
Device Type: Full Device (POST)
Device Key: Enum\PCI\VEN_10DE&DEV_1F91&SUBSYS_3FFB17AA&REV_A1
Device Status: 0180200A [DN_DRIVER_LOADED|DN_STARTED|DN_DISABLEABLE|DN_NT_ENUMERATOR|DN_NT_DRIVER]
Device Problem Code: No Problem
Driver Problem Code: Unknown
Display Memory: 12109 MB
Dedicated Memory: 3962 MB
Shared Memory: 8147 MB
Current Mode: 1920 x 1080 (32 bit) (60Hz)
HDR Support: Not Supported
Display Topology: Internal
Display Color Space: DXGI_COLOR_SPACE_RGB_FULL_G22_NONE_P709
Color Primaries: Red(0.589844,0.370117), Green(0.349609,0.554688), Blue(0.155273,0.110352), White Point(0.313477,0.329102)
Display Luminance: Min Luminance = 0.500000, Max Luminance = 270.000000, MaxFullFrameLuminance = 270.000000
Monitor Name: Generic PnP Monitor
Monitor Model: unknown
Monitor Id: LGD05E5
Native Mode: 1920 x 1080(p) (59.977Hz)
Output Type: Displayport Embedded
Monitor Capabilities: HDR Not Supported
Display Pixel Format: DISPLAYCONFIG_PIXELFORMAT_32BPP
Advanced Color: Not Supported
Driver Name: C:\WINDOWS\System32\DriverStore\FileRepository\nvlt.inf_amd64_5adc6075318430cf\nvldumdx.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvlt.inf_amd64_5adc6075318430cf\nvldumdx.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvlt.inf_amd64_5adc6075318430cf\nvldumdx.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvlt.inf_amd64_5adc6075318430cf\nvldumdx.dll
Driver File Version: 27.21.0014.6230 (English)
Driver Version: 27.21.14.6230
DDI Version: 12
Feature Levels: 12_1,12_0,11_1,11_0,10_1,10_0,9_3,9_2,9_1
Driver Model: WDDM 2.7
Hardware Scheduling: Supported:True Enabled:False
Graphics Preemption: Pixel
Compute Preemption: Dispatch
Miracast: Not Supported
Detachable GPU: No
Hybrid Graphics GPU: Not Supported
Power P-states: Not Supported
Virtualization: Paravirtualization
Block List: No Blocks
Catalog Attributes: Universal:False Declarative:True
Driver Attributes: Final Retail
Driver Date/Size: 05-04-2021 05:30:00 AM, 1049064 bytes
WHQL Logo'd: n/a
WHQL Date Stamp: n/a
Device Identifier: {D7B71E3E-5CD1-11CF-4C6C-F51F1BC2D635}
Vendor ID: 0x10DE
Device ID: 0x1F91
SubSys ID: 0x3FFB17AA
Revision ID: 0x00A1
Driver Strong Name: oem7.inf:0f066de306961689:Section171:27.21.14.6230:pci\ven_10de&dev_1f91&subsys_3ffb17aa
Rank Of Driver: 00CF0001
Video Accel:
DXVA2 Modes: {86695F12-340E-4F04-9FD3-9253DD327460} DXVA2_ModeMPEG2_VLD {6F3EC719-3735-42CC-8063-65CC3CB36616} DXVA2_ModeVC1_D2010 DXVA2_ModeVC1_VLD {32FCFE3F-DE46-4A49-861B-AC71110649D5} DXVA2_ModeH264_VLD_Stereo_Progressive_NoFGT DXVA2_ModeH264_VLD_Stereo_NoFGT DXVA2_ModeH264_VLD_NoFGT DXVA2_ModeHEVC_VLD_Main DXVA2_ModeHEVC_VLD_Main10 {20BB8B0A-97AA-4571-8E99-64E60606C1A6} {15DF9B21-06C4-47F1-841E-A67C97D7F312} DXVA2_ModeMPEG4pt2_VLD_Simple DXVA2_ModeMPEG4pt2_VLD_AdvSimple_NoGMC {9947EC6F-689B-11DC-A320-0019DBBC4184} {33FCFE41-DE46-4A49-861B-AC71110649D5} DXVA2_ModeVP9_VLD_Profile0 DXVA2_ModeVP9_VLD_10bit_Profile2 {DDA19DC7-93B5-49F5-A9B3-2BDA28A2CE6E} {6AFFD11E-1D96-42B1-A215-93A31F09A53D} {914C84A3-4078-4FA9-984C-E2F262CB5C9C}
Deinterlace Caps: {6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(YUY2,YUY2) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_PixelAdaptive
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(YUY2,YUY2) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(YUY2,YUY2) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(YUY2,YUY2) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_BOBVerticalStretch
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(UYVY,UYVY) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_PixelAdaptive
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(UYVY,UYVY) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(UYVY,UYVY) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(UYVY,UYVY) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_BOBVerticalStretch
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(YV12,0x32315659) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_PixelAdaptive
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(YV12,0x32315659) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(YV12,0x32315659) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(YV12,0x32315659) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_BOBVerticalStretch
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(NV12,0x3231564e) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_PixelAdaptive
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(NV12,0x3231564e) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(NV12,0x3231564e) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(NV12,0x3231564e) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_BOBVerticalStretch
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(IMC1,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(IMC1,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(IMC1,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(IMC1,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(IMC2,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(IMC2,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(IMC2,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(IMC2,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(IMC3,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(IMC3,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(IMC3,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(IMC3,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(IMC4,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(IMC4,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(IMC4,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(IMC4,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(S340,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(S340,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(S340,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(S340,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(S342,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(S342,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(S342,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(S342,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
D3D9 Overlay: Supported
DXVA-HD: Supported
DDraw Status: Enabled
D3D Status: Enabled
AGP Status: Enabled
MPO MaxPlanes: 4
MPO Caps: RGB,YUV,BILINEAR,HIGH_FILTER,STRETCH_YUV,STRETCH_RGB,IMMEDIATE,HDR (MPO3)
MPO Stretch: 10.000X - 0.500X
MPO Media Hints: resizing, colorspace Conversion
MPO Formats: NV12,R16G16B16A16_FLOAT,R10G10B10A2_UNORM,R8G8B8A8_UNORM,B8G8R8A8_UNORM
PanelFitter Caps: RGB,YUV,BILINEAR,HIGH_FILTER,STRETCH_YUV,STRETCH_RGB,IMMEDIATE,HDR (MPO3)
PanelFitter Stretch: 10.000X - 0.500X
What's your cuda version? Since you installed two packages at the same time, one of them will cover the other. Therefore the cuda version of your system and that of LLamaSharp.Backend
may be mismatched.
I have cuda 12 installed. Removed Cuda11 package and tested again. No luck.
My GPU's Cuda Compute Capability is 7.5 and I think it will only work with Cuda version 10. (https://stackoverflow.com/a/28933055)
Would you like to build from source (llama.cpp) to have a try? I could help you with that. I think if it could be built from source, then it should work in your computer.
Thank you. It was running on my CPU fine and I wanted to know how it switches to GPU and works. I saw the cuda dlls in the runtime folder but the code was
Shouldn't it be private const string libraryName = "libllama-cuda11";
to load the GPU version?
I changed it so and ran the example and that is when I got an interop exception which led me to believe there might be something wrong with my cuda version. I thought of compiling it myself but haven't got enough time to try it yet.
In LLamaSharp.Backend
package they are all renamed to libllama.dll
. When using master branch you need to change the filename to choose which dll you want to use. What does the interop exception look like?
I'll close this issue since it's been inactive for so long, but feel free to comment here if it's still an issue and I'll reopen it.
Setting n_gpu_layers has no effect? How do you run the example with GPU?