Open Pathos14489 opened 10 months ago
@Pathos14489 did the first version also crash? I don't think changing CUDA_VISIBLE_DEVICES at that point works, or at least I'm not sure how it would interact with llama.cpp's CUDA internals.
Ah, sorry, I didn't notice the response. Yes, both versions crashed.
Edit: By the way, CUDA_VISIBLE_DEVICES used in that way does work as intended when only one model is being loaded.
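Since CUDA_VISIBLE_DEVICES reportedly behaves as intended when a process loads only one model, one possible workaround is to give each model its own process and pin that process to a single GPU. The sketch below is untested against this crash and is not confirmed anywhere in this thread. The helper names (`gpu_env`, `worker`, `start_worker`), the queue protocol, and the `max_tokens` value are illustrative assumptions; only `Llama(model_path=..., n_gpu_layers=...)` and the completion call come from llama-cpp-python itself.

```python
import multiprocessing as mp
import os


def gpu_env(gpu_index: int) -> dict:
    """Environment override that exposes exactly one GPU to a process."""
    return {"CUDA_VISIBLE_DEVICES": str(gpu_index)}


def worker(gpu_index: int, model_path: str, requests, responses) -> None:
    # Set the variable before llama_cpp (and therefore CUDA) is loaded
    # in this process; changing it afterwards may have no effect.
    os.environ.update(gpu_env(gpu_index))
    from llama_cpp import Llama  # imported late on purpose

    # Assumption: offload all layers; adjust n_gpu_layers to fit your VRAM.
    llm = Llama(model_path=model_path, n_gpu_layers=-1)

    # Serve prompts until a None sentinel arrives.
    for prompt in iter(requests.get, None):
        out = llm(prompt, max_tokens=64)
        responses.put(out["choices"][0]["text"])


def start_worker(gpu_index: int, model_path: str):
    """Spawn a fresh interpreter so no CUDA state is inherited from the parent."""
    ctx = mp.get_context("spawn")
    requests, responses = ctx.Queue(), ctx.Queue()
    proc = ctx.Process(
        target=worker,
        args=(gpu_index, model_path, requests, responses),
        daemon=True,
    )
    proc.start()
    return proc, requests, responses
```

Under these assumptions, calling `start_worker(0, fast_path)` and `start_worker(1, quality_path)` from under an `if __name__ == "__main__":` guard (with your own, here hypothetical, model paths) would give two independent processes. A prompt goes in with `requests.put(prompt)`, the reply comes back from `responses.get()`, and `requests.put(None)` shuts the worker down. The cost is that each model lives in a separate process, so swapping between them means choosing a queue rather than a Python object.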
Have you managed to get the script working?
Expected Behavior
I wanted to load two models at once to swap between them based on speed or quality. Here's the related code:
When this did not work, I also tried the following edit, which similarly did not work:
Current Behavior
Crashes with the following message:
CUDA error 400 at /tmp/pip-install-l820kql8/llama-cpp-python_b22d62cce7f540a0ae17b83dd03f27d3/vendor/llama.cpp/ggml-cuda.cu:7308: invalid resource handle
current device: 1
Environment and Context
Ryzen 9 5950X, 64 GB of DDR4 at 2333 MHz
$ lscpu
-> Architecture: x86_64
  CPU op-mode(s): 32-bit, 64-bit
  Address sizes: 48 bits physical, 48 bits virtual
  Byte Order: Little Endian
CPU(s): 32
  On-line CPU(s) list: 0-31
Vendor ID: AuthenticAMD
  Model name: AMD Ryzen 9 5950X 16-Core Processor
    CPU family: 25
    Model: 33
    Thread(s) per core: 2
    Core(s) per socket: 16
    Socket(s): 1
    Stepping: 2
    Frequency boost: enabled
    CPU max MHz: 5083.3979
    CPU min MHz: 2200.0000
    BogoMIPS: 6800.56
    Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
Virtualization features:
  Virtualization: AMD-V
Caches (sum of all):
  L1d: 512 KiB (16 instances)
  L1i: 512 KiB (16 instances)
  L2: 8 MiB (16 instances)
  L3: 64 MiB (2 instances)
NUMA:
  NUMA node(s): 1
  NUMA node0 CPU(s): 0-31
Vulnerabilities:
  Itlb multihit: Not affected
  L1tf: Not affected
  Mds: Not affected
  Meltdown: Not affected
  Mmio stale data: Not affected
  Retbleed: Not affected
  Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP always-on, RSB filling, PBRSB-eIBRS Not affected
  Srbds: Not affected
  Tsx async abort: Not affected
$ uname -a
-> Linux pathos-mint 5.15.0-56-generic #62-Ubuntu SMP Tue Nov 22 19:54:14 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux