cocktailpeanut / dalai

The simplest way to run LLaMA on your local machine
https://cocktailpeanut.github.io/dalai
13.09k stars 1.42k forks

Issue on Xeon CPU #248

Open bigbambu opened 1 year ago

bigbambu commented 1 year ago

Hi all, I have tried several different Linux distributions and multiple configuration tweaks, without any success. I always get the following error.

cc -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -mavx -msse3 -c ggml.c -o ggml.o
In file included from /usr/lib/gcc/x86_64-redhat-linux/11/include/immintrin.h:101,
                 from ggml.c:155:
ggml.c: In function ‘ggml_vec_dot_f16’:
/usr/lib/gcc/x86_64-redhat-linux/11/include/f16cintrin.h:52:1: error: inlining failed in call to ‘always_inline’ ‘_mm256_cvtph_ps’: target specific option mismatch

Here is the lscpu output:

Architecture: x86_64
  CPU op-mode(s): 32-bit, 64-bit
  Address sizes: 46 bits physical, 48 bits virtual
  Byte Order: Little Endian
CPU(s): 24
  On-line CPU(s) list: 0-23
Vendor ID: GenuineIntel
  BIOS Vendor ID: Intel
  Model name: Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
    BIOS Model name: Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
    CPU family: 6
    Model: 45
    Thread(s) per core: 2
    Core(s) per socket: 6
    Socket(s): 2
    Stepping: 7
    CPU max MHz: 3000.0000
    CPU min MHz: 1200.0000
    BogoMIPS: 5000.00
    Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts md_clear flush_l1d
Virtualization features:
  Virtualization: VT-x
Caches (sum of all):
  L1d: 384 KiB (12 instances)
  L1i: 384 KiB (12 instances)
  L2: 3 MiB (12 instances)
  L3: 30 MiB (2 instances)
NUMA:
  NUMA node(s): 2
  NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22
  NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23
Vulnerabilities:
  Itlb multihit: KVM: Mitigation: VMX disabled
  L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
  Mds: Mitigation; Clear CPU buffers; SMT vulnerable
  Meltdown: Mitigation; PTI
  Mmio stale data: Not affected
  Retbleed: Not affected
  Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
  Srbds: Not affected
  Tsx async abort: Not affected

Any suggestion will be much appreciated.

ibuk01 commented 1 year ago

I have the exact same issue on my DL380 home lab server (2x Xeon E5-2650). I tried Kubernetes/containerd and also Docker, using the docker-compose.yaml file to build the container image for Kubernetes. The same image(!) runs on any other CPU but fails on the dual-Xeon machine.
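
For anyone comparing machines: the failing intrinsic, _mm256_cvtph_ps, is declared in f16cintrin.h, so it belongs to the F16C extension. The compile line in the report passes only -mavx and -msse3, and the lscpu flags listed there include avx but not f16c. The Python sketch below is only a diagnostic aid for checking which SIMD flags a Linux host reports. The helper names (missing_features, read_cpu_flags) and the default list of wanted flags are assumptions made for illustration; they are not part of dalai or llama.cpp, and this does not claim to be the project's fix.

```python
# Hypothetical diagnostic helper; not part of dalai or llama.cpp.
# Reports which SIMD feature flags are absent from a CPU flags line,
# such as the "flags" line in /proc/cpuinfo or "Flags:" in lscpu.

def missing_features(flags_line, wanted=("avx", "avx2", "f16c", "fma")):
    """Return the entries of `wanted` that do not appear in `flags_line`.

    The default `wanted` tuple is an assumption chosen for illustration.
    """
    present = set(flags_line.split())
    return [name for name in wanted if name not in present]


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the first 'flags' line from /proc/cpuinfo, or '' if unavailable."""
    try:
        with open(path) as handle:
            for line in handle:
                if line.startswith("flags"):
                    # Line looks like "flags\t\t: fpu vme de ..."
                    return line.split(":", 1)[1]
    except OSError:
        pass  # not Linux, or /proc is not readable
    return ""


if __name__ == "__main__":
    flags = read_cpu_flags()
    if flags:
        print("missing:", missing_features(flags))
    else:
        print("could not read CPU flags on this system")
```

Run on the flags from the E5-2640 report, missing_features would list f16c among the absent flags, which matches the "target specific option mismatch" error: the compiler was not told to target F16C, and this CPU does not advertise it.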