Closed: Gideonah closed this issue 1 year ago.
Try cmake instead; that may succeed.
When trying with cmake, everything appears to run fine, but I still can't build with the original Makefile, so the whole package still doesn't build. Or is there something else I'm meant to run afterwards?
mkdir build
cd build
cmake .. -DLLAMA_CUBLAS=ON
cmake --build . --config Release
sh-4.2$ cmake .. -DLLAMA_CUBLAS=ON
-- The C compiler identification is GNU 7.3.1
-- The CXX compiler identification is GNU 7.3.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.40.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE
-- Found CUDAToolkit: /usr/local/cuda-11.8/include (found version "11.8.89")
-- cuBLAS found
-- The CUDA compiler identification is NVIDIA 11.8.89
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda-11.8/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Using CUDA architectures: 52;61;70
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done (1.9s)
-- Generating done (0.1s)
-- Build files have been written to: /home/ec2-user/SageMaker/llama.cpp/llama.cpp/build
sh-4.2$ cmake --build . --config Release
[ 1%] Built target BUILD_INFO
[ 3%] Building C object CMakeFiles/ggml.dir/ggml.c.o
[ 5%] Building C object CMakeFiles/ggml.dir/ggml-alloc.c.o
[ 7%] Building CUDA object CMakeFiles/ggml.dir/ggml-cuda.cu.o
[ 8%] Building C object CMakeFiles/ggml.dir/k_quants.c.o
[ 8%] Built target ggml
[ 10%] Linking CUDA static library libggml_static.a
[ 10%] Built target ggml_static
[ 12%] Building CXX object CMakeFiles/llama.dir/llama.cpp.o
[ 14%] Linking CXX static library libllama.a
[ 14%] Built target llama
[ 15%] Building CXX object tests/CMakeFiles/test-quantize-fns.dir/test-quantize-fns.cpp.o
[ 17%] Linking CXX executable ../bin/test-quantize-fns
[ 17%] Built target test-quantize-fns
[ 19%] Building CXX object tests/CMakeFiles/test-quantize-perf.dir/test-quantize-perf.cpp.o
[ 21%] Linking CXX executable ../bin/test-quantize-perf
[ 21%] Built target test-quantize-perf
[ 22%] Building CXX object tests/CMakeFiles/test-sampling.dir/test-sampling.cpp.o
[ 24%] Linking CXX executable ../bin/test-sampling
[ 24%] Built target test-sampling
[ 26%] Building CXX object tests/CMakeFiles/test-tokenizer-0.dir/test-tokenizer-0.cpp.o
/home/ec2-user/SageMaker/llama.cpp/llama.cpp/tests/test-tokenizer-0.cpp:19:2: warning: extra ‘;’ [-Wpedantic]
};
^
[ 28%] Linking CXX executable ../bin/test-tokenizer-0
[ 28%] Built target test-tokenizer-0
[ 29%] Building CXX object tests/CMakeFiles/test-grammar-parser.dir/test-grammar-parser.cpp.o
[ 31%] Linking CXX executable ../bin/test-grammar-parser
[ 31%] Built target test-grammar-parser
[ 33%] Building CXX object tests/CMakeFiles/test-llama-grammar.dir/test-llama-grammar.cpp.o
[ 35%] Linking CXX executable ../bin/test-llama-grammar
[ 35%] Built target test-llama-grammar
[ 36%] Building CXX object tests/CMakeFiles/test-grad0.dir/test-grad0.cpp.o
[ 38%] Linking CXX executable ../bin/test-grad0
[ 38%] Built target test-grad0
[ 40%] Building CXX object examples/CMakeFiles/common.dir/common.cpp.o
[ 42%] Building CXX object examples/CMakeFiles/common.dir/console.cpp.o
[ 43%] Building CXX object examples/CMakeFiles/common.dir/grammar-parser.cpp.o
[ 43%] Built target common
[ 45%] Building CXX object examples/main/CMakeFiles/main.dir/main.cpp.o
[ 47%] Linking CXX executable ../../bin/main
[ 47%] Built target main
[ 49%] Building CXX object examples/quantize/CMakeFiles/quantize.dir/quantize.cpp.o
[ 50%] Linking CXX executable ../../bin/quantize
[ 50%] Built target quantize
[ 52%] Building CXX object examples/quantize-stats/CMakeFiles/quantize-stats.dir/quantize-stats.cpp.o
[ 54%] Linking CXX executable ../../bin/quantize-stats
[ 54%] Built target quantize-stats
[ 56%] Building CXX object examples/perplexity/CMakeFiles/perplexity.dir/perplexity.cpp.o
[ 57%] Linking CXX executable ../../bin/perplexity
[ 57%] Built target perplexity
[ 59%] Building CXX object examples/embedding/CMakeFiles/embedding.dir/embedding.cpp.o
[ 61%] Linking CXX executable ../../bin/embedding
[ 61%] Built target embedding
[ 63%] Building CXX object examples/save-load-state/CMakeFiles/save-load-state.dir/save-load-state.cpp.o
[ 64%] Linking CXX executable ../../bin/save-load-state
[ 64%] Built target save-load-state
[ 66%] Building CXX object examples/benchmark/CMakeFiles/benchmark.dir/benchmark-matmult.cpp.o
[ 68%] Linking CXX executable ../../bin/benchmark
[ 68%] Built target benchmark
[ 70%] Building CXX object examples/baby-llama/CMakeFiles/baby-llama.dir/baby-llama.cpp.o
/home/ec2-user/SageMaker/llama.cpp/llama.cpp/examples/baby-llama/baby-llama.cpp: In function ‘int main(int, char**)’:
/home/ec2-user/SageMaker/llama.cpp/llama.cpp/examples/baby-llama/baby-llama.cpp:1620:32: warning: variable ‘opt_params_adam’ set but not used [-Wunused-but-set-variable]
struct ggml_opt_params opt_params_adam = ggml_opt_default_params(GGML_OPT_ADAM);
^~~~~~~
[ 71%] Linking CXX executable ../../bin/baby-llama
[ 71%] Built target baby-llama
[ 73%] Building CXX object examples/train-text-from-scratch/CMakeFiles/train-text-from-scratch.dir/train-text-from-scratch.cpp.o
[ 75%] Linking CXX executable ../../bin/train-text-from-scratch
[ 75%] Built target train-text-from-scratch
[ 77%] Building CXX object examples/convert-llama2c-to-ggml/CMakeFiles/convert-llama2c-to-ggml.dir/convert-llama2c-to-ggml.cpp.o
[ 78%] Linking CXX executable ../../bin/convert-llama2c-to-ggml
[ 78%] Built target convert-llama2c-to-ggml
[ 80%] Building CXX object examples/simple/CMakeFiles/simple.dir/simple.cpp.o
[ 82%] Linking CXX executable ../../bin/simple
[ 82%] Built target simple
[ 84%] Building CXX object examples/embd-input/CMakeFiles/embdinput.dir/embd-input-lib.cpp.o
[ 85%] Linking CXX static library libembdinput.a
[ 85%] Built target embdinput
[ 87%] Building CXX object examples/embd-input/CMakeFiles/embd-input-test.dir/embd-input-test.cpp.o
[ 89%] Linking CXX executable ../../bin/embd-input-test
[ 89%] Built target embd-input-test
[ 91%] Building CXX object examples/server/CMakeFiles/server.dir/server.cpp.o
[ 92%] Linking CXX executable ../../bin/server
[ 92%] Built target server
[ 94%] Building CXX object pocs/vdot/CMakeFiles/vdot.dir/vdot.cpp.o
[ 96%] Linking CXX executable ../../bin/vdot
[ 96%] Built target vdot
[ 98%] Building CXX object pocs/vdot/CMakeFiles/q8dot.dir/q8dot.cpp.o
[100%] Linking CXX executable ../../bin/q8dot
[100%] Built target q8dot
For those in the future who may stumble across this: the files are built into the bin path.
So after a successful build, build/bin contains all the executables you would otherwise get by running make in the llama.cpp directory.
For example, you can start the server with CUDA_VISIBLE_DEVICES=0 ./bin/server -m </path/to/model>/llama-2-7b-chat.ggmlv3.q8_0.bin -ngl 35; essentially, all the binaries are in the bin path.
Good luck.
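As a quick sanity check once the server is up, you can send it a completion request over HTTP. This is a minimal sketch: the default localhost:8080 address and the /completion route match the examples/server code from around this time, but may differ depending on your build and the flags you pass to ./bin/server.

# Assumes the server from examples/server is listening on the default localhost:8080
curl -s http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello", "n_predict": 64}'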
Prerequisites
Hi, I hit this when running make LLAMA_CUBLAS=1.
Expected Behavior
When running make on an AWS instance, I can build the llama.cpp package; however, when trying to add GPU support and building with make LLAMA_CUBLAS=1, I get the error below.
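For reference, the build sequence that is expected to work looks like the following sketch. Note that make variables are case-sensitive, so the flag has to be spelled exactly LLAMA_CUBLAS=1.

# CPU-only build first, to confirm the toolchain itself works
make clean
make
# Then rebuild with cuBLAS enabled (flag spelling is case-sensitive)
make clean
make LLAMA_CUBLAS=1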
Current Behavior
The file ggml-cuda.cu fails to compile, with errors on the lines shown in the output below.
Environment and Context
Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.
sh-4.2$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  2
Core(s) per socket:  4
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
Stepping:            7
CPU MHz:             3140.118
BogoMIPS:            4999.99
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            36608K
NUMA node0 CPU(s):   0-7
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke avx512_vnni
$ uname -a
Failure Information (for bugs)
I tried with CUDA versions 11.2 through 11.8.
Steps to Reproduce
Step 1. Boot up an ml.g4dn.2xlarge instance.
Step 2. Clone the llama.cpp package.
Step 3. Run make LLAMA_CUBLAS=1 (see the command sketch after these steps).
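Putting the steps together, something like the following should reproduce the failure on a fresh instance. This is a sketch: the repository URL is the upstream llama.cpp repo, and the rest simply follows the steps above.

# On a fresh ml.g4dn.2xlarge (CUDA 11.x installed)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make LLAMA_CUBLAS=1    # fails while compiling ggml-cuda.cu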
Failure Logs