ztxz16 / fastllm

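A pure C++ cross-platform LLM acceleration library, callable from Python. ChatGLM-6B-class models can reach 10000+ tokens/s on a single GPU. Supports the glm, llama, and moss base models and runs smoothly on mobile devices.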
Apache License 2.0
3.28k stars 333 forks

Error when compiling with make #274

Open suncheng-s opened 1 year ago

suncheng-s commented 1 year ago
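  1. Tested on the nvidia/cuda:11.6.2-devel-ubuntu20.04 image.
  2. GPU: A30. CPU: Intel(R) Xeon(R) Gold 6348 CPU @ 2.60GHz.
  3. Python 3.8 and 3.9 with gcc/g++ 8, 9, and 10 all produce the same error. A CPU-only build fails with a similar error.
  4. Building the examples from the NVIDIA samples produces normal output.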

Scanning dependencies of target pyfastllm
[ 12%] Building CXX object CMakeFiles/pyfastllm.dir/src/models/moss.cpp.o
[ 12%] Building CXX object CMakeFiles/pyfastllm.dir/src/executor.cpp.o
[ 18%] Building CXX object CMakeFiles/pyfastllm.dir/src/device.cpp.o
[ 25%] Building CXX object CMakeFiles/pyfastllm.dir/src/models/chatglm.cpp.o
[ 31%] Building CXX object CMakeFiles/pyfastllm.dir/src/devices/cpu/cpudevicebatch.cpp.o
[ 37%] Building CXX object CMakeFiles/pyfastllm.dir/src/models/qwen.cpp.o
[ 43%] Building CXX object CMakeFiles/pyfastllm.dir/src/models/basellm.cpp.o
[ 50%] Building CXX object CMakeFiles/pyfastllm.dir/src/fastllm.cpp.o
[ 56%] Building CXX object CMakeFiles/pyfastllm.dir/src/pybinding.cpp.o
[ 62%] Building CXX object CMakeFiles/pyfastllm.dir/src/devices/cpu/cpudevice.cpp.o
[ 68%] Building CXX object CMakeFiles/pyfastllm.dir/src/models/llama.cpp.o
[ 75%] Building CXX object CMakeFiles/pyfastllm.dir/src/model.cpp.o
[ 81%] Building CUDA object CMakeFiles/pyfastllm.dir/src/devices/cuda/fastllm-cuda.cu.o
[ 87%] Building CXX object CMakeFiles/pyfastllm.dir/src/devices/cuda/cudadevice.cpp.o
[ 93%] Building CXX object CMakeFiles/pyfastllm.dir/src/devices/cuda/cudadevicebatch.cpp.o
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[2]: [CMakeFiles/pyfastllm.dir/build.make:63: CMakeFiles/pyfastllm.dir/src/pybinding.cpp.o] Error 1
make[2]: Waiting for unfinished jobs....
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[2]: *** [CMakeFiles/pyfastllm.dir/build.make:115: CMakeFiles/pyfastllm.dir/src/executor.cpp.o] Error 1
...

suncheng-s commented 1 year ago

free -hl
              total        used        free      shared  buff/cache   available
Mem:           93Gi        26Gi        39Gi       179Mi        27Gi        66Gi
Low:           93Gi        53Gi        39Gi
High:            0B          0B          0B
Swap:            0B          0B          0B

suncheng-s commented 1 year ago

A library compiled on a V100 and then copied to the A30 prints the following:

export FASTLLM_PATH=/opt/fastllm/build-py/
export PATH=$PATH:$FASTLLM_PATH
export PYTHONPATH=$PYTHONPATH:$FASTLLM_PATH
python run.py
using GPU idx: 0 (Free memory: 24258 MB)
2023-08-21 13:34:22:2222 | INFO | cuda-11-6-2-devel-ubuntu20-04-f5ff854dc-vd5hj | | | | | 200 | | | loading model from /models/LLM/llm_latest.flm... | Killed

suncheng-s commented 1 year ago

Compiling with make -j1:

Scanning dependencies of target pyfastllm
[ 6%] Building CXX object CMakeFiles/pyfastllm.dir/src/pybinding.cpp.o
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[2]: [CMakeFiles/pyfastllm.dir/build.make:63: CMakeFiles/pyfastllm.dir/src/pybinding.cpp.o] Error 1
make[1]: [CMakeFiles/Makefile2:96: CMakeFiles/pyfastllm.dir/all] Error 2
make: *** [Makefile:84: all] Error 2

suncheng-s commented 1 year ago

The build problem is solved: the k8s pod had a memory limit.
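For anyone hitting the same "Killed signal terminated program cc1plus" inside a pod: `free` usually reports the host's memory rather than the container's cgroup limit, which may explain why 93Gi looked available while the compiler was still being killed. Below is a minimal sketch for checking the limit from inside the container. It assumes the standard cgroup mount points (`/sys/fs/cgroup/memory.max` for cgroup v2, `/sys/fs/cgroup/memory/memory.limit_in_bytes` for v1); the function names and the "unlimited" threshold are made up for illustration and are not part of fastllm.

```python
from pathlib import Path
from typing import Optional

# Usual cgroup limit files inside a container (v2 first, then v1).
# These paths are an assumption about how the pod mounts cgroups.
CGROUP_LIMIT_FILES = (
    "/sys/fs/cgroup/memory.max",
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",
)

# cgroup v1 reports "no limit" as a very large number; treat anything
# this large as unlimited (illustrative threshold).
UNLIMITED_THRESHOLD = 1 << 60


def parse_limit(raw: str) -> Optional[int]:
    """Return the limit in bytes, or None when the cgroup is unlimited."""
    value = raw.strip()
    if value == "max":  # cgroup v2 spelling of "no limit"
        return None
    limit = int(value)
    return None if limit >= UNLIMITED_THRESHOLD else limit


def container_memory_limit(paths=CGROUP_LIMIT_FILES) -> Optional[int]:
    """Read the first limit file that exists; None if none exists or unlimited."""
    for path in paths:
        p = Path(path)
        if p.is_file():
            return parse_limit(p.read_text())
    return None


if __name__ == "__main__":
    limit = container_memory_limit()
    if limit is None:
        print("no cgroup memory limit detected")
    else:
        print(f"cgroup memory limit: {limit / 2**30:.1f} GiB")
```

If the reported limit is only a few GiB, even `make -j1` can fail on a template-heavy file such as pybinding.cpp, which matches the log above.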


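After removing the memory and CPU limits, the build succeeds. The CPU build runs normally, but the GPU build reports an error. Changing native to 80 86 also results in an error.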

I suspect that a GPU limit is causing insufficient resources.

root@cuda-11-6-2-devel-ubuntu20-04-7d5ffdf847-5smb2:/data/workSpace/fastllm/build# ./main -p /data/chatglm2_int8_allin1.flm
AVX: ON
AVX2: ON
AARCH64: OFF
Neon FP16: OFF
Neon DOT: OFF
Load (201 / 201)
Warmup...
status = 15
1 16 128
Error: cublas error.
terminate called after throwing an instance of 'char const*'
Aborted (core dumped)
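One way to narrow this down is to decode the number printed before "Error: cublas error." The sketch below assumes that `status = 15` in the log is a raw `cublasStatus_t` value; the log itself does not confirm this. The table follows the enum values in the cuBLAS headers, and the helper name is hypothetical.

```python
# cublasStatus_t values as defined in the cuBLAS headers.
CUBLAS_STATUS_NAMES = {
    0: "CUBLAS_STATUS_SUCCESS",
    1: "CUBLAS_STATUS_NOT_INITIALIZED",
    3: "CUBLAS_STATUS_ALLOC_FAILED",
    7: "CUBLAS_STATUS_INVALID_VALUE",
    8: "CUBLAS_STATUS_ARCH_MISMATCH",
    11: "CUBLAS_STATUS_MAPPING_ERROR",
    13: "CUBLAS_STATUS_EXECUTION_FAILED",
    14: "CUBLAS_STATUS_INTERNAL_ERROR",
    15: "CUBLAS_STATUS_NOT_SUPPORTED",
    16: "CUBLAS_STATUS_LICENSE_ERROR",
}


def decode_cublas_status(code: int) -> str:
    """Map a numeric status to its enum name (hypothetical helper)."""
    return CUBLAS_STATUS_NAMES.get(code, f"unknown cuBLAS status {code}")


if __name__ == "__main__":
    # Assumption: the "status = 15" line in the log is a cublasStatus_t.
    print(decode_cublas_status(15))
```

If that assumption holds, 15 corresponds to CUBLAS_STATUS_NOT_SUPPORTED rather than CUBLAS_STATUS_ALLOC_FAILED (3). That would point toward an operation or parameter combination the installed cuBLAS build does not support on this GPU, rather than toward running out of GPU memory, though the log alone is not enough to be sure.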