Open sgehrman opened 1 month ago
Are you building in debug or release? Can you help narrow it down by being precise about which previous version you know for sure didn't exhibit the large sizes you're seeing now?
previous: v2.7.5
builds like this: `$ cmake --build . --parallel --config Release`
I don't even know how to build a debug version.
Here's my whole build file, if that helps. I just added the CUDA stuff, and since I couldn't get Vulkan installed on Debian, I downloaded the Vulkan SDK tarball and put it in my home directory. The last line, with `dart`, just copies the built .so files into another directory so they can be bundled inside my app.
```bash
#!/bin/bash
export CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda
export CUDACXX=/usr/local/cuda/bin/nvcc

# https://vulkan.lunarg.com/sdk/home#linux
# https://vulkan.lunarg.com/doc/sdk/1.3.296.0/linux/getting_started.html
export VULKAN_SDK=~/vulkan/1.3.296.0/x86_64
export PATH=$VULKAN_SDK/bin:$PATH
export LD_LIBRARY_PATH=$VULKAN_SDK/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
export VK_LAYER_PATH=$VULKAN_SDK/share/vulkan/explicit_layer.d

pushd "shared_libs/gpt4all/gpt4all-backend"
rm -rf build
mkdir build
cd build
cmake ..
cmake --build . --parallel --config Release
popd

dart './tools/dart_tools/lib/copy_libraries.dart'
```
I have an app that embeds the gpt4all libs. I build the backend on Linux and copy the libraries into my app to load. Previous versions were not so large, but 3.4.2 is huge, around 800MB total. They also don't work, and I'm not sure why: it just uses up the CPU for a while and never returns a response. Any idea what I might be doing wrong?
```
355M Oct 18 22:21 libllamamodel-mainline-cuda-avxonly.so
355M Oct 18 22:21 libllamamodel-mainline-cuda.so
 12M Oct 18 22:21 libllamamodel-mainline-kompute-avxonly.so
 12M Oct 18 22:21 libllamamodel-mainline-kompute.so
1.7M Oct 18 22:21 libllmodel.so
```