microsoft / BitNet

Official inference framework for 1-bit LLMs
MIT License

./llama-cli -m models/ggml-model-i2_s.gguf >> CORE DUMPED #55

Open danilopau opened 1 week ago

danilopau commented 1 week ago

I tried to feed the llama.cpp CLI the GGUF generated via the setup script.

Here is what I got:

root@stm32mp2:~/extra/llama.cpp# ./llama-cli -m models/ggml-model-i2_s.gguf
Log start
main: build = 3247 (911e35bb)
main: built with aarch64-ostl-linux-gcc (GCC) 12.3.0 for aarch64-ostl-linux
main: seed = 1729273000
GGML_ASSERT: ggml/src/ggml.c:20602: 0 <= info->type && info->type < GGML_TYPE_COUNT
Aborted (core dumped)

This is because I tried to run your model with llama.cpp built with aarch64-ostl-linux-gcc (GCC) 12.3.0 for aarch64-ostl-linux.

Any workaround, please?

x22x22 commented 22 hours ago

@danilopau

Do not use a binary downloaded from the official llama.cpp site. Instead, run the command

python setup_env.py --hf-repo HF1BitLLM/Llama3-8B-1.58-100B-tokens -q i2_s

which will automatically compile a compatible binary. Use that binary for inference.
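
Once setup_env.py finishes, you can run inference through the repo's wrapper script. A minimal sketch, assuming the default model directory that setup_env.py creates for this repo (adjust the path and prompt as needed):

python run_inference.py -m models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf -p "Hello" -n 32

This uses the llama-cli built inside the BitNet tree, which registers the i2_s quantization type. Upstream llama.cpp does not know that type id, which is exactly why its GGUF loader trips the GGML_ASSERT on info->type.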


Note: to run this command for compilation, the following requirements must be met, especially the clang version:

python>=3.9
cmake>=3.22
clang>=18
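
You can verify the toolchain before running setup_env.py with the standard version flags (nothing BitNet-specific here):

python3 --version   # need >= 3.9
cmake --version     # need >= 3.22
clang --version     # need >= 18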

For Windows users: Install Visual Studio 2022. In the installer, make sure to select at least the following options (this will automatically install required additional tools like CMake):

- Desktop development with C++
- C++ CMake Tools for Windows
- Git for Windows
- C++ Clang Compiler for Windows
- MS-Build Support for LLVM Toolset (clang)

For Debian/Ubuntu users: You can install clang via LLVM's automatic installation script:

bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"
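
Run as-is, the script installs the current stable LLVM release. llvm.sh also accepts a version argument if you want to pin clang 18 explicitly; downloading the script first is equivalent to the one-liner above:

wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh 18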

Additional requirement: conda (highly recommended)
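
If you use conda, a minimal environment matching the requirements above could look like this (the environment name is just an example):

conda create -n bitnet-cpp python=3.9
conda activate bitnet-cpp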