microsoft / BitNet

Official inference framework for 1-bit LLMs
MIT License

The current revision bf11a49 fails to build: Cannot find source file: ../../../../include/bitnet-lut-kernels.h #118

Open yurivict opened 1 day ago

yurivict commented 1 day ago
CMake Error at 3rdparty/llama.cpp/ggml/src/CMakeLists.txt:1324 (add_library):
  Cannot find source file:

    ../../../../include/bitnet-lut-kernels.h

  Tried extensions .c .C .c++ .cc .cpp .cxx .cu .mpp .m .M .mm .ixx .cppm
  .ccm .cxxm .c++m .h .hh .h++ .hm .hpp .hxx .in .txx .f .F .for .f77 .f90
  .f95 .f03 .hip .ispc

CMake Error at 3rdparty/llama.cpp/ggml/src/CMakeLists.txt:1324 (add_library):
  No SOURCES given to target: ggml
noppej commented 1 day ago

I ran into the same problem. I believe it is caused by a bad symbolic link in the llama.cpp fork, at https://github.com/Eddie-Wang1120/llama.cpp/tree/814d0ee5440495255a4e3a5a8abf001b27b539d4/spm-headers.
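
A broken symlink like that can be spotted with GNU find's `-xtype l`, which matches links whose target is missing. A self-contained sketch (scratch directory and link names are made up for the demo, not taken from spm-headers/):

```shell
# Demo: -xtype l matches dangling symlinks, i.e. links whose target no longer
# exists -- the kind of breakage suspected in the spm-headers/ directory.
tmp=$(mktemp -d)
ln -s /nonexistent-target "$tmp/dangling"   # broken: target does not exist
ln -s /tmp "$tmp/intact"                    # valid symlink
find "$tmp" -xtype l                        # prints only the dangling link
rm -rf "$tmp"
```

Running the same `find 3rdparty/llama.cpp/spm-headers -xtype l` inside a checkout should confirm or rule out the bad-link theory.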

BTW, that fork is now more than 400 commits behind the head of llama.cpp. Is there a documented strategy for keeping pace with upstream llama.cpp development?

eugenehp commented 20 hours ago

We had to roll a patch on top of the custom llama.cpp/ggml used as a submodule in this repo: https://github.com/eugenehp/bitnet-cpp-rs/blob/main/bitnet-cpp-sys/patches/llama.cpp.patch#L11
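
For anyone wanting to take the same route: the general pattern is to keep a local patch file and re-apply it on top of the pinned submodule checkout with `git apply`. A self-contained sketch (repo, file, and patch below are scratch examples, not the actual bitnet-cpp-rs patch):

```shell
# Demo of carrying a local patch on top of a pinned checkout.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
printf 'old line\n' > file.txt
git add file.txt
git -c user.email=you@example.com -c user.name=you commit -qm 'pinned state'
printf 'new line\n' > file.txt          # the fix we want to carry
git diff > ../fix.patch                 # capture it as a patch file
git checkout -q -- file.txt             # back to the pinned state
git apply ../fix.patch                  # re-apply the patch on top
grep -q 'new line' file.txt && echo 'patch applied'
cd / && rm -rf "$tmp"
```

`git apply --check` can be used first to verify the patch still applies cleanly after a submodule bump.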

yurivict commented 17 hours ago

This repository is broken as it is now.

potassiummmm commented 2 hours ago

The header file include/bitnet-lut-kernels.h is generated by utils/codegen_tl*.py, and it should be produced automatically when setup_env.py runs to completion. Could you check whether each step completed successfully, and look at the error logs in the logs/ folder?
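
In other words, the CMake error means the generated header is simply absent. A minimal existence check (the path is taken from the CMake error above; the scratch directory stands in for a tree where the codegen step never ran):

```shell
# Check for the generated header; on a fresh tree where setup_env.py's
# codegen step never ran, the file will not exist.
tmp=$(mktemp -d)
mkdir -p "$tmp/include"
if [ -f "$tmp/include/bitnet-lut-kernels.h" ]; then
  echo 'header present - codegen ran'
else
  echo 'header missing - re-run setup_env.py and check logs/'
fi
rm -rf "$tmp"
```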

noppej commented 19 minutes ago

Hi @potassiummmm ... I've set up the environment in a clean Docker container running on my Mac M1. Below is the full output of my terminal session, showing the steps from the README, the failing cmake, and the contents of the logs/compile.log file. Let me know if you need more ...

root@bitnet-rs-template:/app/bitnet.cpp# git clone --recursive https://github.com/microsoft/BitNet.git
Cloning into 'BitNet'...
remote: Enumerating objects: 124, done.
remote: Counting objects: 100% (121/121), done.
remote: Compressing objects: 100% (79/79), done.
remote: Total 124 (delta 52), reused 85 (delta 36), pack-reused 3 (from 1)
Receiving objects: 100% (124/124), 1.88 MiB | 4.64 MiB/s, done.
Resolving deltas: 100% (52/52), done.
Submodule '3rdparty/llama.cpp' (https://github.com/Eddie-Wang1120/llama.cpp.git) registered for path '3rdparty/llama.cpp'
Cloning into '/app/bitnet.cpp/BitNet/3rdparty/llama.cpp'...
remote: Enumerating objects: 25578, done.        
remote: Counting objects: 100% (5117/5117), done.        
remote: Compressing objects: 100% (236/236), done.        
remote: Total 25578 (delta 5017), reused 4881 (delta 4881), pack-reused 20461 (from 1)        
Receiving objects: 100% (25578/25578), 54.32 MiB | 16.82 MiB/s, done.
Resolving deltas: 100% (18475/18475), done.
Submodule path '3rdparty/llama.cpp': checked out '814d0ee5440495255a4e3a5a8abf001b27b539d4'
Submodule 'kompute' (https://github.com/nomic-ai/kompute.git) registered for path '3rdparty/llama.cpp/ggml/src/kompute'
Cloning into '/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/kompute'...
remote: Enumerating objects: 9118, done.        
remote: Counting objects: 100% (253/253), done.        
remote: Compressing objects: 100% (148/148), done.        
remote: Total 9118 (delta 119), reused 184 (delta 95), pack-reused 8865 (from 1)        
Receiving objects: 100% (9118/9118), 17.59 MiB | 16.94 MiB/s, done.
Resolving deltas: 100% (5726/5726), done.
Submodule path '3rdparty/llama.cpp/ggml/src/kompute': checked out '4565194ed7c32d1d2efa32ceab4d3c6cae006306'
root@bitnet-rs-template:/app/bitnet.cpp# cd BitNet
root@bitnet-rs-template:/app/bitnet.cpp/BitNet# conda create -n bitnet-cpp python=3.9
Channels:
 - defaults
Platform: linux-aarch64
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/bitnet-cpp

  added / updated specs:
    - python=3.9

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    pip-24.2                   |   py39hd43f75c_0         2.2 MB
    python-3.9.20              |       h4bb2201_1        24.7 MB
    setuptools-75.1.0          |   py39hd43f75c_0         1.6 MB
    wheel-0.44.0               |   py39hd43f75c_0         111 KB
    ------------------------------------------------------------
                                           Total:        28.6 MB

The following NEW packages will be INSTALLED:

  _libgcc_mutex      pkgs/main/linux-aarch64::_libgcc_mutex-0.1-main 
  _openmp_mutex      pkgs/main/linux-aarch64::_openmp_mutex-5.1-51_gnu 
  ca-certificates    pkgs/main/linux-aarch64::ca-certificates-2024.9.24-hd43f75c_0 
  ld_impl_linux-aar~ pkgs/main/linux-aarch64::ld_impl_linux-aarch64-2.40-h48e3ba3_0 
  libffi             pkgs/main/linux-aarch64::libffi-3.4.4-h419075a_1 
  libgcc-ng          pkgs/main/linux-aarch64::libgcc-ng-11.2.0-h1234567_1 
  libgomp            pkgs/main/linux-aarch64::libgomp-11.2.0-h1234567_1 
  libstdcxx-ng       pkgs/main/linux-aarch64::libstdcxx-ng-11.2.0-h1234567_1 
  ncurses            pkgs/main/linux-aarch64::ncurses-6.4-h419075a_0 
  openssl            pkgs/main/linux-aarch64::openssl-3.0.15-h998d150_0 
  pip                pkgs/main/linux-aarch64::pip-24.2-py39hd43f75c_0 
  python             pkgs/main/linux-aarch64::python-3.9.20-h4bb2201_1 
  readline           pkgs/main/linux-aarch64::readline-8.2-h998d150_0 
  setuptools         pkgs/main/linux-aarch64::setuptools-75.1.0-py39hd43f75c_0 
  sqlite             pkgs/main/linux-aarch64::sqlite-3.45.3-h998d150_0 
  tk                 pkgs/main/linux-aarch64::tk-8.6.14-h987d8db_0 
  tzdata             pkgs/main/noarch::tzdata-2024b-h04d1e81_0 
  wheel              pkgs/main/linux-aarch64::wheel-0.44.0-py39hd43f75c_0 
  xz                 pkgs/main/linux-aarch64::xz-5.4.6-h998d150_1 
  zlib               pkgs/main/linux-aarch64::zlib-1.2.13-h998d150_1 

Proceed ([y]/n)? y

Downloading and Extracting Packages:

Preparing transaction: done                                                                                              
Verifying transaction: done                                                                                              
Executing transaction: done                                                                                              
#
# To activate this environment, use
#
#     $ conda activate bitnet-cpp
#
# To deactivate an active environment, use
#
#     $ conda deactivate

root@bitnet-rs-template:/app/bitnet.cpp/BitNet# conda activate bitnet-cpp

CondaError: Run 'conda init' before 'conda activate'

root@bitnet-rs-template:/app/bitnet.cpp/BitNet# conda init
no change     /opt/conda/condabin/conda
no change     /opt/conda/bin/conda
no change     /opt/conda/bin/conda-env
no change     /opt/conda/bin/activate
no change     /opt/conda/bin/deactivate
no change     /opt/conda/etc/profile.d/conda.sh
no change     /opt/conda/etc/fish/conf.d/conda.fish
no change     /opt/conda/shell/condabin/Conda.psm1
no change     /opt/conda/shell/condabin/conda-hook.ps1
no change     /opt/conda/lib/python3.11/site-packages/xontrib/conda.xsh
no change     /opt/conda/etc/profile.d/conda.csh
modified      /root/.bashrc

==> For changes to take effect, close and re-open your current shell. <==

root@bitnet-rs-template:/app/bitnet.cpp/BitNet# . "$(conda info --base)/etc/profile.d/conda.sh"
root@bitnet-rs-template:/app/bitnet.cpp/BitNet# conda activate bitnet-cpp
(bitnet-cpp) root@bitnet-rs-template:/app/bitnet.cpp/BitNet# pip install -r requirements.txt
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cpu, https://download.pytorch.org/whl/cpu, https://download.pytorch.org/whl/cpu, https://download.pytorch.org/whl/cpu
Collecting numpy~=1.26.4 (from -r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 1))
  Downloading numpy-1.26.4-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (62 kB)
Collecting sentencepiece~=0.2.0 (from -r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 2))
  Downloading sentencepiece-0.2.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (7.7 kB)
Collecting transformers<5.0.0,>=4.45.1 (from -r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 3))
  Downloading transformers-4.46.2-py3-none-any.whl.metadata (44 kB)
Collecting gguf>=0.1.0 (from -r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 4))
  Downloading gguf-0.10.0-py3-none-any.whl.metadata (3.5 kB)
Collecting protobuf<5.0.0,>=4.21.0 (from -r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 5))
  Downloading protobuf-4.25.5-cp37-abi3-manylinux2014_aarch64.whl.metadata (541 bytes)
Collecting torch~=2.2.1 (from -r 3rdparty/llama.cpp/requirements/requirements-convert_hf_to_gguf.txt (line 3))
  Downloading https://download.pytorch.org/whl/cpu/torch-2.2.2-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (86.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 86.6/86.6 MB 27.8 MB/s eta 0:00:00
Collecting filelock (from transformers<5.0.0,>=4.45.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 3))
  Downloading filelock-3.16.1-py3-none-any.whl.metadata (2.9 kB)
Collecting huggingface-hub<1.0,>=0.23.2 (from transformers<5.0.0,>=4.45.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 3))
  Downloading huggingface_hub-0.26.2-py3-none-any.whl.metadata (13 kB)
Collecting packaging>=20.0 (from transformers<5.0.0,>=4.45.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 3))
  Downloading packaging-24.2-py3-none-any.whl.metadata (3.2 kB)
Collecting pyyaml>=5.1 (from transformers<5.0.0,>=4.45.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 3))
  Downloading PyYAML-6.0.2-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (2.1 kB)
Collecting regex!=2019.12.17 (from transformers<5.0.0,>=4.45.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 3))
  Downloading regex-2024.11.6-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (40 kB)
Collecting requests (from transformers<5.0.0,>=4.45.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 3))
  Downloading requests-2.32.3-py3-none-any.whl.metadata (4.6 kB)
Collecting safetensors>=0.4.1 (from transformers<5.0.0,>=4.45.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 3))
  Downloading safetensors-0.4.5-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (3.8 kB)
Collecting tokenizers<0.21,>=0.20 (from transformers<5.0.0,>=4.45.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 3))
  Downloading tokenizers-0.20.3-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (6.7 kB)
Collecting tqdm>=4.27 (from transformers<5.0.0,>=4.45.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 3))
  Downloading tqdm-4.67.0-py3-none-any.whl.metadata (57 kB)
Collecting typing-extensions>=4.8.0 (from torch~=2.2.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_hf_to_gguf.txt (line 3))
  Downloading typing_extensions-4.12.2-py3-none-any.whl.metadata (3.0 kB)
Collecting sympy (from torch~=2.2.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_hf_to_gguf.txt (line 3))
  Downloading sympy-1.13.3-py3-none-any.whl.metadata (12 kB)
Collecting networkx (from torch~=2.2.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_hf_to_gguf.txt (line 3))
  Downloading https://download.pytorch.org/whl/networkx-3.2.1-py3-none-any.whl (1.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 26.0 MB/s eta 0:00:00
Collecting jinja2 (from torch~=2.2.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_hf_to_gguf.txt (line 3))
  Downloading jinja2-3.1.4-py3-none-any.whl.metadata (2.6 kB)
Collecting fsspec (from torch~=2.2.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_hf_to_gguf.txt (line 3))
  Downloading fsspec-2024.10.0-py3-none-any.whl.metadata (11 kB)
Collecting MarkupSafe>=2.0 (from jinja2->torch~=2.2.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_hf_to_gguf.txt (line 3))
  Downloading MarkupSafe-3.0.2-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (4.0 kB)
Collecting charset-normalizer<4,>=2 (from requests->transformers<5.0.0,>=4.45.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 3))
  Downloading charset_normalizer-3.4.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (34 kB)
Collecting idna<4,>=2.5 (from requests->transformers<5.0.0,>=4.45.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 3))
  Downloading idna-3.10-py3-none-any.whl.metadata (10 kB)
Collecting urllib3<3,>=1.21.1 (from requests->transformers<5.0.0,>=4.45.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 3))
  Downloading urllib3-2.2.3-py3-none-any.whl.metadata (6.5 kB)
Collecting certifi>=2017.4.17 (from requests->transformers<5.0.0,>=4.45.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_legacy_llama.txt (line 3))
  Downloading certifi-2024.8.30-py3-none-any.whl.metadata (2.2 kB)
Collecting mpmath<1.4,>=1.1.0 (from sympy->torch~=2.2.1->-r 3rdparty/llama.cpp/requirements/requirements-convert_hf_to_gguf.txt (line 3))
  Downloading https://download.pytorch.org/whl/mpmath-1.3.0-py3-none-any.whl (536 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 kB 21.3 MB/s eta 0:00:00
Downloading numpy-1.26.4-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (14.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.2/14.2 MB 25.1 MB/s eta 0:00:00
Downloading sentencepiece-0.2.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 28.7 MB/s eta 0:00:00
Downloading transformers-4.46.2-py3-none-any.whl (10.0 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.0/10.0 MB 28.4 MB/s eta 0:00:00
Downloading gguf-0.10.0-py3-none-any.whl (71 kB)
Downloading protobuf-4.25.5-cp37-abi3-manylinux2014_aarch64.whl (293 kB)
Downloading huggingface_hub-0.26.2-py3-none-any.whl (447 kB)
Downloading fsspec-2024.10.0-py3-none-any.whl (179 kB)
Downloading packaging-24.2-py3-none-any.whl (65 kB)
Downloading PyYAML-6.0.2-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (720 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 720.9/720.9 kB 24.7 MB/s eta 0:00:00
Downloading regex-2024.11.6-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (782 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 782.0/782.0 kB 22.6 MB/s eta 0:00:00
Downloading safetensors-0.4.5-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (442 kB)
Downloading tokenizers-0.20.3-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (2.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.9/2.9 MB 23.6 MB/s eta 0:00:00
Downloading tqdm-4.67.0-py3-none-any.whl (78 kB)
Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Downloading filelock-3.16.1-py3-none-any.whl (16 kB)
Downloading jinja2-3.1.4-py3-none-any.whl (133 kB)
Downloading requests-2.32.3-py3-none-any.whl (64 kB)
Downloading sympy-1.13.3-py3-none-any.whl (6.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.2/6.2 MB 29.0 MB/s eta 0:00:00
Downloading certifi-2024.8.30-py3-none-any.whl (167 kB)
Downloading charset_normalizer-3.4.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (140 kB)
Downloading idna-3.10-py3-none-any.whl (70 kB)
Downloading MarkupSafe-3.0.2-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (21 kB)
Downloading urllib3-2.2.3-py3-none-any.whl (126 kB)
Installing collected packages: sentencepiece, mpmath, urllib3, typing-extensions, tqdm, sympy, safetensors, regex, pyyaml, protobuf, packaging, numpy, networkx, MarkupSafe, idna, fsspec, filelock, charset-normalizer, certifi, requests, jinja2, gguf, torch, huggingface-hub, tokenizers, transformers
Successfully installed MarkupSafe-3.0.2 certifi-2024.8.30 charset-normalizer-3.4.0 filelock-3.16.1 fsspec-2024.10.0 gguf-0.10.0 huggingface-hub-0.26.2 idna-3.10 jinja2-3.1.4 mpmath-1.3.0 networkx-3.2.1 numpy-1.26.4 packaging-24.2 protobuf-4.25.5 pyyaml-6.0.2 regex-2024.11.6 requests-2.32.3 safetensors-0.4.5 sentencepiece-0.2.0 sympy-1.13.3 tokenizers-0.20.3 torch-2.2.2 tqdm-4.67.0 transformers-4.46.2 typing-extensions-4.12.2 urllib3-2.2.3
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
(bitnet-cpp) root@bitnet-rs-template:/app/bitnet.cpp/BitNet# python setup_env.py --hf-repo HF1BitLLM/Llama3-8B-1.58-100B-tokens -q i2_s
INFO:root:Compiling the code using CMake.
ERROR:root:Error occurred while running command: Command '['cmake', '--build', 'build', '--config', 'Release']' returned non-zero exit status 2., check details in logs/compile.log
(bitnet-cpp) root@bitnet-rs-template:/app/bitnet.cpp/BitNet# cat logs/compile.log
[  1%] Building C object 3rdparty/llama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml.c.o
cc1: warning: command-line option ‘-fpermissive’ is valid for C++/ObjC++ but not for C
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:12514:6: warning: no previous prototype for ‘float_act_quant’ [-Wmissing-prototypes]
12514 | void float_act_quant(const int K, float* B, int32_t* dst, float* act_scale) {
      |      ^~~~~~~~~~~~~~~
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:12530:6: warning: no previous prototype for ‘weight_quant_f32’ [-Wmissing-prototypes]
12530 | void weight_quant_f32(const int M, const int K, float* A, int32_t* dst, float* i2_scale) {
      |      ^~~~~~~~~~~~~~~~
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c: In function ‘weight_quant_f32’:
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:12545:35: warning: implicit conversion from ‘float’ to ‘double’ to match other operand of binary expression [-Wdouble-promotion]
12545 |             dst[i] = (double)A[i] * i2_scale[0] > 0 ? 1 : -1;
      |                                   ^
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c: At top level:
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:12550:6: warning: no previous prototype for ‘weight_quant_f16’ [-Wmissing-prototypes]
12550 | void weight_quant_f16(const int M, const int K, uint16_t* A, int32_t* dst, float* i2_scale) {
      |      ^~~~~~~~~~~~~~~~
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c: In function ‘weight_quant_f16’:
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:12566:37: warning: implicit conversion from ‘float’ to ‘double’ to match other operand of binary expression [-Wdouble-promotion]
12566 |             dst[i] = (double)temp_A * i2_scale[0] > 0 ? 1 : -1;
      |                                     ^
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c: At top level:
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:12571:6: warning: no previous prototype for ‘matrixMultiply_int’ [-Wmissing-prototypes]
12571 | void matrixMultiply_int(const int M, const int N, const int K, const int32_t* A, const int32_t* B, int32_t* C) {
      |      ^~~~~~~~~~~~~~~~~~
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c: In function ‘ggml_compute_forward_mul_mat’:
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:12660:44: warning: initialization of ‘bitnet_float_type *’ {aka ‘float *’} from incompatible pointer type ‘char *’ [-Wincompatible-pointer-types]
12660 |         bitnet_float_type * bitnet_f_ptr = wdata;
      |                                            ^~~~~
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:12664:25: warning: pointer targets in initialization of ‘int8_t *’ {aka ‘signed char *’} from ‘char *’ differ in signedness [-Wpointer-sign]
12664 |         int8_t * qlut = cur_wdata;
      |                         ^~~~~~~~~
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:12675:42: warning: passing argument 1 of ‘ggml_bitnet_transform_tensor’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
12675 |             ggml_bitnet_transform_tensor(src0);
      |                                          ^~~~
In file included from /app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:50:
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/../../../../include/ggml-bitnet.h:35:65: note: expected ‘struct ggml_tensor *’ but argument is of type ‘const struct ggml_tensor *’
   35 | GGML_API void ggml_bitnet_transform_tensor(struct ggml_tensor * tensor);
      |                                            ~~~~~~~~~~~~~~~~~~~~~^~~~~~
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:12679:51: warning: passing argument 2 of ‘ggml_fp32_to_fp16_row’ from incompatible pointer type [-Wincompatible-pointer-types]
12679 |                 ggml_fp32_to_fp16_row(src1->data, bitnet_f_ptr, ne10 * ne11);
      |                                                   ^~~~~~~~~~~~
      |                                                   |
      |                                                   bitnet_float_type * {aka float *}
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:541:59: note: expected ‘ggml_fp16_t *’ {aka ‘short unsigned int *’} but argument is of type ‘bitnet_float_type *’ {aka ‘float *’}
  541 | void ggml_fp32_to_fp16_row(const float * x, ggml_fp16_t * y, int64_t n) {
      |                                             ~~~~~~~~~~~~~~^
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:12722:50: warning: passing argument 1 of ‘ggml_fp16_to_fp32_row’ from incompatible pointer type [-Wincompatible-pointer-types]
12722 |                 ggml_fp16_to_fp32_row(act_output + dst_offset, (float *) dst->data + dst_offset, ne01 / n_tile_num);
      |                                       ~~~~~~~~~~~^~~~~~~~~~~~
      |                                                  |
      |                                                  bitnet_float_type * {aka float *}
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:535:48: note: expected ‘const ggml_fp16_t *’ {aka ‘const short unsigned int *’} but argument is of type ‘bitnet_float_type *’ {aka ‘float *’}
  535 | void ggml_fp16_to_fp32_row(const ggml_fp16_t * x, float * y, int64_t n) {
      |                            ~~~~~~~~~~~~~~~~~~~~^
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:12712:23: warning: unused variable ‘qlut_offset’ [-Wunused-variable]
12712 |             const int qlut_offset       = 0;
      |                       ^~~~~~~~~~~
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:12702:19: warning: unused variable ‘lut_tile_size’ [-Wunused-variable]
12702 |         const int lut_tile_size    = lut_size / n_tile_num;
      |                   ^~~~~~~~~~~~~
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:12666:29: warning: unused variable ‘lut_biases’ [-Wunused-variable]
12666 |         bitnet_float_type * lut_biases = (bitnet_float_type *) (lut_scales + wt->lut_scales_size * ne11);
      |                             ^~~~~~~~~~
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/ggml.c:12653:19: warning: unused variable ‘bits’ [-Wunused-variable]
12653 |         const int bits = ggml_bitnet_get_type_bits(type);
      |                   ^~~~
[  2%] Building C object 3rdparty/llama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml-alloc.c.o
cc1: warning: command-line option ‘-fpermissive’ is valid for C++/ObjC++ but not for C
[  3%] Building CXX object 3rdparty/llama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml-backend.cpp.o
[  4%] Building C object 3rdparty/llama.cpp/ggml/src/CMakeFiles/ggml.dir/ggml-quants.c.o
cc1: warning: command-line option ‘-fpermissive’ is valid for C++/ObjC++ but not for C
[  5%] Building CXX object 3rdparty/llama.cpp/ggml/src/CMakeFiles/ggml.dir/__/__/__/__/src/ggml-bitnet-mad.cpp.o
In file included from /app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/./ggml-quants.h:4,
                 from /app/bitnet.cpp/BitNet/src/ggml-bitnet-mad.cpp:5:
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/./ggml-common.h:154:16: warning: ISO C++ prohibits anonymous structs [-Wpedantic]
  154 |         struct {
      |                ^
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/./ggml-common.h:175:16: warning: ISO C++ prohibits anonymous structs [-Wpedantic]
  175 |         struct {
      |                ^
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/./ggml-common.h:196:16: warning: ISO C++ prohibits anonymous structs [-Wpedantic]
  196 |         struct {
      |                ^
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/./ggml-common.h:261:16: warning: ISO C++ prohibits anonymous structs [-Wpedantic]
  261 |         struct {
      |                ^
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/./ggml-common.h:288:16: warning: ISO C++ prohibits anonymous structs [-Wpedantic]
  288 |         struct {
      |                ^
/app/bitnet.cpp/BitNet/3rdparty/llama.cpp/ggml/src/./ggml-common.h:305:16: warning: ISO C++ prohibits anonymous structs [-Wpedantic]
  305 |         struct {
      |                ^
/app/bitnet.cpp/BitNet/src/ggml-bitnet-mad.cpp: In function ‘size_t quantize_i2_s(const float*, void*, int64_t, int64_t, const float*)’:
/app/bitnet.cpp/BitNet/src/ggml-bitnet-mad.cpp:46:100: warning: unused parameter ‘quant_weights’ [-Wunused-parameter]
   46 | size_t quantize_i2_s(const float * src, void * dst, int64_t nrow, int64_t n_per_row, const float * quant_weights) {
      |                                                                                      ~~~~~~~~~~~~~~^~~~~~~~~~~~~
/app/bitnet.cpp/BitNet/src/ggml-bitnet-mad.cpp: In function ‘void ggml_vec_dot_i2_i8_s(int, float*, size_t, const void*, size_t, const void*, size_t, int)’:
/app/bitnet.cpp/BitNet/src/ggml-bitnet-mad.cpp:95:28: warning: cast from type ‘const void*’ to type ‘uint8_t*’ {aka ‘unsigned char*’} casts away qualifiers [-Wcast-qual]
   95 |     const uint8_t *    x = (uint8_t *)vx;
      |                            ^~~~~~~~~~~~~
/app/bitnet.cpp/BitNet/src/ggml-bitnet-mad.cpp:96:28: warning: cast from type ‘const void*’ to type ‘int8_t*’ {aka ‘signed char*’} casts away qualifiers [-Wcast-qual]
   96 |     const int8_t  *    y = (int8_t *)vy;
      |                            ^~~~~~~~~~~~
/app/bitnet.cpp/BitNet/src/ggml-bitnet-mad.cpp:94:52: warning: unused parameter ‘bs’ [-Wunused-parameter]
   94 | void ggml_vec_dot_i2_i8_s(int n, float * s, size_t bs, const void * vx, size_t bx, const void * vy, size_t by, int nrc) {
      |                                             ~~~~~~~^~
/app/bitnet.cpp/BitNet/src/ggml-bitnet-mad.cpp:94:80: warning: unused parameter ‘bx’ [-Wunused-parameter]
   94 | void ggml_vec_dot_i2_i8_s(int n, float * s, size_t bs, const void * vx, size_t bx, const void * vy, size_t by, int nrc) {
      |                                                                         ~~~~~~~^~
/app/bitnet.cpp/BitNet/src/ggml-bitnet-mad.cpp:94:108: warning: unused parameter ‘by’ [-Wunused-parameter]
   94 | void ggml_vec_dot_i2_i8_s(int n, float * s, size_t bs, const void * vx, size_t bx, const void * vy, size_t by, int nrc) {
      |                                                                                                     ~~~~~~~^~
/app/bitnet.cpp/BitNet/src/ggml-bitnet-mad.cpp:94:116: warning: unused parameter ‘nrc’ [-Wunused-parameter]
   94 | void ggml_vec_dot_i2_i8_s(int n, float * s, size_t bs, const void * vx, size_t bx, const void * vy, size_t by, int nrc) {
      |                                                                                                                ~~~~^~~
during GIMPLE pass: vect
/app/bitnet.cpp/BitNet/src/ggml-bitnet-mad.cpp: In function ‘size_t quantize_i2_s(const float*, void*, int64_t, int64_t, const float*)’:
/app/bitnet.cpp/BitNet/src/ggml-bitnet-mad.cpp:46:8: internal compiler error: in vect_transform_reduction, at tree-vect-loop.cc:7457
   46 | size_t quantize_i2_s(const float * src, void * dst, int64_t nrow, int64_t n_per_row, const float * quant_weights) {
      |        ^~~~~~~~~~~~~
0x18e9327 internal_error(char const*, ...)
        ???:0
0x6a6c8f fancy_abort(char const*, int, char const*)
        ???:0
0xf6c43b vect_transform_reduction(_loop_vec_info*, _stmt_vec_info*, gimple_stmt_iterator*, gimple**, _slp_tree*)
        ???:0
0x18aedcf vect_transform_stmt(vec_info*, _stmt_vec_info*, gimple_stmt_iterator*, _slp_tree*, _slp_instance*)
        ???:0
0xf79bcf vect_transform_loop(_loop_vec_info*, gimple*)
        ???:0
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <file:///usr/share/doc/gcc-12/README.Bugs> for instructions.
gmake[2]: *** [3rdparty/llama.cpp/ggml/src/CMakeFiles/ggml.dir/build.make:132: 3rdparty/llama.cpp/ggml/src/CMakeFiles/ggml.dir/__/__/__/__/src/ggml-bitnet-mad.cpp.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:759: 3rdparty/llama.cpp/ggml/src/CMakeFiles/ggml.dir/all] Error 2
gmake: *** [Makefile:136: all] Error 2
(bitnet-cpp) root@bitnet-rs-template:/app/bitnet.cpp/BitNet# 
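
The crash at the end of compile.log is a GCC internal compiler error (ICE) in the loop vectorizer while compiling ggml-bitnet-mad.cpp, so it points at the toolchain rather than the sources. A common workaround, offered here as an assumption and not a verified fix for this repo, is to disable the failing optimization (`-fno-tree-vectorize` in CFLAGS/CXXFLAGS) or to build with clang (`CC=clang CXX=clang++`). A quick way to confirm the local compiler accepts the flag before wiring it into the build:

```shell
# Sanity-check: compile a trivial file with -fno-tree-vectorize (both GCC and
# clang accept the flag) before adding it to the build's C/CXX flags.
tmp=$(mktemp -d)
printf 'int main(void) { return 0; }\n' > "$tmp/probe.c"
if cc -O3 -fno-tree-vectorize -o "$tmp/probe" "$tmp/probe.c"; then
  echo 'flag accepted'
fi
rm -rf "$tmp"
```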
noppej commented 13 minutes ago

@potassiummmm ... the output of generate_build_files.log also seems relevant. PS: Thanks for your help.

# cat generate_build_files.log
-- The C compiler identification is GNU 12.2.0
-- The CXX compiler identification is GNU 12.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found Git: /usr/bin/git (found version "2.39.5") 
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- OpenMP found
-- Using llamafile
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: aarch64
-- ARM detected
-- Performing Test COMPILER_SUPPORTS_FP16_FORMAT_I3E
-- Performing Test COMPILER_SUPPORTS_FP16_FORMAT_I3E - Failed
-- Configuring done
-- Generating done
-- Build files have been written to: /app/bitnet.cpp/BitNet/build