I'm using Windows 10 and I have built llama.cpp with cmake without error.
I have also built the .gguf file from the weights of the Mistral 7B Dolphin model, again without error.
Under the heading 'Prepare Data & Run' in the llama.cpp README it says:
quantize the model to 4-bits (using q4_0 method)
./quantize ./models/7B/ggml-model-f16.gguf ./models/7B/ggml-model-q4_0.gguf q4_0
But there is no quantize file in the repository root.
There is a quantize.exe in the build/bin/Release folder, but I don't know how it is used, or whether it is even meant for quantizing a .gguf file.
I tried running a lot of commands like the ones below:
(mllm) C:\Users\ooo\tor\llama.cpp>./quantize ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
'.' is not recognized as an internal or external command,
operable program or batch file.
(mllm) C:\Users\ooo\tor\llama.cpp>build/bin/Release/quantize/ ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
'build' is not recognized as an internal or external command,
operable program or batch file.
(mllm) C:\Users\ooo\tor\llama.cpp>./build/bin/Release/quantize/ ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
'.' is not recognized as an internal or external command,
operable program or batch file.
(mllm) C:\Users\ooo\tor\llama.cpp>./build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
'.' is not recognized as an internal or external command,
operable program or batch file.
(mllm) C:\Users\ooo\tor\llama.cpp>/build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
The system cannot find the path specified.
(mllm) C:\Users\ooo\tor\llama.cpp>dir
Volume in drive C has no label.
Volume Serial Number is 489C-B845
Directory of C:\Users\ooo\tor\llama.cpp
01/09/2024 01:11 PM <DIR> .
01/09/2024 01:11 PM <DIR> ..
01/08/2024 07:14 PM 774 .clang-tidy
01/08/2024 07:14 PM <DIR> .devops
01/08/2024 07:14 PM 167 .dockerignore
01/08/2024 07:14 PM 51 .ecrc
01/08/2024 07:14 PM 529 .editorconfig
01/08/2024 07:14 PM 33 .flake8
01/08/2024 07:14 PM <DIR> .github
01/08/2024 07:14 PM 1,336 .gitignore
01/08/2024 07:14 PM 413 .pre-commit-config.yaml
01/09/2024 01:11 PM <DIR> .vs
01/08/2024 07:14 PM <DIR> awq-py
01/09/2024 01:13 PM <DIR> build
01/08/2024 07:14 PM 6,425 build.zig
01/08/2024 07:14 PM <DIR> ci
01/08/2024 07:14 PM <DIR> cmake
01/08/2024 07:14 PM 34,734 CMakeLists.txt
01/08/2024 07:14 PM 224 codecov.yml
01/09/2024 01:13 PM <DIR> common
01/08/2024 07:14 PM 55,972 convert-hf-to-gguf.py
01/08/2024 07:14 PM 19,292 convert-llama-ggml-to-gguf.py
01/08/2024 07:14 PM 5,186 convert-lora-to-ggml.py
01/08/2024 07:14 PM 4,996 convert-persimmon-to-gguf.py
01/08/2024 07:14 PM 53,211 convert.py
01/08/2024 07:14 PM <DIR> docs
01/08/2024 07:14 PM <DIR> examples
01/08/2024 07:14 PM 1,646 flake.lock
01/08/2024 07:14 PM 5,625 flake.nix
01/08/2024 07:14 PM 29,384 ggml-alloc.c
01/08/2024 07:14 PM 3,942 ggml-alloc.h
01/08/2024 07:14 PM 5,019 ggml-backend-impl.h
01/08/2024 07:14 PM 53,940 ggml-backend.c
01/08/2024 07:14 PM 9,177 ggml-backend.h
01/08/2024 07:14 PM 400,244 ggml-cuda.cu
01/08/2024 07:14 PM 2,569 ggml-cuda.h
01/08/2024 07:14 PM 7,757 ggml-impl.h
01/08/2024 07:14 PM 4,449 ggml-metal.h
01/08/2024 07:14 PM 151,946 ggml-metal.m
01/08/2024 07:14 PM 199,666 ggml-metal.metal
01/08/2024 07:14 PM 7,135 ggml-mpi.c
01/08/2024 07:14 PM 950 ggml-mpi.h
01/08/2024 07:14 PM 72,884 ggml-opencl.cpp
01/08/2024 07:14 PM 951 ggml-opencl.h
01/08/2024 07:14 PM 297,845 ggml-quants.c
01/08/2024 07:14 PM 11,218 ggml-quants.h
01/08/2024 07:14 PM 675,589 ggml.c
01/08/2024 07:14 PM 85,276 ggml.h
01/08/2024 07:14 PM <DIR> gguf-py
01/08/2024 07:14 PM <DIR> grammars
01/08/2024 07:14 PM 1,093 LICENSE
01/08/2024 07:14 PM 438,385 llama.cpp
01/08/2024 07:14 PM 40,382 llama.h
01/08/2024 07:14 PM 26,229 Makefile
01/08/2024 07:14 PM <DIR> media
01/09/2024 04:40 PM <DIR> models
01/08/2024 07:14 PM 145 mypy.ini
01/09/2024 01:10 PM <DIR> out
01/08/2024 07:14 PM 1,495 Package.swift
01/08/2024 07:14 PM <DIR> pocs
01/08/2024 07:14 PM <DIR> prompts
01/08/2024 07:14 PM 55,242 README.md
01/08/2024 07:14 PM <DIR> requirements
01/08/2024 07:14 PM 504 requirements.txt
01/08/2024 07:14 PM 5,453 run_with_preset.py
01/08/2024 07:14 PM <DIR> scripts
01/08/2024 07:14 PM 3,869 SHA256SUMS
01/08/2024 07:14 PM <DIR> spm-headers
01/08/2024 07:14 PM <DIR> tests
01/08/2024 07:14 PM 47,547 unicode.h
47 File(s) 2,830,899 bytes
23 Dir(s) 96,381,542,400 bytes free
(mllm) C:\Users\ooo\tor\llama.cpp>/build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
The system cannot find the path specified.
(mllm) C:\Users\ooo\tor\llama.cpp>llama.cpp/build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
(mllm) C:\Users\ooo\tor\llama.cpp>python llama.cpp/build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
python: can't open file 'C:\\Users\\ooo\\tor\\llama.cpp\\llama.cpp\\build\\bin\\Release\\quantize.exe': [Errno 2] No such file or directory
(mllm) C:\Users\ooo\tor\llama.cpp>python /build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
python: can't open file 'C:\\build\\bin\\Release\\quantize.exe': [Errno 2] No such file or directory
(mllm) C:\Users\ooo\tor\llama.cpp>python build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
SyntaxError: Non-UTF-8 code starting with '\x90' in file C:\Users\ooo\tor\llama.cpp\build\bin\Release\quantize.exe on line 2, but no encoding declared; see https://python.org/dev/peps/pep-0263/ for details
(mllm) C:\Users\ooo\tor\llama.cpp>llama.cpp/build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
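Is the intended Windows equivalent simply to call the built executable from cmd with backslashes and without the ./ prefix? This is only my guess, assuming the CMake Release build really did put quantize.exe under build\bin\Release and that my model paths are correct:

REM my guess at the cmd.exe invocation (paths are assumptions based on my setup)
build\bin\Release\quantize.exe .\models\dolph\ggml-model-f16.gguf .\models\dolph\ggml-model-q5_0.gguf q5_0

If that is not the right way to call it, what is the correct command for quantizing a .gguf file on Windows?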