ggerganov / llama.cpp

LLM inference in C/C++
MIT License

I'm looking at this example that calls for ./quantize and I don't see a quantize folder #4845

Closed · MotorCityCobra closed this issue 6 months ago

MotorCityCobra commented 6 months ago

I'm using Windows 10 and I have built llama.cpp with cmake without error.
I have also built the .gguf file from the weights for the Mistral 7B dolphin model without error.

Under the heading 'Prepare Data & Run'

quantize the model to 4-bits (using q4_0 method)

./quantize ./models/7B/ggml-model-f16.gguf ./models/7B/ggml-model-q4_0.gguf q4_0

But there is no quantize directory.

There is a 'quantize.exe' in the build/bin/Release folder, but I don't know how it is used, or whether it is even for the purpose of quantizing a .gguf file.

I tried running a lot of commands like the ones below:

(mllm) C:\Users\ooo\tor\llama.cpp>./quantize ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
'.' is not recognized as an internal or external command,
operable program or batch file.

(mllm) C:\Users\ooo\tor\llama.cpp>build/bin/Release/quantize/ ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
'build' is not recognized as an internal or external command,
operable program or batch file.

(mllm) C:\Users\ooo\tor\llama.cpp>./build/bin/Release/quantize/ ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
'.' is not recognized as an internal or external command,
operable program or batch file.

(mllm) C:\Users\ooo\tor\llama.cpp>./build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
'.' is not recognized as an internal or external command,
operable program or batch file.

(mllm) C:\Users\ooo\tor\llama.cpp>/build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
The system cannot find the path specified.

(mllm) C:\Users\ooo\tor\llama.cpp>dir
 Volume in drive C has no label.
 Volume Serial Number is 489C-B845

 Directory of C:\Users\ooo\tor\llama.cpp

01/09/2024  01:11 PM    <DIR>          .
01/09/2024  01:11 PM    <DIR>          ..
01/08/2024  07:14 PM               774 .clang-tidy
01/08/2024  07:14 PM    <DIR>          .devops
01/08/2024  07:14 PM               167 .dockerignore
01/08/2024  07:14 PM                51 .ecrc
01/08/2024  07:14 PM               529 .editorconfig
01/08/2024  07:14 PM                33 .flake8
01/08/2024  07:14 PM    <DIR>          .github
01/08/2024  07:14 PM             1,336 .gitignore
01/08/2024  07:14 PM               413 .pre-commit-config.yaml
01/09/2024  01:11 PM    <DIR>          .vs
01/08/2024  07:14 PM    <DIR>          awq-py
01/09/2024  01:13 PM    <DIR>          build
01/08/2024  07:14 PM             6,425 build.zig
01/08/2024  07:14 PM    <DIR>          ci
01/08/2024  07:14 PM    <DIR>          cmake
01/08/2024  07:14 PM            34,734 CMakeLists.txt
01/08/2024  07:14 PM               224 codecov.yml
01/09/2024  01:13 PM    <DIR>          common
01/08/2024  07:14 PM            55,972 convert-hf-to-gguf.py
01/08/2024  07:14 PM            19,292 convert-llama-ggml-to-gguf.py
01/08/2024  07:14 PM             5,186 convert-lora-to-ggml.py
01/08/2024  07:14 PM             4,996 convert-persimmon-to-gguf.py
01/08/2024  07:14 PM            53,211 convert.py
01/08/2024  07:14 PM    <DIR>          docs
01/08/2024  07:14 PM    <DIR>          examples
01/08/2024  07:14 PM             1,646 flake.lock
01/08/2024  07:14 PM             5,625 flake.nix
01/08/2024  07:14 PM            29,384 ggml-alloc.c
01/08/2024  07:14 PM             3,942 ggml-alloc.h
01/08/2024  07:14 PM             5,019 ggml-backend-impl.h
01/08/2024  07:14 PM            53,940 ggml-backend.c
01/08/2024  07:14 PM             9,177 ggml-backend.h
01/08/2024  07:14 PM           400,244 ggml-cuda.cu
01/08/2024  07:14 PM             2,569 ggml-cuda.h
01/08/2024  07:14 PM             7,757 ggml-impl.h
01/08/2024  07:14 PM             4,449 ggml-metal.h
01/08/2024  07:14 PM           151,946 ggml-metal.m
01/08/2024  07:14 PM           199,666 ggml-metal.metal
01/08/2024  07:14 PM             7,135 ggml-mpi.c
01/08/2024  07:14 PM               950 ggml-mpi.h
01/08/2024  07:14 PM            72,884 ggml-opencl.cpp
01/08/2024  07:14 PM               951 ggml-opencl.h
01/08/2024  07:14 PM           297,845 ggml-quants.c
01/08/2024  07:14 PM            11,218 ggml-quants.h
01/08/2024  07:14 PM           675,589 ggml.c
01/08/2024  07:14 PM            85,276 ggml.h
01/08/2024  07:14 PM    <DIR>          gguf-py
01/08/2024  07:14 PM    <DIR>          grammars
01/08/2024  07:14 PM             1,093 LICENSE
01/08/2024  07:14 PM           438,385 llama.cpp
01/08/2024  07:14 PM            40,382 llama.h
01/08/2024  07:14 PM            26,229 Makefile
01/08/2024  07:14 PM    <DIR>          media
01/09/2024  04:40 PM    <DIR>          models
01/08/2024  07:14 PM               145 mypy.ini
01/09/2024  01:10 PM    <DIR>          out
01/08/2024  07:14 PM             1,495 Package.swift
01/08/2024  07:14 PM    <DIR>          pocs
01/08/2024  07:14 PM    <DIR>          prompts
01/08/2024  07:14 PM            55,242 README.md
01/08/2024  07:14 PM    <DIR>          requirements
01/08/2024  07:14 PM               504 requirements.txt
01/08/2024  07:14 PM             5,453 run_with_preset.py
01/08/2024  07:14 PM    <DIR>          scripts
01/08/2024  07:14 PM             3,869 SHA256SUMS
01/08/2024  07:14 PM    <DIR>          spm-headers
01/08/2024  07:14 PM    <DIR>          tests
01/08/2024  07:14 PM            47,547 unicode.h
              47 File(s)      2,830,899 bytes
              23 Dir(s)  96,381,542,400 bytes free

(mllm) C:\Users\ooo\tor\llama.cpp>/build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
The system cannot find the path specified.

(mllm) C:\Users\ooo\tor\llama.cpp>llama.cpp/build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0

(mllm) C:\Users\ooo\tor\llama.cpp>python llama.cpp/build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
python: can't open file 'C:\\Users\\ooo\\tor\\llama.cpp\\llama.cpp\\build\\bin\\Release\\quantize.exe': [Errno 2] No such file or directory

(mllm) C:\Users\ooo\tor\llama.cpp>python /build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
python: can't open file 'C:\\build\\bin\\Release\\quantize.exe': [Errno 2] No such file or directory

(mllm) C:\Users\ooo\tor\llama.cpp>python build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
SyntaxError: Non-UTF-8 code starting with '\x90' in file C:\Users\ooo\tor\llama.cpp\build\bin\Release\quantize.exe on line 2, but no encoding declared; see https://python.org/dev/peps/pep-0263/ for details

(mllm) C:\Users\ooo\tor\llama.cpp>llama.cpp/build/bin/Release/quantize.exe ./models/dolph/ggml-model-f16.gguf ./models/dolph/ggml-model-q5_0.gguf q5_0
MotorCityCobra commented 6 months ago

C:\Users\ooo\tor\llama.cpp\build\bin\Release>quantize.exe ../../../models/dolph/ggml-model-f16.gguf ../../../models/dolph/ggml-model-q5_0.gguf q5_0
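For reference, the failures above are cmd.exe path-syntax issues rather than build problems: the Windows command prompt does not understand the POSIX `./` prefix, and a leading `/` is resolved against the drive root. A sketch of invocation forms that should work, assuming the default cmake output layout (`build\bin\Release`) and the model paths used in this thread:

```shell
:: From the repository root in cmd.exe: use backslashes, or just a
:: bare relative path with no "./" prefix.
cd C:\Users\ooo\tor\llama.cpp
build\bin\Release\quantize.exe models\dolph\ggml-model-f16.gguf models\dolph\ggml-model-q5_0.gguf q5_0

:: In PowerShell, the ".\" prefix is accepted:
.\build\bin\Release\quantize.exe .\models\dolph\ggml-model-f16.gguf .\models\dolph\ggml-model-q5_0.gguf q5_0
```

The last command in the thread works for the same reason: running `quantize.exe` from inside `build\bin\Release` lets cmd.exe find the binary by bare name, with the model paths given relative to that directory (`..\..\..\models\...`).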