tc-wolf opened 7 months ago
I just tried this from your branch and received the following error:
piper@Pipers-MacBook-Pro [21:55:09] [~/Development/ai/rust-llama.cpp/examples/basic] [fix_metal_compilation *]
-> % cargo run
Updating crates.io index
Compiling libc v0.2.153
Compiling proc-macro2 v1.0.78
Compiling glob v0.3.1
Compiling unicode-ident v1.0.12
Compiling prettyplease v0.2.16
Compiling rustix v0.38.31
Compiling bitflags v2.4.2
Compiling cfg-if v1.0.0
Compiling memchr v2.7.1
Compiling minimal-lexical v0.2.1
Compiling regex-syntax v0.8.2
Compiling libloading v0.8.3
Compiling bindgen v0.66.1
Compiling either v1.10.0
Compiling home v0.5.9
Compiling nom v7.1.3
Compiling shlex v1.3.0
Compiling log v0.4.21
Compiling clang-sys v1.7.0
Compiling rustc-hash v1.1.0
Compiling lazy_static v1.4.0
Compiling lazycell v1.3.0
Compiling peeking_take_while v0.1.2
Compiling cc v1.0.90
Compiling quote v1.0.35
Compiling errno v0.3.8
Compiling regex-automata v0.4.6
Compiling syn v2.0.52
Compiling cexpr v0.6.0
Compiling which v4.4.2
Compiling regex v1.10.3
Compiling llama_cpp_rs v0.3.0 (/Users/piper/Development/ai/rust-llama.cpp)
The following warnings were emitted during compilation:
...snip...
error: failed to run custom build command for `llama_cpp_rs v0.3.0 (/Users/piper/Development/ai/rust-llama.cpp)`
Caused by:
process didn't exit successfully: `/Users/piper/Development/ai/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-4affa90dbb00da17/build-script-build` (exit status: 1)
...snip...
cargo:warning=ar: /Users/piper/Development/ai/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-fb22881edaf8bb0a/out/llama.cpp/ggml.o: No such file or directory
--- stderr
error occurred: Command ZERO_AR_DATE="1" "ar" "cq" "/Users/piper/Development/ai/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-3624f1876eea0dec/out/libbinding.a" "/Users/piper/Development/ai/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-3624f1876eea0dec/out/073db387043af495-common.o" "/Users/piper/Development/ai/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-3624f1876eea0dec/out/30b5508d68fcb5a8-llama.o" "/Users/piper/Development/ai/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-3624f1876eea0dec/out/8f1a5a601f45df90-binding.o" "/Users/piper/Development/ai/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-3624f1876eea0dec/out/llama.cpp/ggml.o" with args "ar" did not execute successfully (status code exit status: 1).
The actual files written to the target directory look as follows:
% ls -la target/debug/build/llama_cpp_rs-fb22881edaf8bb0a/out/
total 57112
drwxr-xr-x 14 piper staff 448 Mar 6 22:28 .
drwxr-xr-x 4 piper staff 128 Mar 6 22:14 ..
-rw-r--r-- 1 piper staff 2698296 Mar 6 22:28 073db387043af495-common.o
-rw-r--r-- 1 piper staff 41264 Mar 6 22:28 30b5508d68fcb5a8-ggml-alloc.o
-rw-r--r-- 1 piper staff 80840 Mar 6 22:28 30b5508d68fcb5a8-ggml-backend.o
-rw-r--r-- 1 piper staff 144760 Mar 6 22:28 30b5508d68fcb5a8-ggml-quants.o
-rw-r--r-- 1 piper staff 879384 Mar 6 22:28 30b5508d68fcb5a8-ggml.o
-rw-r--r-- 1 piper staff 7908960 Mar 6 22:28 30b5508d68fcb5a8-llama.o
-rw-r--r-- 1 piper staff 2210752 Mar 6 22:28 8f1a5a601f45df90-binding.o
-rw-r--r-- 1 piper staff 3361 Mar 6 22:28 bindings.rs
-rw-r--r-- 1 piper staff 334472 Mar 6 22:28 f11d269d62c14936-ggml-metal.o
-rw-r--r-- 1 piper staff 322743 Mar 6 22:28 ggml-metal.m
-rw-r--r-- 1 piper staff 12818276 Mar 6 22:28 libbinding.a
-rw-r--r-- 1 piper staff 1495784 Mar 6 22:28 libggml.a
Hmm, it looks like we need to change `compile_llama` as well (to look for `<hash>-ggml.o`). Not sure why that works differently for you; on my machine it has:
ls target/release/build/llama_cpp_rs-75252caa56296e09/out/*.o
target/release/build/llama_cpp_rs-75252caa56296e09/out/binding.o target/release/build/llama_cpp_rs-75252caa56296e09/out/dcc19d765d8f2f3b-ggml-metal.o
and
ls target/release/build/llama_cpp_rs-75252caa56296e09/out/llama.cpp/*.o
target/release/build/llama_cpp_rs-75252caa56296e09/out/llama.cpp/ggml-alloc.o target/release/build/llama_cpp_rs-75252caa56296e09/out/llama.cpp/ggml-quants.o target/release/build/llama_cpp_rs-75252caa56296e09/out/llama.cpp/llama.o
target/release/build/llama_cpp_rs-75252caa56296e09/out/llama.cpp/ggml-backend.o target/release/build/llama_cpp_rs-75252caa56296e09/out/llama.cpp/ggml.o
This feels like an antipattern though; there should be a better way of getting cc-rs to link in the needed object files (is there some linker path we can specify?) without explicitly naming each individual file.
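Something like this is what I have in mind — a minimal sketch (not the crate's actual build script) of letting cc-rs own the archive step instead of invoking `ar` on hand-picked objects; the file names here are assumed from the repo layout:

```rust
// build.rs — sketch: hand cc-rs the sources and let it archive the result.
fn main() {
    cc::Build::new()
        .cpp(true)
        .include("./llama.cpp")
        .include("./llama.cpp/common")
        .file("./llama.cpp/llama.cpp")
        .file("./llama.cpp/common/common.cpp")
        .file("./binding.cpp")
        // compile() archives every object it produced (hashed names and all)
        // and prints the cargo:rustc-link-* lines itself.
        .compile("binding");
}
```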
> This feels like an antipattern though
I don't disagree, but I've yet to fully unroll how this all works well enough to suggest a better way. I've forked the mainline and reproduced your changes in my own fork. I'm rather invested in getting this working, and I'm on vacation this coming week, so I'm going to try to see what I can figure out. 🤞🏻 Wish me luck.
(btw, I also have a Linux workstation with a GTX 1070 Ti I can test against. No Windows.)
My laptop setup is pretty much stock: Apple M1 Pro, 16 GB RAM, macOS Sonoma 14.3.1 (23D60). Using the stable toolchain; rust and cargo are up to date.
Whether or not `metal` is enabled makes no difference, due to `ggml.o` being an unconditional dependency. However, your branch does fix the problem with the other files.
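To make explicit why the feature can't matter here: only the Metal object is gated on the feature, while `ggml.o` is compiled either way. Roughly like this (a sketch, not the actual build.rs; file names assumed):

```rust
use std::env;

fn main() {
    let mut build = cc::Build::new();
    // ggml.o is produced unconditionally...
    build.file("./llama.cpp/ggml.c");
    // ...and only the Metal object is feature-gated. Cargo exposes enabled
    // features to build scripts as CARGO_FEATURE_<NAME> env vars.
    if env::var("CARGO_FEATURE_METAL").is_ok() {
        build.file("./llama.cpp/ggml-metal.m");
        println!("cargo:rustc-link-lib=framework=Metal");
        println!("cargo:rustc-link-lib=framework=Foundation");
    }
    build.compile("ggml");
}
```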
I'll pass on going through the trouble of redacting and sharing my environment; suffice it to say I don't have any compiler or linker/library env vars set that could be messing things up.
% rustup check
stable-aarch64-apple-darwin - Up to date : 1.76.0 (07dca489a 2024-02-04)
nightly-aarch64-apple-darwin - Up to date : 1.78.0-nightly (9c3ad802d 2024-03-07)
rustup - Up to date : 1.26.0
% rustup toolchain list
stable-aarch64-apple-darwin (default)
nightly-aarch64-apple-darwin
A potentially important detail I had overlooked: I can build just fine in the top directory of the repository. I can pass in `metal` as a feature, and it works well too.
If I change directory to `examples/basic` and run `cargo run`, it fails. I see the same behavior if I try to use this repo as a cargo dependency in any other project.
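For anyone else chasing this, here's a throwaway `build.rs` fragment (purely illustrative, not part of the crate) that emits a cargo warning for every object file the build actually produced, so the in-tree vs. `examples/basic` difference shows up without digging through `target/` by hand:

```rust
use std::env;
use std::fs;
use std::path::Path;

// Recursively emit a cargo warning for each .o file under a directory.
fn list_objects(dir: &Path) {
    if let Ok(entries) = fs::read_dir(dir) {
        for entry in entries.flatten() {
            let path = entry.path();
            if path.is_dir() {
                list_objects(&path); // e.g. descend into OUT_DIR/llama.cpp/
            } else if path.extension().map_or(false, |ext| ext == "o") {
                println!("cargo:warning=found object: {}", path.display());
            }
        }
    }
}

fn main() {
    let out_dir = env::var("OUT_DIR").expect("build scripts always get OUT_DIR");
    list_objects(Path::new(&out_dir));
}
```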
That is a good detail to note; I can reproduce that error (running the example from the `examples/basic` directory).
I'm using `stable-aarch64-apple-darwin` as well (`1.76.0`). I'll make some changes to my MR and see if I can get it to work with/without metal.
Any updates? I still seem to get this error: `llama.cpp/ggml.o: No such file or directory`
> Any updates? I still seem to get this error: `llama.cpp/ggml.o: No such file or directory`
I don't believe this problem is macOS-specific, or even specific to this PR/branch.
I've cloned the main branch of the upstream repository on Debian 12 and encounter the same issue when building the example or trying to use the package as a cargo dependency.
$ cargo run
Compiling llama_cpp_rs v0.3.0 (/home/piper/rust-llama/mdrokz/rust-llama.cpp)
The following warnings were emitted during compilation:
warning: llama_cpp_rs@0.3.0: Compiler version doesn't include clang or GCC: "cc" "--version"
warning: llama_cpp_rs@0.3.0: ./llama.cpp/ggml.c:17433:13: warning: ‘ggml_opt_get_grad’ defined but not used [-Wunused-function]
warning: llama_cpp_rs@0.3.0: 17433 | static void ggml_opt_get_grad(int np, struct ggml_tensor * const ps[], float * g) {
warning: llama_cpp_rs@0.3.0: | ^~~~~~~~~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./llama.cpp/ggml-backend.c:836:13: warning: ‘sched_print_assignments’ defined but not used [-Wunused-function]
warning: llama_cpp_rs@0.3.0: 836 | static void sched_print_assignments(ggml_backend_sched_t sched, struct ggml_cgraph * graph) {
warning: llama_cpp_rs@0.3.0: | ^~~~~~~~~~~~~~~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./llama.cpp/ggml-quants.c:1337:14: warning: ‘make_qkx1_quants’ defined but not used [-Wunused-function]
warning: llama_cpp_rs@0.3.0: 1337 | static float make_qkx1_quants(int n, int nmax, const float * restrict x, uint8_t * restrict L, float * restrict the_min,
warning: llama_cpp_rs@0.3.0: | ^~~~~~~~~~~~~~~~
warning: llama_cpp_rs@0.3.0: Compiler version doesn't include clang or GCC: "c++" "--version"
warning: llama_cpp_rs@0.3.0: ./binding.cpp: In function ‘int get_embeddings(void*, void*, float*)’:
warning: llama_cpp_rs@0.3.0: ./binding.cpp:81:23: warning: ‘int llama_eval(llama_context*, llama_token*, int32_t, int)’ is deprecated: use llama_decode() instead [-Wdeprecated-declarations]
warning: llama_cpp_rs@0.3.0: 81 | if (llama_eval(ctx, embd_inp.data(), embd_inp.size(), n_past))
warning: llama_cpp_rs@0.3.0: | ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
warning: llama_cpp_rs@0.3.0: In file included from ./llama.cpp/common/common.h:5,
warning: llama_cpp_rs@0.3.0: from ./binding.cpp:1:
warning: llama_cpp_rs@0.3.0: ./llama.cpp/llama.h:532:30: note: declared here
warning: llama_cpp_rs@0.3.0: 532 | LLAMA_API DEPRECATED(int llama_eval(
warning: llama_cpp_rs@0.3.0: | ^~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./llama.cpp/llama.h:31:36: note: in definition of macro ‘DEPRECATED’
warning: llama_cpp_rs@0.3.0: 31 | # define DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
warning: llama_cpp_rs@0.3.0: | ^~~~
warning: llama_cpp_rs@0.3.0: ./binding.cpp: In function ‘int eval(void*, void*, char*)’:
warning: llama_cpp_rs@0.3.0: ./binding.cpp:139:22: warning: ‘int llama_eval(llama_context*, llama_token*, int32_t, int)’ is deprecated: use llama_decode() instead [-Wdeprecated-declarations]
warning: llama_cpp_rs@0.3.0: 139 | return llama_eval(ctx, tokens.data(), n_prompt_tokens, n_past);
warning: llama_cpp_rs@0.3.0: | ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./llama.cpp/llama.h:532:30: note: declared here
warning: llama_cpp_rs@0.3.0: 532 | LLAMA_API DEPRECATED(int llama_eval(
warning: llama_cpp_rs@0.3.0: | ^~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./llama.cpp/llama.h:31:36: note: in definition of macro ‘DEPRECATED’
warning: llama_cpp_rs@0.3.0: 31 | # define DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
warning: llama_cpp_rs@0.3.0: | ^~~~
warning: llama_cpp_rs@0.3.0: ./binding.cpp: In function ‘int llama_predict(void*, void*, char**, bool)’:
warning: llama_cpp_rs@0.3.0: ./binding.cpp:283:19: warning: ‘int llama_eval(llama_context*, llama_token*, int32_t, int)’ is deprecated: use llama_decode() instead [-Wdeprecated-declarations]
warning: llama_cpp_rs@0.3.0: 283 | llama_eval(ctx, tmp, 1, 0);
warning: llama_cpp_rs@0.3.0: | ~~~~~~~~~~^~~~~~~~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./llama.cpp/llama.h:532:30: note: declared here
warning: llama_cpp_rs@0.3.0: 532 | LLAMA_API DEPRECATED(int llama_eval(
warning: llama_cpp_rs@0.3.0: | ^~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./llama.cpp/llama.h:31:36: note: in definition of macro ‘DEPRECATED’
warning: llama_cpp_rs@0.3.0: 31 | # define DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
warning: llama_cpp_rs@0.3.0: | ^~~~
warning: llama_cpp_rs@0.3.0: ./binding.cpp:354:31: warning: ‘int llama_eval(llama_context*, llama_token*, int32_t, int)’ is deprecated: use llama_decode() instead [-Wdeprecated-declarations]
warning: llama_cpp_rs@0.3.0: 354 | if (llama_eval(ctx, &embd[i], n_eval, n_past))
warning: llama_cpp_rs@0.3.0: | ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./llama.cpp/llama.h:532:30: note: declared here
warning: llama_cpp_rs@0.3.0: 532 | LLAMA_API DEPRECATED(int llama_eval(
warning: llama_cpp_rs@0.3.0: | ^~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./llama.cpp/llama.h:31:36: note: in definition of macro ‘DEPRECATED’
warning: llama_cpp_rs@0.3.0: 31 | # define DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
warning: llama_cpp_rs@0.3.0: | ^~~~
warning: llama_cpp_rs@0.3.0: ./binding.cpp:438:49: warning: ‘void llama_sample_temperature(llama_context*, llama_token_data_array*, float)’ is deprecated: use llama_sample_temp instead [-Wdeprecated-declarations]
warning: llama_cpp_rs@0.3.0: 438 | llama_sample_temperature(ctx, &candidates_p, temp);
warning: llama_cpp_rs@0.3.0: | ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./llama.cpp/llama.h:743:31: note: declared here
warning: llama_cpp_rs@0.3.0: 743 | LLAMA_API DEPRECATED(void llama_sample_temperature(
warning: llama_cpp_rs@0.3.0: | ^~~~~~~~~~~~~~~~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./llama.cpp/llama.h:31:36: note: in definition of macro ‘DEPRECATED’
warning: llama_cpp_rs@0.3.0: 31 | # define DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
warning: llama_cpp_rs@0.3.0: | ^~~~
warning: llama_cpp_rs@0.3.0: ./binding.cpp:444:49: warning: ‘void llama_sample_temperature(llama_context*, llama_token_data_array*, float)’ is deprecated: use llama_sample_temp instead [-Wdeprecated-declarations]
warning: llama_cpp_rs@0.3.0: 444 | llama_sample_temperature(ctx, &candidates_p, temp);
warning: llama_cpp_rs@0.3.0: | ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./llama.cpp/llama.h:743:31: note: declared here
warning: llama_cpp_rs@0.3.0: 743 | LLAMA_API DEPRECATED(void llama_sample_temperature(
warning: llama_cpp_rs@0.3.0: | ^~~~~~~~~~~~~~~~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./llama.cpp/llama.h:31:36: note: in definition of macro ‘DEPRECATED’
warning: llama_cpp_rs@0.3.0: 31 | # define DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
warning: llama_cpp_rs@0.3.0: | ^~~~
warning: llama_cpp_rs@0.3.0: ./binding.cpp:454:49: warning: ‘void llama_sample_temperature(llama_context*, llama_token_data_array*, float)’ is deprecated: use llama_sample_temp instead [-Wdeprecated-declarations]
warning: llama_cpp_rs@0.3.0: 454 | llama_sample_temperature(ctx, &candidates_p, temp);
warning: llama_cpp_rs@0.3.0: | ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./llama.cpp/llama.h:743:31: note: declared here
warning: llama_cpp_rs@0.3.0: 743 | LLAMA_API DEPRECATED(void llama_sample_temperature(
warning: llama_cpp_rs@0.3.0: | ^~~~~~~~~~~~~~~~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./llama.cpp/llama.h:31:36: note: in definition of macro ‘DEPRECATED’
warning: llama_cpp_rs@0.3.0: 31 | # define DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
warning: llama_cpp_rs@0.3.0: | ^~~~
warning: llama_cpp_rs@0.3.0: ./binding.cpp:473:42: warning: cast from type ‘const char*’ to type ‘char*’ casts away qualifiers [-Wcast-qual]
warning: llama_cpp_rs@0.3.0: 473 | if (!tokenCallback(state_pr, (char*)token_str.c_str()))
warning: llama_cpp_rs@0.3.0: | ^~~~~~~~~~~~~~~~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./binding.cpp: In function ‘void* llama_allocate_params(const char*, int, int, int, int, float, float, float, int, bool, bool, int, int, const char**, int, float, float, float, float, int, float, float, bool, const char*, const char*, bool, bool, bool, const char*, const char*, bool)’:
warning: llama_cpp_rs@0.3.0: ./binding.cpp:627:100: warning: unused parameter ‘ignore_eos’ [-Wunused-parameter]
warning: llama_cpp_rs@0.3.0: 627 | float top_p, float temp, float repeat_penalty, int repeat_last_n, bool ignore_eos, bool memory_f16, int n_batch, int n_keep, const char **antiprompt, int antiprompt_count,
warning: llama_cpp_rs@0.3.0: | ~~~~~^~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./binding.cpp:627:117: warning: unused parameter ‘memory_f16’ [-Wunused-parameter]
warning: llama_cpp_rs@0.3.0: 627 | float top_p, float temp, float repeat_penalty, int repeat_last_n, bool ignore_eos, bool memory_f16, int n_batch, int n_keep, const char **antiprompt, int antiprompt_count,
warning: llama_cpp_rs@0.3.0: | ~~~~~^~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./binding.cpp: In function ‘void* load_model(const char*, int, int, bool, bool, bool, bool, bool, bool, int, int, const char*, const char*, bool)’:
warning: llama_cpp_rs@0.3.0: ./binding.cpp:707:65: warning: unused parameter ‘memory_f16’ [-Wunused-parameter]
warning: llama_cpp_rs@0.3.0: 707 | void *load_model(const char *fname, int n_ctx, int n_seed, bool memory_f16, bool mlock, bool embeddings, bool mmap, bool low_vram, bool vocab_only, int n_gpu_layers, int n_batch, const char *maingpu, const char *tensorsplit, bool numa)
warning: llama_cpp_rs@0.3.0: | ~~~~~^~~~~~~~~~
warning: llama_cpp_rs@0.3.0: ./binding.cpp:707:122: warning: unused parameter ‘low_vram’ [-Wunused-parameter]
warning: llama_cpp_rs@0.3.0: 707 | void *load_model(const char *fname, int n_ctx, int n_seed, bool memory_f16, bool mlock, bool embeddings, bool mmap, bool low_vram, bool vocab_only, int n_gpu_layers, int n_batch, const char *maingpu, const char *tensorsplit, bool numa)
warning: llama_cpp_rs@0.3.0: | ~~~~~^~~~~~~~
warning: llama_cpp_rs@0.3.0: ar: /home/piper/rust-llama/mdrokz/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-3277e23a8be05b2d/out/llama.cpp/ggml.o: No such file or directory
error: failed to run custom build command for `llama_cpp_rs v0.3.0 (/home/piper/rust-llama/mdrokz/rust-llama.cpp)`
Caused by:
process didn't exit successfully: `/home/piper/rust-llama/mdrokz/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-c8043e6c1872204c/build-script-build` (exit status: 1)
--- stdout
cargo:rerun-if-env-changed=TARGET
cargo:rerun-if-env-changed=BINDGEN_EXTRA_CLANG_ARGS_x86_64-unknown-linux-gnu
cargo:rerun-if-env-changed=BINDGEN_EXTRA_CLANG_ARGS_x86_64_unknown_linux_gnu
cargo:rerun-if-env-changed=BINDGEN_EXTRA_CLANG_ARGS
cargo:rerun-if-changed=/usr/include/clang/14.0.6/include/stdbool.h
TARGET = Some("x86_64-unknown-linux-gnu")
OPT_LEVEL = Some("0")
HOST = Some("x86_64-unknown-linux-gnu")
cargo:rerun-if-env-changed=CC_x86_64-unknown-linux-gnu
CC_x86_64-unknown-linux-gnu = None
cargo:rerun-if-env-changed=CC_x86_64_unknown_linux_gnu
CC_x86_64_unknown_linux_gnu = None
cargo:rerun-if-env-changed=HOST_CC
HOST_CC = None
cargo:rerun-if-env-changed=CC
CC = None
cargo:rerun-if-env-changed=CC_ENABLE_DEBUG_OUTPUT
cargo:warning=Compiler version doesn't include clang or GCC: "cc" "--version"
cargo:rerun-if-env-changed=CRATE_CC_NO_DEFAULTS
CRATE_CC_NO_DEFAULTS = None
DEBUG = Some("true")
CARGO_CFG_TARGET_FEATURE = Some("fxsr,sse,sse2")
cargo:rerun-if-env-changed=CFLAGS_x86_64-unknown-linux-gnu
CFLAGS_x86_64-unknown-linux-gnu = None
cargo:rerun-if-env-changed=CFLAGS_x86_64_unknown_linux_gnu
CFLAGS_x86_64_unknown_linux_gnu = None
cargo:rerun-if-env-changed=HOST_CFLAGS
HOST_CFLAGS = None
cargo:rerun-if-env-changed=CFLAGS
CFLAGS = None
cargo:warning=./llama.cpp/ggml.c:17433:13: warning: ‘ggml_opt_get_grad’ defined but not used [-Wunused-function]
cargo:warning=17433 | static void ggml_opt_get_grad(int np, struct ggml_tensor * const ps[], float * g) {
cargo:warning= | ^~~~~~~~~~~~~~~~~
cargo:warning=./llama.cpp/ggml-backend.c:836:13: warning: ‘sched_print_assignments’ defined but not used [-Wunused-function]
cargo:warning= 836 | static void sched_print_assignments(ggml_backend_sched_t sched, struct ggml_cgraph * graph) {
cargo:warning= | ^~~~~~~~~~~~~~~~~~~~~~~
cargo:warning=./llama.cpp/ggml-quants.c:1337:14: warning: ‘make_qkx1_quants’ defined but not used [-Wunused-function]
cargo:warning= 1337 | static float make_qkx1_quants(int n, int nmax, const float * restrict x, uint8_t * restrict L, float * restrict the_min,
cargo:warning= | ^~~~~~~~~~~~~~~~
cargo:rerun-if-env-changed=AR_x86_64-unknown-linux-gnu
AR_x86_64-unknown-linux-gnu = None
cargo:rerun-if-env-changed=AR_x86_64_unknown_linux_gnu
AR_x86_64_unknown_linux_gnu = None
cargo:rerun-if-env-changed=HOST_AR
HOST_AR = None
cargo:rerun-if-env-changed=AR
AR = None
cargo:rerun-if-env-changed=ARFLAGS_x86_64-unknown-linux-gnu
ARFLAGS_x86_64-unknown-linux-gnu = None
cargo:rerun-if-env-changed=ARFLAGS_x86_64_unknown_linux_gnu
ARFLAGS_x86_64_unknown_linux_gnu = None
cargo:rerun-if-env-changed=HOST_ARFLAGS
HOST_ARFLAGS = None
cargo:rerun-if-env-changed=ARFLAGS
ARFLAGS = None
cargo:rustc-link-lib=static=ggml
cargo:rustc-link-search=native=/home/piper/rust-llama/mdrokz/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-3277e23a8be05b2d/out
TARGET = Some("x86_64-unknown-linux-gnu")
OPT_LEVEL = Some("0")
HOST = Some("x86_64-unknown-linux-gnu")
cargo:rerun-if-env-changed=CXX_x86_64-unknown-linux-gnu
CXX_x86_64-unknown-linux-gnu = None
cargo:rerun-if-env-changed=CXX_x86_64_unknown_linux_gnu
CXX_x86_64_unknown_linux_gnu = None
cargo:rerun-if-env-changed=HOST_CXX
HOST_CXX = None
cargo:rerun-if-env-changed=CXX
CXX = None
cargo:rerun-if-env-changed=CC_ENABLE_DEBUG_OUTPUT
cargo:warning=Compiler version doesn't include clang or GCC: "c++" "--version"
cargo:rerun-if-env-changed=CRATE_CC_NO_DEFAULTS
CRATE_CC_NO_DEFAULTS = None
DEBUG = Some("true")
CARGO_CFG_TARGET_FEATURE = Some("fxsr,sse,sse2")
cargo:rerun-if-env-changed=CXXFLAGS_x86_64-unknown-linux-gnu
CXXFLAGS_x86_64-unknown-linux-gnu = None
cargo:rerun-if-env-changed=CXXFLAGS_x86_64_unknown_linux_gnu
CXXFLAGS_x86_64_unknown_linux_gnu = None
cargo:rerun-if-env-changed=HOST_CXXFLAGS
HOST_CXXFLAGS = None
cargo:rerun-if-env-changed=CXXFLAGS
CXXFLAGS = None
cargo:warning=./binding.cpp: In function ‘int get_embeddings(void*, void*, float*)’:
cargo:warning=./binding.cpp:81:23: warning: ‘int llama_eval(llama_context*, llama_token*, int32_t, int)’ is deprecated: use llama_decode() instead [-Wdeprecated-declarations]
cargo:warning= 81 | if (llama_eval(ctx, embd_inp.data(), embd_inp.size(), n_past))
cargo:warning= | ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cargo:warning=In file included from ./llama.cpp/common/common.h:5,
cargo:warning= from ./binding.cpp:1:
cargo:warning=./llama.cpp/llama.h:532:30: note: declared here
cargo:warning= 532 | LLAMA_API DEPRECATED(int llama_eval(
cargo:warning= | ^~~~~~~~~~
cargo:warning=./llama.cpp/llama.h:31:36: note: in definition of macro ‘DEPRECATED’
cargo:warning= 31 | # define DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
cargo:warning= | ^~~~
cargo:warning=./binding.cpp: In function ‘int eval(void*, void*, char*)’:
cargo:warning=./binding.cpp:139:22: warning: ‘int llama_eval(llama_context*, llama_token*, int32_t, int)’ is deprecated: use llama_decode() instead [-Wdeprecated-declarations]
cargo:warning= 139 | return llama_eval(ctx, tokens.data(), n_prompt_tokens, n_past);
cargo:warning= | ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cargo:warning=./llama.cpp/llama.h:532:30: note: declared here
cargo:warning= 532 | LLAMA_API DEPRECATED(int llama_eval(
cargo:warning= | ^~~~~~~~~~
cargo:warning=./llama.cpp/llama.h:31:36: note: in definition of macro ‘DEPRECATED’
cargo:warning= 31 | # define DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
cargo:warning= | ^~~~
cargo:warning=./binding.cpp: In function ‘int llama_predict(void*, void*, char**, bool)’:
cargo:warning=./binding.cpp:283:19: warning: ‘int llama_eval(llama_context*, llama_token*, int32_t, int)’ is deprecated: use llama_decode() instead [-Wdeprecated-declarations]
cargo:warning= 283 | llama_eval(ctx, tmp, 1, 0);
cargo:warning= | ~~~~~~~~~~^~~~~~~~~~~~~~~~
cargo:warning=./llama.cpp/llama.h:532:30: note: declared here
cargo:warning= 532 | LLAMA_API DEPRECATED(int llama_eval(
cargo:warning= | ^~~~~~~~~~
cargo:warning=./llama.cpp/llama.h:31:36: note: in definition of macro ‘DEPRECATED’
cargo:warning= 31 | # define DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
cargo:warning= | ^~~~
cargo:warning=./binding.cpp:354:31: warning: ‘int llama_eval(llama_context*, llama_token*, int32_t, int)’ is deprecated: use llama_decode() instead [-Wdeprecated-declarations]
cargo:warning= 354 | if (llama_eval(ctx, &embd[i], n_eval, n_past))
cargo:warning= | ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cargo:warning=./llama.cpp/llama.h:532:30: note: declared here
cargo:warning= 532 | LLAMA_API DEPRECATED(int llama_eval(
cargo:warning= | ^~~~~~~~~~
cargo:warning=./llama.cpp/llama.h:31:36: note: in definition of macro ‘DEPRECATED’
cargo:warning= 31 | # define DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
cargo:warning= | ^~~~
cargo:warning=./binding.cpp:438:49: warning: ‘void llama_sample_temperature(llama_context*, llama_token_data_array*, float)’ is deprecated: use llama_sample_temp instead [-Wdeprecated-declarations]
cargo:warning= 438 | llama_sample_temperature(ctx, &candidates_p, temp);
cargo:warning= | ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
cargo:warning=./llama.cpp/llama.h:743:31: note: declared here
cargo:warning= 743 | LLAMA_API DEPRECATED(void llama_sample_temperature(
cargo:warning= | ^~~~~~~~~~~~~~~~~~~~~~~~
cargo:warning=./llama.cpp/llama.h:31:36: note: in definition of macro ‘DEPRECATED’
cargo:warning= 31 | # define DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
cargo:warning= | ^~~~
cargo:warning=./binding.cpp:444:49: warning: ‘void llama_sample_temperature(llama_context*, llama_token_data_array*, float)’ is deprecated: use llama_sample_temp instead [-Wdeprecated-declarations]
cargo:warning= 444 | llama_sample_temperature(ctx, &candidates_p, temp);
cargo:warning= | ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
cargo:warning=./llama.cpp/llama.h:743:31: note: declared here
cargo:warning= 743 | LLAMA_API DEPRECATED(void llama_sample_temperature(
cargo:warning= | ^~~~~~~~~~~~~~~~~~~~~~~~
cargo:warning=./llama.cpp/llama.h:31:36: note: in definition of macro ‘DEPRECATED’
cargo:warning= 31 | # define DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
cargo:warning= | ^~~~
cargo:warning=./binding.cpp:454:49: warning: ‘void llama_sample_temperature(llama_context*, llama_token_data_array*, float)’ is deprecated: use llama_sample_temp instead [-Wdeprecated-declarations]
cargo:warning= 454 | llama_sample_temperature(ctx, &candidates_p, temp);
cargo:warning= | ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
cargo:warning=./llama.cpp/llama.h:743:31: note: declared here
cargo:warning= 743 | LLAMA_API DEPRECATED(void llama_sample_temperature(
cargo:warning= | ^~~~~~~~~~~~~~~~~~~~~~~~
cargo:warning=./llama.cpp/llama.h:31:36: note: in definition of macro ‘DEPRECATED’
cargo:warning= 31 | # define DEPRECATED(func, hint) func __attribute__((deprecated(hint)))
cargo:warning= | ^~~~
cargo:warning=./binding.cpp:473:42: warning: cast from type ‘const char*’ to type ‘char*’ casts away qualifiers [-Wcast-qual]
cargo:warning= 473 | if (!tokenCallback(state_pr, (char*)token_str.c_str()))
cargo:warning= | ^~~~~~~~~~~~~~~~~~~~~~~~
cargo:warning=./binding.cpp: In function ‘void* llama_allocate_params(const char*, int, int, int, int, float, float, float, int, bool, bool, int, int, const char**, int, float, float, float, float, int, float, float, bool, const char*, const char*, bool, bool, bool, const char*, const char*, bool)’:
cargo:warning=./binding.cpp:627:100: warning: unused parameter ‘ignore_eos’ [-Wunused-parameter]
cargo:warning= 627 | float top_p, float temp, float repeat_penalty, int repeat_last_n, bool ignore_eos, bool memory_f16, int n_batch, int n_keep, const char **antiprompt, int antiprompt_count,
cargo:warning= | ~~~~~^~~~~~~~~~
cargo:warning=./binding.cpp:627:117: warning: unused parameter ‘memory_f16’ [-Wunused-parameter]
cargo:warning= 627 | float top_p, float temp, float repeat_penalty, int repeat_last_n, bool ignore_eos, bool memory_f16, int n_batch, int n_keep, const char **antiprompt, int antiprompt_count,
cargo:warning= | ~~~~~^~~~~~~~~~
cargo:warning=./binding.cpp: In function ‘void* load_model(const char*, int, int, bool, bool, bool, bool, bool, bool, int, int, const char*, const char*, bool)’:
cargo:warning=./binding.cpp:707:65: warning: unused parameter ‘memory_f16’ [-Wunused-parameter]
cargo:warning= 707 | void *load_model(const char *fname, int n_ctx, int n_seed, bool memory_f16, bool mlock, bool embeddings, bool mmap, bool low_vram, bool vocab_only, int n_gpu_layers, int n_batch, const char *maingpu, const char *tensorsplit, bool numa)
cargo:warning= | ~~~~~^~~~~~~~~~
cargo:warning=./binding.cpp:707:122: warning: unused parameter ‘low_vram’ [-Wunused-parameter]
cargo:warning= 707 | void *load_model(const char *fname, int n_ctx, int n_seed, bool memory_f16, bool mlock, bool embeddings, bool mmap, bool low_vram, bool vocab_only, int n_gpu_layers, int n_batch, const char *maingpu, const char *tensorsplit, bool numa)
cargo:warning= | ~~~~~^~~~~~~~
cargo:rerun-if-env-changed=AR_x86_64-unknown-linux-gnu
AR_x86_64-unknown-linux-gnu = None
cargo:rerun-if-env-changed=AR_x86_64_unknown_linux_gnu
AR_x86_64_unknown_linux_gnu = None
cargo:rerun-if-env-changed=HOST_AR
HOST_AR = None
cargo:rerun-if-env-changed=AR
AR = None
cargo:rerun-if-env-changed=ARFLAGS_x86_64-unknown-linux-gnu
ARFLAGS_x86_64-unknown-linux-gnu = None
cargo:rerun-if-env-changed=ARFLAGS_x86_64_unknown_linux_gnu
ARFLAGS_x86_64_unknown_linux_gnu = None
cargo:rerun-if-env-changed=HOST_ARFLAGS
HOST_ARFLAGS = None
cargo:rerun-if-env-changed=ARFLAGS
ARFLAGS = None
cargo:warning=ar: /home/piper/rust-llama/mdrokz/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-3277e23a8be05b2d/out/llama.cpp/ggml.o: No such file or directory
--- stderr
error occurred: Command ZERO_AR_DATE="1" "ar" "cq" "/home/piper/rust-llama/mdrokz/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-3277e23a8be05b2d/out/libbinding.a" "/home/piper/rust-llama/mdrokz/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-3277e23a8be05b2d/out/073db387043af495-common.o" "/home/piper/rust-llama/mdrokz/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-3277e23a8be05b2d/out/30b5508d68fcb5a8-llama.o" "/home/piper/rust-llama/mdrokz/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-3277e23a8be05b2d/out/8f1a5a601f45df90-binding.o" "/home/piper/rust-llama/mdrokz/rust-llama.cpp/examples/basic/target/debug/build/llama_cpp_rs-3277e23a8be05b2d/out/llama.cpp/ggml.o" with args "ar" did not execute successfully (status code exit status: 1).
When creating the object file with `compile_metal`, the object file is `<some_hash_value>-ggml-metal.o`, not `ggml-metal.o`. It looks like cc-rs does this in `objects_from_files` to avoid overwriting objects (in general) when files in subdirectories would produce objects with the same name. Since we explicitly link against the ggml-{metal,cuda} etc. object files, we have to be more general when finding the object files. I've updated `compile_metal` such that it specifies the directory (not the ggml-metal.h header specifically).
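Rather than hard-coding either layout, one option would be to glob for each object's stem, so both `ggml.o` and `<hash>-ggml.o` are found. A sketch, assuming `glob` (already in the dependency tree via bindgen) is declared as a build-dependency; the stems and the printing in `main` are illustrative only:

```rust
use std::env;
use std::path::PathBuf;

// Find an object file by stem under OUT_DIR, whether cc-rs wrote it as
// `<stem>.o` in a subdirectory or as `<hash>-<stem>.o` at the top level.
fn find_object(stem: &str) -> Option<PathBuf> {
    let out_dir = PathBuf::from(env::var("OUT_DIR").expect("OUT_DIR not set"));
    // Check the top level and any subdirectory (e.g. out/llama.cpp/).
    let patterns = [
        format!("{}/*{stem}.o", out_dir.display()),
        format!("{}/**/*{stem}.o", out_dir.display()),
    ];
    patterns
        .iter()
        .flat_map(|p| glob::glob(p).expect("invalid glob pattern"))
        .filter_map(Result::ok)
        .next()
}

fn main() {
    for stem in ["ggml", "ggml-alloc", "ggml-backend", "ggml-quants", "llama"] {
        match find_object(stem) {
            Some(path) => println!("cargo:warning=resolved {stem} -> {}", path.display()),
            None => println!("cargo:warning=no object file matching {stem}"),
        }
    }
}
```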