Mozilla-Ocho / llamafile

Distribute and run LLMs with a single file.
https://llamafile.ai

Unable to run on Mac #390

Closed · fxcl closed this 5 months ago

fxcl commented 5 months ago

→ ./llamafile-0.8.1 -ngl 9999 19:57:05

import_cuda_impl: initializing gpu module...
extracting /zip/llama.cpp/ggml.h to /var/folders/fs/36zzfymj6vl2ysh42yrzqmt00000gn/T//.llamafile/ggml.h
extracting /zip/llamafile/compcap.cu to /var/folders/fs/36zzfymj6vl2ysh42yrzqmt00000gn/T//.llamafile/compcap.cu
extracting /zip/llamafile/llamafile.h to /var/folders/fs/36zzfymj6vl2ysh42yrzqmt00000gn/T//.llamafile/llamafile.h
extracting /zip/llamafile/tinyblas.h to /var/folders/fs/36zzfymj6vl2ysh42yrzqmt00000gn/T//.llamafile/tinyblas.h
extracting /zip/llamafile/tinyblas.cu to /var/folders/fs/36zzfymj6vl2ysh42yrzqmt00000gn/T//.llamafile/tinyblas.cu
extracting /zip/llama.cpp/ggml-impl.h to /var/folders/fs/36zzfymj6vl2ysh42yrzqmt00000gn/T//.llamafile/ggml-impl.h
extracting /zip/llama.cpp/ggml-cuda.h to /var/folders/fs/36zzfymj6vl2ysh42yrzqmt00000gn/T//.llamafile/ggml-cuda.h
extracting /zip/llama.cpp/ggml-alloc.h to /var/folders/fs/36zzfymj6vl2ysh42yrzqmt00000gn/T//.llamafile/ggml-alloc.h
extracting /zip/llama.cpp/ggml-common.h to /var/folders/fs/36zzfymj6vl2ysh42yrzqmt00000gn/T//.llamafile/ggml-common.h
extracting /zip/llama.cpp/ggml-backend.h to /var/folders/fs/36zzfymj6vl2ysh42yrzqmt00000gn/T//.llamafile/ggml-backend.h
extracting /zip/llama.cpp/ggml-backend-impl.h to /var/folders/fs/36zzfymj6vl2ysh42yrzqmt00000gn/T//.llamafile/ggml-backend-impl.h
extracting /zip/llama.cpp/ggml-cuda.cu to /var/folders/fs/36zzfymj6vl2ysh42yrzqmt00000gn/T//.llamafile/ggml-cuda.cu
get_rocm_bin_path: note: amdclang++ not found on $PATH
get_rocm_bin_path: note: $HIP_PATH/bin/amdclang++ does not exist
get_rocm_bin_path: note: /opt/rocm/bin/amdclang++ does not exist
get_rocm_bin_path: note: hipInfo not found on $PATH
get_rocm_bin_path: note: $HIP_PATH/bin/hipInfo does not exist
get_rocm_bin_path: note: /opt/rocm/bin/hipInfo does not exist
get_rocm_bin_path: note: rocminfo not found on $PATH
get_rocm_bin_path: note: $HIP_PATH/bin/rocminfo does not exist
get_rocm_bin_path: note: /opt/rocm/bin/rocminfo does not exist
get_amd_offload_arch_flag: warning: can't find hipInfo/rocminfo commands for AMD GPU detection
llamafile_log_command: hipcc -O3 -fPIC -shared -DNDEBUG --offload-arch=native -march=native -mtune=native -DGGML_BUILD=1 -DGGML_SHARED=1 -Wno-return-type -Wno-unused-result -DGGML_USE_HIPBLAS -DGGML_CUDA_MMV_Y=1 -DGGML_MULTIPLATFORM -DGGML_CUDA_DMMV_X=32 -DIGNORE4 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DIGNORE -o /var/folders/fs/36zzfymj6vl2ysh42yrzqmt00000gn/T//.llamafile/ggml-rocm.dylib.2rk6fs /var/folders/fs/36zzfymj6vl2ysh42yrzqmt00000gn/T//.llamafile/ggml-cuda.cu -lhipblas -lrocblas
hipcc: No such file or directory
extract_cuda_dso: note: prebuilt binary /zip/ggml-rocm.dylib not found
get_nvcc_path: note: nvcc not found on $PATH
get_nvcc_path: note: $CUDA_PATH/bin/nvcc does not exist
get_nvcc_path: note: /opt/cuda/bin/nvcc does not exist
get_nvcc_path: note: /usr/local/cuda/bin/nvcc does not exist
extract_cuda_dso: note: prebuilt binary /zip/ggml-cuda.dylib not found
{"function":"server_params_parse","level":"WARN","line":2405,"msg":"Not compiled with GPU offload support, --n-gpu-layers option will be ignored. See main README.md for information on enabling GPU BLAS support","n_gpu_layers":-1,"tid":"9442720","timestamp":1714651038}
note: if you have an AMD or NVIDIA GPU then you need to pass -ngl 9999 to enable GPU offloading
{"build":1500,"commit":"a30b324","function":"server_cli","level":"INFO","line":2858,"msg":"build info","tid":"9442720","timestamp":1714651038}
{"function":"server_cli","level":"INFO","line":2861,"msg":"system info","n_threads":6,"n_threads_batch":-1,"system_info":"AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LAMMAFILE = 1 | ","tid":"9442720","timestamp":1714651038,"total_threads":12}
/Users/kelvin/Downloads/llamafile-0.8.1: error: no models/7B/ggml-model-f16.gguf file found in zip archive
llama_model_load: error loading model: failed to open models/7B/ggml-model-f16.gguf: No such file or directory
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'models/7B/ggml-model-f16.gguf'
{"function":"load_model","level":"ERR","line":447,"model":"models/7B/ggml-model-f16.gguf","msg":"unable to load model","tid":"9442720","timestamp":1714651038}
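Two different things are happening in this log. The ROCm/CUDA probing notes (missing amdclang++, hipInfo, rocminfo, hipcc, nvcc) are expected on a Mac with no AMD or NVIDIA toolchain installed and are not the failure. The fatal part is the last few lines: llamafile-0.8.1 is the bare runtime with no model weights embedded in its zip archive, so it falls back to llama.cpp's default path models/7B/ggml-model-f16.gguf and fails to find it. A minimal sketch of a working invocation, assuming a GGUF model has already been downloaded (the model filename below is a placeholder, not something from the original report):

# placeholder model path: substitute any .gguf file you have on disk
chmod +x ./llamafile-0.8.1
./llamafile-0.8.1 -m ./mistral-7b-instruct-v0.2.Q4_0.gguf -ngl 9999

Alternatively, running one of the prebuilt llamafiles that ship with a model bundled inside (linked from the project README) avoids the -m flag entirely.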