-
Darwin Feedloops-Mac-Studio-2.local 23.3.0 Darwin Kernel Version 23.3.0: Wed Dec 20 21:31:00 PST 2023; root:xnu-10002.81.5~7/RELEASE_ARM64_T6020 arm64
command: python -m llama_cpp.server --model ./…
-
Hey @ericcurtin, I was testing out the new changes and noticed that the ramalama container file on quay.io needs to be updated to include llama-simple-chat.
Everything worked when I built the container from …
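For anyone rebuilding the image locally, here is a rough sketch of the step that would put the missing binary in place (the clone location and install path are assumptions, not the actual Containerfile contents):

```bash
# Sketch only: build llama.cpp's simple-chat example and install it.
git clone https://github.com/ggerganov/llama.cpp
cmake -B llama.cpp/build -S llama.cpp
cmake --build llama.cpp/build --target llama-simple-chat
cp llama.cpp/build/bin/llama-simple-chat /usr/local/bin/
```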
-
Hello
I am running on the following machine:
CPU: 12th Gen Intel(R) Core(TM) i7-12700
RAM: 32GB, speed: 4400MT/s
GPU: NVIDIA RTX A2000 12GB
The model is:
llama-2-7b-chat.Q6_K.gguf
And it takes a…
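If the complaint is speed, the usual first check is whether any layers are being offloaded to the GPU. A minimal sketch, assuming a recent CUDA-enabled llama.cpp build (the flag values are illustrative):

```bash
# -ngl offloads layers to the GPU; a 7B Q6_K (~5.5 GB) fits entirely
# in the A2000's 12 GB, so all layers can be offloaded.
./llama-cli -m llama-2-7b-chat.Q6_K.gguf -ngl 99 -p "Hello" -n 128
```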
-
I want to use AWQ to quantize a model, then use llama.cpp's convert script to produce a GGUF. I followed the tutorial but got an error:
Traceback (most recent call last):
File "/root/ld/ld_project/llama.cpp/convert_m…
-
CMake Error at CMakeLists.txt:16 (add_subdirectory):
add_subdirectory given source "./llama.cpp" which is not an existing
directory.
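For reference, add_subdirectory resolves its argument relative to the CMakeLists.txt that calls it, so the directory must actually exist there. A minimal sketch of a layout that works, with the project and target names hypothetical:

```cmake
cmake_minimum_required(VERSION 3.14)
project(my_app)  # hypothetical project name

# Assumes llama.cpp has been cloned next to this CMakeLists.txt,
# e.g. as a submodule:
#   git submodule add https://github.com/ggerganov/llama.cpp llama.cpp
add_subdirectory(llama.cpp)

add_executable(my_app main.cpp)  # hypothetical target
target_link_libraries(my_app PRIVATE llama)
```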
-
### Describe the bug
Whenever I load certain GGUFs, I get the above error message in the terminal. I have seen it happen with Bartowski's Q8 quant of Llama 3 70B Instruct (a 3-part file) and llama-3-70B-…
-
### Feature request
I want to add the ability to use GGUF BERT models in transformers.
Currently the library does not support this architecture; when I try to load one, I get the error TypeError: Ar…
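For reference, a sketch of what this would enable, using the gguf_file argument that transformers already exposes for supported architectures (the repo and file names below are hypothetical):

```python
from transformers import AutoModel, AutoTokenizer

# Hypothetical GGUF BERT checkpoint; today this raises the TypeError
# above because the BERT architecture is not in the GGUF loader.
repo_id = "some-user/bert-base-uncased-gguf"
filename = "bert-base-uncased.Q8_0.gguf"

tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=filename)
model = AutoModel.from_pretrained(repo_id, gguf_file=filename)
```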
-
I came across a model on Hugging Face, [Bunny-Llama-3-8B-V: bunny-llama](https://huggingface.co/BAAI/Bunny-Llama-3-8B-V), that adds multimodal support to Llama 3, and I'd like to be able to deploy it using lla…
-
# Expected Behavior
The server should cache both the previous prompt and the last generation.
# Current Behavior
The cache misses at the end of the previous prompt, forcing the server to evaluate the pr…
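For context, reproduction goes through the server's /completion endpoint with prompt caching requested. A minimal sketch, assuming a llama.cpp server listening on localhost:8080:

```python
import requests

# cache_prompt asks the server to reuse the KV cache for the shared
# prompt prefix instead of re-evaluating it on every request.
resp = requests.post(
    "http://localhost:8080/completion",
    json={"prompt": "Once upon a time", "n_predict": 64, "cache_prompt": True},
)
print(resp.json()["content"])
```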
-
Hello, following the tutorial I ran yarn dev and got the following error:
⨯ ./node_modules/@kwsites/file-exists/dist/src/index.js:6:13
Module not found: Can't resolve 'fs'
https://nextjs.org/docs/messages/module-not-found
Import trace f…
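A common workaround sketch for this class of error, assuming the package that pulls in fs is only needed server-side: tell webpack not to resolve Node core modules in the client bundle via next.config.js:

```js
// next.config.js — a sketch, not a verified fix for this exact setup.
module.exports = {
  webpack: (config, { isServer }) => {
    if (!isServer) {
      // 'fs' only exists in Node; stub it out of the client bundle.
      config.resolve.fallback = { ...config.resolve.fallback, fs: false };
    }
    return config;
  },
};
```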