-
### Cortex version
Jan v0.5.4
### Describe the Bug
https://discord.com/channels/1107178041848909847/1296496734901375146
Hi, when I try to use my AMD GPU with Vulkan, I get the failed to load mo…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
The latest LLaMA-Factory repo (12 Sept 2024) forces Torch 2.4, which clashes with Unsloth/XFormers.
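A common workaround for this kind of clash is to pin torch to the version the prebuilt XFormers/Unsloth wheels were compiled against. The pins below are hypothetical placeholders, not verified compatible versions; check the xformers release notes for the exact torch build each wheel targets.

```
# requirements.txt sketch — hypothetical pins, verify against xformers release notes
torch==2.3.*        # hold torch back from the 2.4 bump
xformers==0.0.26.*  # pick the wheel built against your torch version
```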
##…
-
I am building a RAG application and the following error is blocking my deployment.
Error message received:
```
react-dom.development.js:26793 Uncaught (in promise) Error: Maximum update depth ex…
-
Hello everyone, I have a problem and would like to ask for help. After I compile and run the inference code run.py, if I set max_output_len to a small value, the output will be truncated before it is …
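The behavior described above follows from how a length cap interacts with the end-of-sequence token: decoding stops at whichever comes first. The toy loop below is a hypothetical sketch of that interaction, not the actual TensorRT-LLM run.py logic.

```python
# Toy decode loop: generation stops at whichever comes first,
# the EOS token or the max_output_len cap.
# (Illustrative sketch; not the real TensorRT-LLM implementation.)

EOS = -1

def decode(next_token_fn, max_output_len):
    tokens = []
    for _ in range(max_output_len):
        tok = next_token_fn(len(tokens))
        if tok == EOS:          # model finished naturally
            break
        tokens.append(tok)
    return tokens

# A "model" that wants to emit 10 tokens and then EOS:
fake_model = lambda i: i if i < 10 else EOS

print(len(decode(fake_model, 4)))    # 4  -> truncated mid-sequence
print(len(decode(fake_model, 64)))   # 10 -> stopped at EOS
```

With a cap of 4 the sequence is cut off mid-thought; with a generous cap the model stops on its own at EOS.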
-
### Description
I am using the llama3.1 model with the `@ai-sdk/openai` package.
I get this error:
```
{"object":"error","message":"[{'type': 'value_error', 'loc': ('body',), 'msg': 'Value error, Curren…
-
in the compress.py file
```
generated_tokens = model.generate(
max_new_tokens=max_new_tokens,
num_return_sequences=3, # Fixed to generate 3 samples
do_sample=True,
…
-
Hi there, I am using an 8Gen3 (Xiaomi 14 Pro, 68 GB/s memory bandwidth) and following Option 1 ("Use Prebuilt Kernels") of the Android cross-compilation guide to test llama-2-7b-4bit token generation performance.
it…
-
In the "The Llama 3 Herd of Models" paper, the FFN dimensions for the 8B, 70B, and 405B models are stated as 6,144, 12,288, and 20,480. I would have expected the parameter count to stay the same as llama 3 w…
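A back-of-the-envelope count makes the discrepancy concrete. The sketch below assumes the commonly published Llama 3 8B shapes (32 layers, model dimension 4096, 32 query / 8 KV heads, vocab 128,256, untied embeddings) and a SwiGLU FFN with three weight matrices; it is a rough estimate that ignores norms and biases.

```python
# Rough parameter-count estimate for a Llama-3-style transformer.
# Assumed 8B shapes: 32 layers, d_model=4096, 32 Q heads / 8 KV heads,
# vocab 128256, separate input/output embeddings; SwiGLU FFN = 3 matrices.

def param_count(n_layers, d_model, n_heads, n_kv_heads, d_ffn, vocab):
    head_dim = d_model // n_heads
    attn = d_model * (n_heads * head_dim           # Q projection
                      + 2 * n_kv_heads * head_dim  # K and V projections
                      + n_heads * head_dim)        # output projection
    ffn = 3 * d_model * d_ffn                      # gate, up, down matrices
    emb = 2 * vocab * d_model                      # input + output embeddings
    return n_layers * (attn + ffn) + emb

# With the widely used FFN dimension of 14,336 the total lands near 8B;
# with 6,144 (the figure in the paper's table) it is far smaller.
print(param_count(32, 4096, 32, 8, 14336, 128256))  # ~8.0e9
print(param_count(32, 4096, 32, 8, 6144, 128256))   # ~4.8e9
```

So an FFN dimension of 6,144 cannot reproduce an ~8B parameter count under these shapes, which supports the suspicion that the table entries do not match the released weights.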
-
```
(base) C:\Users\m>pip install llama-cpp-python
Collecting llama-cpp-python
  Using cached llama_cpp_python-0.2.85.tar.gz (49.3 MB)
  Installing build dependencies ... done
  Getting requirements t…
```
-
I am following this tutorial to use function calling with qwen2:0.5b https://github.com/abetlen/llama-cpp-python/blob/main/examples/notebooks/Functions.ipynb
I used this command to start a serve ins…
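For reference, the OpenAI-style `tools` payload that function-calling requests carry looks roughly like the sketch below. The `get_current_weather` name and its fields are illustrative placeholders, not taken from the linked notebook.

```python
# Minimal OpenAI-style function-calling tool schema (illustrative names).
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",   # hypothetical example tool
        "description": "Get the current weather for a city",
        "parameters": {                  # JSON Schema for the arguments
            "type": "object",
            "properties": {
                "city": {"type": "string"},
            },
            "required": ["city"],
        },
    },
}]

print(tools[0]["function"]["name"])  # get_current_weather
```

This list is what gets passed as the `tools` field of a chat-completions request; the server then decides whether to emit a tool call instead of plain text.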