-
I deployed CodeLlama on Triton, but there are no spaces between words in stream mode:
```
python openai_completion_stream.py
beyondthecallofdutytosavehisfellowsoldiersafterthey�sufferedaho…
```
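For reference, the client side of a completions stream just concatenates each chunk's text verbatim, so any inter-word spaces have to arrive inside the deltas themselves. A minimal sketch of the symptom (the chunk contents are hypothetical):

```python
# Streamed completion chunks are joined verbatim by the client;
# if the server's detokenizer drops the leading space each token
# normally carries, the words run together.
chunks_ok = ["beyond", " the", " call", " of", " duty"]
chunks_bad = ["beyond", "the", "call", "of", "duty"]  # spaces lost server-side

print("".join(chunks_ok))   # beyond the call of duty
print("".join(chunks_bad))  # beyondthecallofduty
```

If the joined output looks like the second line, the problem is in how the server converts token IDs back to text for each streamed chunk, not in the client script.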
-
In my CodeLlama experiment I used CodeLlama-7b-hf, and the dataset I selected was HumanEval.
I got 'pass_1': 0.2557317073170731, while the paper reports 'pass_1': 0.335, and then my parameter…
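For comparison, HumanEval's pass@1 is usually computed with the unbiased estimator 1 - C(n-c, k)/C(n, k), where n is the number of samples per task and c the number that pass the tests. A self-contained sketch (the function name is mine):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k).

    n -- samples generated per task
    c -- samples that pass the task's tests
    """
    if n - c < k:
        return 1.0  # every size-k draw contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# With k=1 this reduces to the passing fraction c/n, averaged over tasks.
print(round(pass_at_k(10, 3, 1), 6))  # 0.3
```

A gap versus the reported 0.335 often comes from generation settings (temperature, top-p, max new tokens, prompt format) rather than from the metric itself.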
-
I have been using Xinference to deploy the Qwen1.5-14b 4-bit quantized model. On startup the model occupies about 13 GB of VRAM; after a few rounds of dialogue it rises to about 22 GB, and clearing the conversation brings the usage back down. Today I updated to version 10.1: with the same model, VRAM usage jumps to 20 GB right at startup; after 2-3 rounds of dialogue connected to a knowledge base the VRAM overflows, and clearing the conversation no longer reduces the usage.
-
Add something like this:
https://github.com/cocogitto/cocogitto/blob/72e1f8624939a089414aa28418bd2249972fac23/Cargo.toml#L61
-
It returns `Error: special tags are not allowed as part of the prompt.` as an error. No settings were adjusted; this is a completely fresh instance of VS Code and the Continue extension.
-
I'm encountering a `KeyError` when trying to train Phi-3 using the unsloth library. The error occurs during the generation step with `model.generate`. Below are the details of the code and the error trace…
-
Hi there,
I was just trying to run Ollama on Windows, but the API somehow does not work.
![image](https://github.com/ollama/ollama/assets/25198837/e752df03-3200-4023-a9b6-05e95d91c8be)
![image](h…
-
### Is there an existing issue for the same bug?
- [X] I have checked the troubleshooting document at https://opendevin.github.io/OpenDevin/modules/usage/troubleshooting
- [X] I have checked the exis…
-
I just tested launching LLMs using only the CPU; however, only 4 CPUs of the VMware VM are busy at 100%, while the others stay at 0%.
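One thing worth checking in a case like this is how many cores the process is actually allowed to use, and whether the backend's thread count was capped (e.g. via `OMP_NUM_THREADS`). A small diagnostic sketch, not specific to any one inference server:

```python
import os

logical = os.cpu_count()  # logical CPUs the VM exposes
try:
    # CPUs this particular process is allowed to run on (Linux only)
    usable = len(os.sched_getaffinity(0))
except AttributeError:
    usable = logical

print(f"logical CPUs: {logical}, usable by this process: {usable}")
print("OMP_NUM_THREADS =", os.environ.get("OMP_NUM_THREADS"))
```

If `usable` comes back as 4 while the VM exposes more, the process was pinned to a subset of cores; if `OMP_NUM_THREADS` is set to 4, many math backends will only spin up 4 worker threads regardless of core count.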
-
**Describe the bug**
A clear and concise description of what the bug is.
**Information about your version**
Please provide output of `tabby --version`
**Information about your GPU**
Please pr…