-
Given a Dockerfile for my application:
```dockerfile
FROM ollama/ollama
RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
git \
python3 \
python3-…
-
### Version
Visual Studio Code extension
### Suggestion
Please add support for open LLMs compatible with the endpoint APIs of LLM Studio / ollama / etc.
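For context, a request to such an endpoint would look roughly like this (a minimal sketch, assuming an OpenAI-compatible chat-completions route like the one ollama serves locally on port 11434; the model name is an illustrative placeholder, not something from this request):

```python
import json
import urllib.request

# Build an OpenAI-style chat-completions payload. "llama2" is a
# placeholder; any model pulled into the local server would work.
payload = {
    "model": "llama2",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/v1/chat/completions",  # ollama's OpenAI-compatible route
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # needs a running local server, so not executed here
print(payload["model"])
```

An extension supporting this would only need the base URL and model name to be configurable, since the request shape is the same across these local servers.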
-
## 🐛 Bug
I tried to compile CodeLlama-7b-hf for WebGPU with q4f32 quantization. However, it failed with `wasm-ld: error: initial memory too small, 269728288 bytes needed`.
## To Reproduce
S…
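As a possible direction (a sketch, not a verified fix for this project's build scripts): wasm linear memory grows in 64 KiB pages, so whatever setting controls initial memory — LLVM's wasm-ld exposes `--initial-memory` — needs a value at or above the reported requirement, rounded up to a page boundary:

```python
# Round the linker-reported requirement up to the next 64 KiB wasm page.
# The --initial-memory flag below is wasm-ld's; whether this project's
# build exposes it directly is an assumption.
WASM_PAGE = 64 * 1024       # wasm32 linear-memory page size
needed = 269_728_288        # bytes, from the wasm-ld error message

initial_memory = -(-needed // WASM_PAGE) * WASM_PAGE  # ceiling division
print(initial_memory)       # e.g. wasm-ld --initial-memory=269746176
```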
-
### Describe the issue as clearly as possible:
Occasionally, when I use outlines, it returns a string that is not valid JSON. This happens most often when it generates an invalid escape character…
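The parser-level failure can be reproduced without the library (a minimal sketch; the actual strings outlines produces are not shown in the truncated report): `\x` is not a legal JSON escape, so the standard parser rejects it.

```python
import json

# JSON text containing \x, which is not a valid JSON escape sequence.
bad = '{"msg": "tab\\x09here"}'

try:
    json.loads(bad)
    parsed = True
except json.JSONDecodeError as exc:
    parsed = False
    print(f"rejected: {exc.msg}")

# One possible client-side mitigation: escape the stray backslash first.
repaired = bad.replace("\\x", "\\\\x")
print(json.loads(repaired)["msg"])
```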
-
I tried to start the serving on my system, but it failed with the error below:
(emon_analyzer) [root@SPR-1 emon_data_analyzer]# neuralchat_server start --config_file ./config/neuralchat.yaml
2024-03-1…
-
It's a Qwen base model downloaded from HF. It can run inference with llama.cpp (latest version), but cannot run inference on the latest version of neural-speed.
run_qwen shows the error:
```
Loading the bin file with…
-
Error in Colab when trying to run GGUF models:
llm = LLM(f"TheBloke/neural-chat-7B-v3-3-GGUF/neural-chat-7b-v3-3.Q4_0.gguf")
`template="anything"` causes an error in many models.
Error details:
[/…
-
I've downloaded TheBloke/Llama-2-7B-Chat-GGUF from Hugging Face and used `git lfs pull` to download all the GGUF files.
Since my access to Meta/Llama 2 has not been approved yet, I chose to use KoboldAI/llama2-toke…
-
Hi, I tried to deploy Llama 2 today and found this issue:
(llama) [root@iZbp1iobggdz6jrvvlgpx4Z llama]# torchrun --nproc_per_node 1 example_text_completion.py --ckpt_dir llama-2-7b/ --token…
-
Configuration:
```
Edition Windows 11 Home
Version 23H2
Installed on 2/29/2024
OS build 22631.3737
Experience Windows Feature Experience Pack 1000.22700.1009.0
```
Hardware:
```
De…