-
### What is the issue?
qwen2:72b-instruct-q4_K_M produces garbage output:
```
>>> hello.
#:G*:^C
```
Other models in other quantizations work correctly.
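For reference, the failing case can be reproduced with the model tag from the report, assuming a stock Ollama install:

```
ollama run qwen2:72b-instruct-q4_K_M
```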
Ollama server output:
```
$ ./ol…
```
-
# Platform (include the cross-compilation target platform if cross-compiling): orin-cpu (supports the sdot instruction)
# Reproduction probability: high
# Github Version: MNN tag 2.9.1
# Compiling Method:
…
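The compiling-method field is truncated above. For context, a generic native MNN build follows the standard CMake flow from the MNN repo; this is a sketch, not the reporter's exact command:

```
git clone https://github.com/alibaba/MNN.git
cd MNN && mkdir build && cd build
cmake ..          # add a toolchain file here when cross-compiling for orin
make -j$(nproc)
```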
-
## Type of issue
- I conducted some benchmarks on an Intel Core Ultra 7 155H about 3 months ago using this release: [b2568](https://github.com/ggerganov/llama.cpp/releases/tag/b2568), and these are th…
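Numbers like these are typically gathered with llama.cpp's bundled `llama-bench` tool; a minimal sketch, assuming a local build and an illustrative GGUF path:

```
# prompt-processing (-p) and token-generation (-n) throughput for one model file
./llama-bench -m models/model-q4_K_M.gguf -p 512 -n 128
```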
-
### What is the issue?
Ollama fails to run on the GPU and falls back to the CPU. If I force it using `HSA_OVERRIDE_GFX_VERSION=9.0.0`, then I get `Error: llama runner process has terminated: signal: abo…
-
### System Info
- CPU architecture: x86_64
- CPU memory: 110GB
- GPU properties:
  - GPU Name: NVIDIA A100 80GB PCIe
- Libraries:
  - tensorrt-llm==0.11.0.dev2024060400
  - CUDA Ver…
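A quick way to confirm the GPU properties listed above, assuming the NVIDIA driver (which ships `nvidia-smi`) is installed:

```
nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv
```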
-
`ollama` uses `llama.cpp` under the hood.
`ollama` can be tricked into using the GPU, but loading the model takes forever.
Procedure:
- Upgrade to ROCm v6
- `export HSA_OVERRIDE_GFX_VERSION=9.0.0`
Lo…
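A minimal sketch of the workaround described above, assuming a bash shell and a ROCm v6 install; the model name is illustrative:

```
# Pretend the GPU is gfx900 so the ROCm runtime accepts it (unsupported override)
export HSA_OVERRIDE_GFX_VERSION=9.0.0

# Restart the server so it picks up the override, then run any model
ollama serve &
ollama run llama3
```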
-
All I need is to run llama3 via Ollama on an Intel GPU (Arc™ A750). I followed the steps described in the IPEX-LLM documentation, but it runs on the CPU, and search engines turn up no solution to the problem.…
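For reference, the IPEX-LLM quickstart for Ollama amounts to sourcing the oneAPI environment and setting a few variables before starting the server. A rough sketch, assuming a Linux oneAPI install; the exact variable names should be checked against the current IPEX-LLM docs:

```
source /opt/intel/oneapi/setvars.sh   # oneAPI runtime for the Arc GPU
export OLLAMA_NUM_GPU=999             # ask Ollama to offload all layers to the GPU
export ZES_ENABLE_SYSMAN=1
./ollama serve
```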
-
### What is the issue?
I tried a 1xH100 box and got an error during installation. I got the same output from another, bigger 2xH100 box too:
```
root@C.11391672:~$ curl -fsSL https://ollama.com/instal…
```
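The truncated command above is presumably the standard one-liner from the Ollama README:

```
curl -fsSL https://ollama.com/install.sh | sh
```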
-
### Describe the issue
I am trying to replicate the following: [https://intel.github.io/intel-extension-for-pytorch/llm/llama3/xpu/](https://intel.github.io/intel-extension-for-pytorch/llm/llama3/xpu/). While running the `python run_generation_gpu_woq_for_llama…
-
Historical "what the fuck" is available at https://github.com/JabRef/jabref/pull/11430#issuecomment-2209278098
![image](https://github.com/InAnYan/jabref/assets/73715071/070a8ba1-f3c8-4bbd-b16f-172…