-
### What is the issue?
This seems to be an issue with the KV cache on NVIDIA/AMD GPUs. See https://github.com/ggerganov/llama.cpp/issues/2838
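For context, the KV cache grows linearly with context length, which is why it is a common source of GPU memory issues. A rough back-of-the-envelope estimate (a sketch only, assuming standard multi-head attention and fp16 storage; the model dimensions below are illustrative, Llama-7B-style values, not taken from this issue):

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch=1, dtype_bytes=2):
    """Estimate KV-cache size: 2 tensors (K and V) per layer, each of shape
    [batch, seq_len, n_kv_heads, head_dim], at dtype_bytes per element."""
    return 2 * n_layers * batch * seq_len * n_kv_heads * head_dim * dtype_bytes

# Illustrative Llama-7B-style dimensions (an assumption, not from the issue):
size = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=2048)
print(f"{size / 2**30:.2f} GiB")  # 1.00 GiB for a single 2048-token fp16 sequence
```

At these dimensions the cache alone consumes 1 GiB per 2048-token sequence, before weights and activations are counted.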
-
`rocm-smi` works fine.
The following was run on a 4x GPU System:
```
$ docker run -it --rm --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 16G rocm/rocm-terminal:lat…
-
We refer to https://github.com/ROCm/flash-attention to install flash_attn with ROCm support (the current highest version is 2.0.4). When we need to do long context inference (using LongLoRA), s…
-
I am trying to install Ginkgo on Ubuntu 22.04. I have an up-to-date default installation of AMD ROCm 6.1.1, which works fine. The Ginkgo installation process using CMake, as described on the webpage, do…
-
It always complains with a link error when running `cargo test`:
/usr/bin/ld: /opt/rocm/lib//libamdhip64.so: undefined reference to `hsa_amd_memory_copy_engine_status@ROCR_1'
/usr/bin/ld: /opt/rocm/…
-
### Bug description
```
ROCm libraries detected - building dynamic ROCm library
+ '[' -f '/usr/lib/rocm/lib/librocblas.so.*.*.?????' ']'
+ init_vars
+ case "${GOARCH}" in
+ ARCH=x86_64
+ LLAM…
-
The `cuda_lt.sh` script contains a `--offload-arch=native` flag for amdclang:
https://github.com/openucx/ucc/blob/c1734db1b2bc9ffeba5d17b3e81e1a9425dee100/cuda_lt.sh#L31
This should select the n…
-
Can we please have ROCm 6.0?
-
Example of command:
```
python benchmark_throughput.py --model gpt2 --input-len 256 --output-len 256
```
Output:
```
Namespace(backend='vllm', dataset=None, input_len=256, output_len=256, model='gpt…
-
I have been informed that while Flash Attention is there, it is not being used -
https://github.com/oobabooga/text-generation-webui/issues/3759#issuecomment-2031180332
The post has a link to what has …