-
Since generation speed almost matches llama.cpp after https://github.com/EricLBuehler/mistral.rs/pull/152, I think it's worth trying to optimize prompt processing now.
-
Hello! Thanks for open-sourcing the repository! I'm learning to write CUDA code, and I have learned a lot here.
I have two questions.
1. In your paper 5.3, you mentioned the cuda kernels are diff…
-
### Your current environment
H100 (but I believe it happens on any machine)
### 🐛 Describe the bug
```
--enable-chunked-prefill --max-num-batched-tokens 2048 --kv-cache-dtype "fp8"
```
S…
-
The error message is below. The environment is mindspore 2.10, mindformers 1.10 (dev branch), GPU, NVIDIA A6000.
Traceback (most recent call last):
File "/home/hz04/mindrlhf/train.py", line 103, in <module>
run_rlhf(args)
File…
zhz44 updated 7 months ago
-
**Please acknowledge the following before creating a ticket**
- [Yes] I have read the GitHub issues section of [REPORTING-BUGS](../blob/master/REPORTING-BUGS).
**Description of the bug:**
I run…
-
Hi, I'm using optimum-benchmark with the onnxruntime backend (CPU), but for larger batch sizes I get negative values, which doesn't seem correct. Is this a known bug?
This is the param setup:
```…
-
Hi, thanks for your great work!
I'm trying to use a custom dataset to train and test StreamPETR. The custom dataset is 10 Hz, but the key-frame frequency of nuScenes is 2 Hz. I noticed that others ha…
-
## Issue Description ##
Initiating Uplink TCP will always result in RLF and UE drop.
## Setup Details ##
Last Build of Srsran 23.10.1
Benetel Radio 1.0.4
## Expected Behavior ##
To wor…
-
```
[2024-01-31T20:47:58.664Z] =========================== short test summary info ============================
[2024-01-31T20:47:58.664Z] FAILED ../../src/main/python/json_test.py::test_from_json_m…
jlowe updated 2 months ago
-
I tested Mistral-7B-v0.1 on 5 different benchmarks using the platform, but I encountered some issues.
When testing Squadv2 with 2 shots, there was no problem with vLLM during generation, but it reporte…