-
All requests end with 'finish_reason': 'length' when the max_tokens=-1 parameter is set.
What could be the problem?
**Model**:
https://huggingface.co/IlyaGusev/saiga_mistral_7b_gguf/resolve/main/…
-
I think the SLP cost model might be wrong for vector gathers on Skylake.
Consider the following code which repeatedly permutes an array:
```
void f(const float *__restrict__ src, const int *__r…
-
### How can we reproduce the crash?
_No response_
### Relevant log output
```shell
--- Bun is auto-restarting due to crash [time: 1722444272358] ---
===============================================…
-
### 🐛 Describe the bug
Hello,
I found that when compiling torch.cumsum with AoTInductor, a CUDA illegal memory access error is thrown when the input tensor is large. This only happens with an AoT c…
-
Hi,
I hope this is not redundant with another issue.
It seems that there might be an issue with passing check point 1 and I am not sure about the reason. No idea if that could impact the final…
-
Here is an example: https://godbolt.org/z/nd9Kc3cqq
-
I am trying to build dnnl with the GPU engine from source; see [Requirements for Building from Source](https://github.com/oneapi-src/oneDNN#gpu-engine).
It needs DPC++, TBB, cuDNN, cuBLAS.
The others are already …
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTor…
-
### Your current environment
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version: (U…
-
### Your current environment
```text
The output of `python collect_env.py`
Collecting environment information...
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12…