-
The Parboil benchmarks are a set of throughput computing applications... Each benchmark includes several implementations. Some implementations we provide as readable base implementations from which new…
-
NVCC can generate machine code for multiple compute capabilities (a fat binary), but Kokkos's CMake configuration does not let me specify multiple CUDA architectures:
```
CMake Error at cmake/kokkos_arch.cma…
```
-
## Overview
- Intel's Lunar Lake, which combines a CPU, NPU, and iGPU on a single chip, is releasing soon
## Tasklist
- [x] https://github.com/janhq/cortex.cpp/issues/677
- [x] https://github.com/janhq/cort…
-
## What is pointer tagging?
I think on most (or quite possibly all) of Oxc's supported 64-bit architectures, the top 6 or 7 bits of pointers are unused, and could be used to pack additional data into …
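
As a rough illustration of the idea, the sketch below models a 64-bit pointer as a plain Python integer and packs a small tag into its upper bits. The 6-bit tag width is taken from the lower bound mentioned above; the names `pack`, `unpack`, `TAG_BITS`, and `ADDR_BITS` are hypothetical and are not part of Oxc. Real pointer tagging would be done on raw pointers in the implementation language and depends on which address bits the target platform actually leaves unused.

```python
# Hypothetical model of pointer tagging using plain integers.
# Assumption: the top TAG_BITS bits of a 64-bit pointer are unused.
POINTER_BITS = 64
TAG_BITS = 6
ADDR_BITS = POINTER_BITS - TAG_BITS  # low 58 bits hold the address

ADDR_MASK = (1 << ADDR_BITS) - 1
TAG_MASK = (1 << TAG_BITS) - 1


def pack(addr: int, tag: int) -> int:
    """Store `tag` in the upper bits of `addr` and return the tagged value."""
    if not 0 <= addr <= ADDR_MASK:
        raise ValueError("address does not fit in the low address bits")
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag does not fit in the tag bits")
    return (tag << ADDR_BITS) | addr


def unpack(tagged: int) -> tuple[int, int]:
    """Split a tagged value back into (address, tag)."""
    return tagged & ADDR_MASK, tagged >> ADDR_BITS


if __name__ == "__main__":
    tagged = pack(0x7FFF_DEAD_BEEF, 5)
    addr, tag = unpack(tagged)
    print(hex(addr), tag)
```

The tagged value must be masked back to the plain address before it is dereferenced, which is the cost that any such scheme trades against the extra storage.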
-
### OpenVINO Version
2024.2.0
### Operating System
Other (Please specify in description)
### Device used for inference
GPU
### Framework
PyTorch
### Model used
_No response_
### Issue descri…
-
```py
import ctypes


@ctypes.CFUNCTYPE(
    None,
)
def test():
    return None
```
Perhaps the `libffi` version should be raised higher.
```
3.4.6 Feb-18-2024
Fix long double regr…
```
qTich updated 4 months ago
-
On newer x86 CPUs (AMD and Intel), 3-operand LEA instructions with base, index, and offset have higher latency and lower throughput than 2-operand LEA instructions.
The compiler when emitting the i…
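
To make the arithmetic concrete, the sketch below models both forms with 64-bit wrapping integers: a single LEA with base, index, and offset, and a 2-operand LEA followed by a separate ADD. Splitting the instruction this way is one commonly described alternative and is not necessarily what this issue proposes; the function names are hypothetical. The model covers only the computed value, not latency, and it ignores the fact that ADD modifies flags while LEA does not.

```python
# Hypothetical model of x86-64 address arithmetic with 64-bit wraparound.
MASK64 = (1 << 64) - 1


def lea3(base: int, index: int, disp: int) -> int:
    """Models `lea dst, [base + index + disp]` as one wrapping sum."""
    return (base + index + disp) & MASK64


def lea2_then_add(base: int, index: int, disp: int) -> int:
    """Models `lea dst, [base + index]` followed by `add dst, disp`."""
    tmp = (base + index) & MASK64  # 2-operand LEA
    return (tmp + disp) & MASK64   # separate ADD of the offset


if __name__ == "__main__":
    args = (0x1000, 0x20, 8)
    print(hex(lea3(*args)), hex(lea2_then_add(*args)))
```

Both functions return the same value for any inputs, since addition modulo 2^64 is associative; the difference the issue is concerned with is purely in how the hardware executes each form.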
-
### Motivation
Take InternVL as an example: its vision model is 6B. If the vision model could be quantized, inference could run on a single 4090.
May I ask whether the reason the vision model does not currently support quantization is that the feature is not yet…
-
### Is this a unique feature?
- [X] I have checked "open" AND "closed" issues and this is not a duplicate
### Is your feature request related to a problem/unavailable functionality? Please descr…
-
This page is accessible via [roadmap.vllm.ai](https://roadmap.vllm.ai)
### Themes.
As before, we categorized our roadmap into 6 broad themes: broad model support, wide hardware coverage, state of…