-
Using validation layer on "Best Practice" for Nvidia GPUs
-
### Proposal to improve performance
_No response_
### Report of performance regression
_No response_
### Misc discussion on performance
---
**Setup Summary for vLLM Benchmarking with Llama…
-
With the update to 6.6.47, the Realtek driver for Ethernet devices enabled software IRQ coalescing, which rendered the driver practically unusable in our use case. For Sculpt 24.10 (#5356) we patch the …
cnuke updated
1 month ago
-
### Extension Version
v2.33.0
### VS Code Version
Version: 1.95.1
Commit: 65edc4939843c90c34d61f4ce11704f09d3e5cb6
Date: 2024-10-31T05:14:54.222Z
Electron: 32.2.1
ElectronBuildId: 1042771…
-
It would be very interesting to learn how `connlib` actually behaves in the wild. For that, we should implement lightweight metric tracking (likely using OpenTelemetry). These metrics can then be …
-
A multiblock that is a parallel version of the discharger and also has a recipemap to create sculk cores. It solves the issue of charger throughput in Endgame.
The multiblock would be unlocked at aro…
-
When scaling the batch size from 1 to a small number, say 8, I'd naively expect the generation performance to scale quite well, as we're still firmly in the memory-bound regime. But I observe two things …
-
### DynamoDB Migration project
We need a project that helps with the migration of DynamoDB tables. The DynamoDB documentation has a guide on how to do it, but to date there is no projec…
-
### Your current environment
```text
Docker container following build_from_source instruction
```
### How would you like to use vllm
I want to experiment with how chunked_prefill can increase…
-
## Bug Report
**Describe the bug**
We have two inputs: one harvests fluent-bit logs, and the other is from a different service, /logs/gen0.log (throughput: 10 logs/min of size 150 bytes).
The mem_buf_l…