-
Hi,
Thanks for the benchmark! I noticed that the benchmark uses `timeit`, and each iteration repeatedly creates a new gRPC channel:
https://github.com/llucax/python-grpc-benchmark/blob/94a175fbc…
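To illustrate the concern, here is a minimal sketch of how per-iteration setup can dominate a `timeit` measurement. It does not use gRPC at all: `FakeChannel` is a hypothetical stand-in whose constructor sleeps to simulate connection setup, and the 2 ms cost is an assumed number, not something measured from the linked benchmark.

```python
import time
import timeit


class FakeChannel:
    """Hypothetical stand-in for a gRPC channel (not the real grpc API)."""

    def __init__(self):
        # Assumed setup cost; a real channel's cost will differ.
        time.sleep(0.002)

    def call(self):
        return "ok"


def call_with_new_channel():
    # Setup cost is paid inside the timed statement on every iteration.
    return FakeChannel().call()


shared_channel = FakeChannel()  # created once, outside the timed statement


def call_with_shared_channel():
    # Only the call itself is timed.
    return shared_channel.call()


N = 50
per_iteration = timeit.timeit(call_with_new_channel, number=N)
reused = timeit.timeit(call_with_shared_channel, number=N)
print(f"new channel per iteration: {per_iteration:.4f}s")
print(f"shared channel:            {reused:.4f}s")
```

If the real benchmark behaves similarly, creating the channel once (or passing it in through the `setup` argument of `timeit`) would measure the request cost rather than the connection cost.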
-
When multiple batch requests are started simultaneously it seems only the first (few) requests run at full speed. The rest of the requests run at a reduced speed, only using part of the resources avai…
-
### Expected behavior
_No response_
### Actual behavior
_No response_
### Steps to reproduce the problem
For example,
1. When I set the target throughput under 30000 (500 TPS), everything works fine. Bu…
-
### Expected Behavior
The throughput should always be correct
### Current Behavior
Throughput values from the Pipeline rules toggle from positive to negative
### Steps to Reproduce
1. You n…
-
Do you have a plan to include metrics like latency of generation and throughput (tokens/sec) in the evaluation? I think this would be a good addition. Having these system evaluations will surely help …
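As a rough sketch of what I mean by these two metrics, something like the following could wrap any generation call. The names `measure_generation` and `fake_generate` are hypothetical and are not part of this project; the fake generator just sleeps and splits the prompt on whitespace so the example runs on its own.

```python
import time


def measure_generation(generate, prompt):
    """Time one call to `generate` and derive latency and tokens/sec."""
    start = time.perf_counter()
    tokens = generate(prompt)
    latency = time.perf_counter() - start
    # Guard against a zero-duration measurement on very coarse clocks.
    throughput = len(tokens) / latency if latency > 0 else float("inf")
    return {
        "latency_s": latency,
        "num_tokens": len(tokens),
        "tokens_per_s": throughput,
    }


def fake_generate(prompt):
    # Hypothetical stand-in for a model: sleeps, then returns whitespace tokens.
    time.sleep(0.01)
    return prompt.split()


stats = measure_generation(fake_generate, "a b c d e")
print(stats)
```

A real evaluation would probably average over many prompts and report time to first token separately, but the two numbers above are the core of it.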
-
Hi
What is the roadmap for the features, and is there a site with throughput data?
Thanks
-
I'm training models with the specs below but seeing a major throughput drop when switching to GLU. Do you know why, or have any ideas for what I could investigate? Thanks a lot! cc @mvpatel2000 @tgale96
```
activ…
-
### Motivation
**PD disaggregation** is said to provide a 2-4x throughput improvement. See [DistServe](https://hao-ai-lab.github.io/blogs/distserve/) for reference. They are planning to integrate th…
-
Using the validation layer on "Best Practice" for NVIDIA GPUs
-
Full context here: https://www.notion.so/hyperlanexyz/Improving-Relayer-Throughput-43462a3f52d547458e55d9c66ac93e2f
We noticed that the relayer cannot submit messages as fast as new ones arrive. This…