-
I tried running the Nemo-12b 4-bit model on a single T4 GPU, but inference is very slow. Additionally, the `forward` function takes much longer than `generate`.
Is there a speedup benchmark for the…
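For reference, the timing pattern I have in mind is roughly the sketch below; the checkpoint id, prompt, and token count are placeholders (assuming the model is loaded through transformers with bitsandbytes 4-bit quantization), not my exact setup:

```python
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Placeholder checkpoint id for illustration only.
model_id = "mistralai/Mistral-Nemo-Instruct-2407"

bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb, device_map="auto")

inputs = tok("Hello", return_tensors="pt").to(model.device)

# One forward pass over the prompt.
torch.cuda.synchronize()
t0 = time.perf_counter()
with torch.no_grad():
    model(**inputs)
torch.cuda.synchronize()
print(f"forward: {time.perf_counter() - t0:.3f}s")

# Autoregressive generation (many forward passes plus sampling overhead).
torch.cuda.synchronize()
t0 = time.perf_counter()
model.generate(**inputs, max_new_tokens=32)
torch.cuda.synchronize()
print(f"generate (32 new tokens): {time.perf_counter() - t0:.3f}s")
```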
-
## ❓ Questions and Help
Does maskrcnn-benchmark support half-precision inference? If not, what should I add?
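For context, the pattern I have in mind is plain PyTorch autocast inference. The sketch below uses torchvision's Mask R-CNN only as a stand-in model (maskrcnn-benchmark builds its model from a config instead), so it is an illustration of the intent rather than a working integration:

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

# Stand-in model; maskrcnn-benchmark would build its detector from cfg instead.
model = maskrcnn_resnet50_fpn(weights=None).cuda().eval()

images = [torch.rand(3, 800, 800, device="cuda")]

# Half-precision inference via autocast: weights stay fp32, convs/matmuls run in fp16.
with torch.no_grad(), torch.cuda.amp.autocast(dtype=torch.float16):
    outputs = model(images)

print(outputs[0]["boxes"].shape)
```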
-
### Description
The PR [#39446](https://github.com/ray-project/ray/pull/39446) disables preloading Jemalloc for workers entirely. However, Jemalloc is still useful in some cases, and we could make it …
-
### System Info
Hi Team,
First of all, huge thanks for all the great work you are doing.
Recently, I was benchmarking inference for a T5 model on AWS EC2 (a G6E instance with an L40 GPU) for batch sizes…
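For context, the measurement loop looks roughly like the sketch below; the checkpoint, prompt, batch sizes, and token count are placeholders rather than the exact benchmark configuration:

```python
import time
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Placeholder checkpoint; the actual benchmark used a different T5 variant.
model_id = "t5-base"
tok = T5Tokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id, torch_dtype=torch.float16).cuda().eval()

prompt = "translate English to German: The house is wonderful."

for batch_size in (1, 4, 16, 64):
    batch = tok([prompt] * batch_size, return_tensors="pt", padding=True).to("cuda")
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    with torch.no_grad():
        model.generate(**batch, max_new_tokens=32)
    torch.cuda.synchronize()
    dt = time.perf_counter() - t0
    print(f"batch={batch_size:3d}  latency={dt:.3f}s  ({batch_size * 32 / dt:.1f} tok/s)")
```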
-
### Motivation
This is an interesting blog post [FireAttention V2: 12x faster to make Long Contexts practical for Online Inference](https://fireworks.ai/blog/fireattention-v2-long-context-inference…
-
### Is there an existing issue for this bug?
- [X] I have searched the existing issues
### 🐛 Describe the bug
I failed to run the ChatGLM model with ColossalAI 0.3.6.
The backtrace is here:
----…
-
### Proposal to improve performance
_No response_
### Report of performance regression
_No response_
### Misc discussion on performance
To reproduce vLLM's performance benchmark, please…
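As a rough illustration (not the official benchmark setup), an offline throughput measurement with vLLM's Python API could look like the sketch below; the model id, prompt count, and sampling settings are placeholders:

```python
import time
from vllm import LLM, SamplingParams

# Placeholder model and workload, for illustration only.
llm = LLM(model="facebook/opt-125m")
sampling = SamplingParams(temperature=0.8, max_tokens=128)

prompts = ["Explain the benefits of paged attention."] * 256

t0 = time.perf_counter()
outputs = llm.generate(prompts, sampling)
dt = time.perf_counter() - t0

# Count generated tokens across all requests to get a throughput figure.
generated = sum(len(o.outputs[0].token_ids) for o in outputs)
print(f"{generated / dt:.1f} generated tokens/s over {len(prompts)} prompts")
```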
-
**What would you like to be added/modified**:
A benchmark suite for large language models deployed at the edge using KubeEdge-Ianvs:
1. Interface Design and Usage Guidelines Document;
2. Implem…
-
### System Info
tgi-gaudi Docker container built from the master branch (4fe871ffaaa62f1a203607078e868fcca962b017)
Ubuntu 22.04.3 LTS
Gaudi2
HL-SMI Version: hl-1.15.0-fw-48.2.1.1
Driver Version: 1…
-
Hello everyone.
I have been using MLPerf benchmarks for some time, and I have a small list of questions about them. I am asking them here because I have not found answers in other sources of informat…