-
### 🐛 Describe the bug
Out of memory in weekly test, https://github.com/intel/torch-xpu-ops/actions/runs/10218591763
Model list:
- [ ] `GPTJForCausalLM`
- [ ] `GPTJForQuestionAnswering`
- [ ]…
-
### What happened?
With the attached IR, an abort occurs at runtime.
Commands:
```
iree-compile model.modified.mlir --iree-hal-target-backends=llvm-cpu -o compiled_model.vmfb
iree-run-module --modul…
-
Trying to run offline retinanet in a container with one Nvidia GPU:
cm run script --tags=run-mlperf,inference,_find-performance,_full,_r4.1-dev --model=retinanet --implementation=nvidia …
-
Hi,
The `latency` variable can be referenced while still undefined: it is initialized only under this condition:
```
if chunk == "[DONE]":
latency = time.perf_counter() - st
```
But it gets ref…
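A common fix, sketched here under the assumption that the snippet comes from a loop over streamed chunks (the function name and `chunks` argument are hypothetical; only `st`, `latency`, and the `"[DONE]"` sentinel come from the snippet), is to initialize `latency` before the loop and handle the case where the sentinel never arrives:

```python
import time


def measure_stream_latency(chunks):
    """Return the time from start until the '[DONE]' sentinel is seen."""
    st = time.perf_counter()
    latency = None  # initialize up front so a missing sentinel cannot raise UnboundLocalError
    for chunk in chunks:
        if chunk == "[DONE]":
            latency = time.perf_counter() - st
    if latency is None:
        # The stream ended without the sentinel; fail loudly instead of
        # crashing later on an undefined variable.
        raise RuntimeError("stream ended without '[DONE]'; latency not measured")
    return latency
```

The key point is that `latency` gets a value on every code path, so the later reference is always safe.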
-
Hi,
What is the inference time of FD_MobileNet? Could you provide speed benchmarks for the different models mentioned in the paper? Thanks!
-
- Q3 Collaboration Plan of Infra and IaaS Labs: https://bytedance.us.larkoffice.com/docx/HKXfdRh1noMrbAxcgL2ureGasdQ
- FlashDecoding++ Summary: https://bytedance.larkoffice.com/wiki/WbqXwRL3qi0x18k…
-
flashdecoding++ paper: https://arxiv.org/abs/2311.01282
- Q3 Collaboration Plan of Infra and IaaS Labs: https://bytedance.us.larkoffice.com/docx/HKXfdRh1noMrbAxcgL2ureGasdQ
- FlashDecoding++ Su…
-
Hi, I managed to build GLIP and used the following command to test zero-shot on COCO:
`python tools/test_grounding_net.py --config-file configs/pretrain/glip_Swin_L.yaml --weight MODEL/glip_large_mode…
-
**What would you like to be added/modified**:
Based on the current multiedge inference benchmark on ianvs, we would like to extend the multiedge inference on multiple heterogeneous edges (e.g., m…
-
Hi,
I failed to run Llama-2-7b-chat-hf on the NPU; please give me a hand.
1. I converted the model with the command below and got two models,
a) optimum-cli export openvino --task text-generation -m Meta-…