-
This issue will be used to track compilation failures for migraphx models on CPU and GPU. Each model's compile failure should link to an issue with a smaller reproducer in the notes column.
…
-
Hello Sai Kiran,
I came across your Medium [blog](https://medium.com/@kiranspixel/advanced-pii-detection-in-educational-data-using-bert-and-electra-5dc21571b610) on "Advanced PII Detection in E…
-
The current implementation of GPT-J and BERT carries out prediction in a sequential manner. Could the performance of GPT-J and BERT be improved by implementing parallel processing through threads ra…
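For illustration only, here is a minimal sketch (not from the reference code) of dispatching per-sample inference across a thread pool; `run_inference` and the sample list are hypothetical placeholders:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def run_inference(sample):
    # Hypothetical stand-in for one forward pass. Real speedups depend
    # on the backend releasing the GIL during its native kernels
    # (as onnxruntime and PyTorch generally do).
    time.sleep(0.01)
    return f"prediction-{sample}"

samples = list(range(32))  # placeholder for pre-tokenized inputs

# Sequential baseline
t0 = time.perf_counter()
sequential = [run_inference(s) for s in samples]
t_seq = time.perf_counter() - t0

# Threaded variant
t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    threaded = list(pool.map(run_inference, samples))
t_par = time.perf_counter() - t0

print(f"sequential: {t_seq:.3f}s  threaded: {t_par:.3f}s")
```

Note that for pure-Python compute the GIL would serialize the threads; for models like BERT and GPT-J, batching requests into one larger forward pass is often the more effective optimization.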
-
Hi,
Attached is my cm-repro file:
[cm-repro.zip](https://github.com/user-attachments/files/16969739/cm-repro.zip)
I'm trying to run the MLPerf Reference Implementation for bert-large at h…
-
When I run the command
cm run script --tags=generate-run-cmds,inference,_find-performance,_all-scenarios --model=bert-99 --implementation=reference --device=cuda --backend=onnxruntime --category=edg…
-
### 🐛 Describe the bug
We are planning to upgrade our Python environment from 3.8 to 3.10, because PyTorch recently deprecated Python 3.8.
However, we found that there are performance gaps between pyt…
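A minimal sketch (placeholder model and shapes, not the reporter's actual workload) of a micro-benchmark that could be run under both interpreters to quantify the gap:

```python
import sys
import time
import torch

print(sys.version)  # record which interpreter produced the numbers

model = torch.nn.Linear(1024, 1024).eval()  # placeholder model
x = torch.randn(64, 1024)

with torch.inference_mode():
    for _ in range(10):   # warm-up
        model(x)
    t0 = time.perf_counter()
    for _ in range(100):
        model(x)
    elapsed = time.perf_counter() - t0

print(f"avg forward latency: {elapsed / 100 * 1e3:.3f} ms")
```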
-
I successfully installed CM following the guide at https://docs.mlcommons.org/ck/install/
and then referred to https://docs.mlcommons.org/inference/benchmarks/language/bert/ to run the scripts as belo…
-
(python3-venv) aarch64_sh ~> cm run script --tags=run-mlperf,inference,_find-performance,_full,_r4.1 --model=dlrm_v2-99 --implementation=reference --framework=pytorch --category=datacenter…
-
### Describe the issue
FP16 model inference is slower than FP32. Does FP16 inference require additional configuration, or is it enough to just convert the model to FP16?
### To reproduce
convert onnx …
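The reproduction steps are truncated above; for reference, a minimal sketch of one common conversion path, assuming the `onnxconverter-common` package (file paths are placeholders):

```python
import onnx
from onnxconverter_common import float16

model = onnx.load("model_fp32.onnx")  # placeholder path
model_fp16 = float16.convert_float_to_float16(
    model,
    keep_io_types=True,  # keep FP32 graph inputs/outputs
)
onnx.save(model_fp16, "model_fp16.onnx")
```

In general, FP16 only pays off on execution providers with native half-precision support (e.g. the CUDA EP on Tensor Core GPUs); on the default CPU EP, FP16 tensors are typically cast back to FP32, which can make inference slower than the FP32 model.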
-
Related to BERT/PyTorch
Describe the bug:
I want to reproduce the INT8 inference performance on a T4 or A2 GPU, but I don't know how to reproduce it and compare it with the inference performance NV…
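NVIDIA's submitted INT8 numbers come from TensorRT engines, so the sketch below is not their path; it is only an illustrative CPU baseline using PyTorch dynamic quantization (model name is a placeholder):

```python
import torch
from transformers import BertModel  # assumes transformers is installed

model = BertModel.from_pretrained("bert-base-uncased").eval()

# Quantize Linear layers to INT8 with dynamic activation scaling (CPU-only)
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

input_ids = torch.randint(0, 30522, (1, 128))  # dummy token IDs
with torch.inference_mode():
    out = quantized(input_ids)
print(out.last_hidden_state.shape)
```

Reproducing the T4/A2 figures themselves would require building the TensorRT INT8 engines from NVIDIA's MLPerf submission code.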