Summary of Changes
Motivation
MLCommons (mlperf) is a standard benchmark suite for machine learning inference. In particular, their loadgen library can send requests and report performance statistics.
Implementation
The fake backend simply copies the request it receives and forwards it. It has two load-time parameters that control how long to sleep or do busy work in order to simulate load.
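To make the two parameters concrete, here is a minimal sketch of such a backend in Python. The class name, parameter names, and request type are illustrative assumptions, not the actual implementation.

```python
import time

class FakeBackend:
    """Illustrative sketch: echoes each request and simulates load with
    two load-time parameters (sleep duration and busy-work duration)."""

    def __init__(self, sleep_seconds: float = 0.0, busy_seconds: float = 0.0):
        self.sleep_seconds = sleep_seconds  # simulated I/O / accelerator wait
        self.busy_seconds = busy_seconds    # simulated CPU-bound work

    def handle(self, request: bytes) -> bytes:
        # Sleep to mimic waiting on an external resource.
        if self.sleep_seconds > 0:
            time.sleep(self.sleep_seconds)
        # Spin until the busy-work budget is used up to mimic CPU load.
        deadline = time.perf_counter() + self.busy_seconds
        while time.perf_counter() < deadline:
            pass
        # Copy the request and forward it unchanged as the response.
        return bytes(request)
```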
For MLCommons, I've added some Python scripts that wrap the existing MLCommons app and run it in a variety of configurations. The results are saved to a binary file and can then be analyzed by another script, which prints the raw data and generates graphs. I'm using plotly to create the graphs, which render beautifully in the documentation with sphinx-charts.
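As an illustration of the analysis step, the sketch below loads results from a hypothetical pickle file, builds a plotly figure, and writes it out as JSON that the docs can reference via sphinx-charts. The file names and record fields are assumptions for the example, not the actual script.

```python
import pickle
import plotly.graph_objects as go

# Hypothetical results file written by the benchmark wrapper; the
# format and field names here are assumed for illustration.
with open("results.pkl", "rb") as f:
    runs = pickle.load(f)  # e.g. [{"qps": ..., "p99_ms": ...}, ...]

fig = go.Figure(
    go.Scatter(
        x=[r["qps"] for r in runs],
        y=[r["p99_ms"] for r in runs],
        mode="lines+markers",
        name="p99 latency",
    )
)
fig.update_layout(xaxis_title="Offered QPS", yaxis_title="p99 latency (ms)")

# sphinx-charts renders plotly figures from JSON files referenced in the docs.
fig.write_json("docs/_charts/latency_vs_qps.json")
```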
Notes
N/A