This PR sets the groundwork for separating the "serving" and "throughput" benchmarks into their own UI pages. Their data will persist in subfolders of the existing `dev/bench` folder of the `nm-gh-pages` branch, and we can easily put a simple `index.html` page in `dev/bench` that links to these separate pages.
With these changes, the currently executed `benchmark_serving` results will be present in the `serving` subfolder and the upcoming `benchmark_throughput` results will be in a `throughput` subfolder:

- `benchmark_serving`: https://neuralmagic.github.io/nm-vllm/dev/bench/serving/
- `benchmark_throughput`: https://neuralmagic.github.io/nm-vllm/dev/bench/throughput/
One thing I’d like improved is how the separate files are handled in the `BENCHMARK-RESULT` job in `.github/workflows/nm-benchmark.yml`. Since you cannot use a `matrix` strategy within a step, I opted in the short term to duplicate the steps so that, similar to the existing process, each potential results file has its own step guarded by an `if` condition. I could likely make the entire job use a `matrix` strategy; however, I’d be concerned about merge conflicts arising if multiple jobs try to push to the `nm-gh-pages` branch at nearly the same time.
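For context, the duplicated-step pattern looks roughly like the sketch below. This is only an illustration, not the actual workflow contents: the step names, result-file paths, and `run` bodies are hypothetical.

```yaml
# Hypothetical sketch of the if-guarded duplicated steps in BENCHMARK-RESULT;
# actual step names and file paths in nm-benchmark.yml may differ.
jobs:
  BENCHMARK-RESULT:
    runs-on: ubuntu-latest
    steps:
      - name: publish serving results
        if: ${{ hashFiles('results/benchmark_serving.json') != '' }}
        run: echo "push results to dev/bench/serving on nm-gh-pages"

      - name: publish throughput results
        if: ${{ hashFiles('results/benchmark_throughput.json') != '' }}
        run: echo "push results to dev/bench/throughput on nm-gh-pages"
```

A `matrix` strategy would collapse this into one parameterized job, but each matrix job would then push to `nm-gh-pages` independently, which is the concurrency concern noted above.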
Additionally:
I inadvertently included a change where I started to use GitHub Actions’ log grouping, which I think is generally an improvement, but it can be easily removed. In this isolated usage, it groups all the `pip install` output during the "run benchmarks" action into a collapsed-by-default group (screenshot below), which can be clicked to expand and will auto-expand if you use the GitHub UI "Search logs" textbox:
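The grouping relies on GitHub Actions workflow commands (`::group::` / `::endgroup::`); a minimal sketch of how a step would use them (the step name and install target here are illustrative, not the actual action contents):

```yaml
- name: run benchmarks  # illustrative step; the real action wraps its pip install this way
  run: |
    echo "::group::pip install"
    pip install -r requirements.txt  # hypothetical requirements file
    echo "::endgroup::"
```

Everything printed between the two markers renders as a single collapsed section in the job log.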
I cleaned up misc input-related things in `.github/actions/nm-github-action-benchmark/action.yml` – the `type` prop is not valid inside an action’s input definitions (it mostly guides the UI and is typically only valid for `workflow_dispatch` input definitions).
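To illustrate the distinction (input names here are hypothetical): action inputs in an `action.yml` support only `description`, `required`, `default`, and `deprecationMessage`, whereas `type` belongs on `workflow_dispatch` inputs in a workflow file.

```yaml
# action.yml — no `type` key in the input schema:
inputs:
  benchmark_results:          # hypothetical input name
    description: path to the results file
    required: true

# workflow file — `type` is valid here, on workflow_dispatch inputs:
on:
  workflow_dispatch:
    inputs:
      benchmark_results:
        description: path to the results file
        type: string
```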