microsoft / DeepSpeedExamples

Example models using DeepSpeed
Apache License 2.0
6.02k stars, 1.02k forks

Update Inference Benchmarking Scripts - Support AML #868

Closed: lekurile closed this 7 months ago

lekurile commented 7 months ago

This PR fixes and updates the inference benchmarking analysis scripts to support the fastgen, vllm, and aml backends. The scripts now generalize to models beyond Llama, which was previously hardcoded in them. A number of bugs and formatting issues are also resolved. The scripts that were fixed or updated are:

Example plots for the scripts:

lekurile commented 7 months ago

> Do you have an example that shows how to run them?

@tohtana, thank you for the comment. I will update the README with examples showing how to run the scripts.