bcui19 closed this issue 1 year ago.
@bcui19 I think we should have a quick sync on what inference workflow we want to encourage here. I think we should refactor these scripts to load HF models only, not Composer checkpoints, and benchmarking should support either raw HF `generate` or the DeepSpeed inference wrapper. I don't think we want to reference any training YAMLs or ComposerMosaicGPT models when we get to inference time.
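One way to support both paths with a single benchmark is to time `.generate()` directly, since a raw HF model and a DeepSpeed-wrapped one both expose that method. A minimal sketch (the `benchmark_generate` helper and its parameters are hypothetical, not part of any existing script):

```python
import time

def benchmark_generate(model, inputs, n_iters=3, warmup=1, **gen_kwargs):
    """Return average wall-clock seconds per .generate() call.

    Works for either a raw Hugging Face model or one wrapped by DeepSpeed
    inference, since both expose .generate(**inputs, **gen_kwargs).
    """
    for _ in range(warmup):  # warmup calls are excluded from timing
        model.generate(**inputs, **gen_kwargs)
    start = time.perf_counter()
    for _ in range(n_iters):
        model.generate(**inputs, **gen_kwargs)
    return (time.perf_counter() - start) / n_iters
```

For the DeepSpeed path, the model would first be wrapped (e.g. with `deepspeed.init_inference(model, ...)`) before being passed to the same helper, so the two backends are measured identically.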
Basically the export flow I want to encourage is: (Training YAML + Composer ckpt) -> (HF folder with model and tokenizer inside) -> optionally ONNX.
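The middle step of that flow (an HF folder with both model and tokenizer inside) can be sketched with the standard Hugging Face `save_pretrained` API; the `export_to_hf_folder` helper name here is hypothetical:

```python
import os

def export_to_hf_folder(model, tokenizer, out_dir):
    """Write model weights/config and tokenizer files into one folder,
    so inference-time code only needs from_pretrained(out_dir)."""
    os.makedirs(out_dir, exist_ok=True)
    model.save_pretrained(out_dir)      # config + weights
    tokenizer.save_pretrained(out_dir)  # tokenizer files
    return out_dir
```

With the folder written this way, the optional ONNX step can consume it without ever touching a training YAML or Composer checkpoint.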
Checkout this JIRA for more details: https://mosaicml.atlassian.net/browse/RESEARCH-589
If you want to add DeepSpeed install instructions, you could try putting a requirements.txt here with the pinned version you want, and then adding a note in the README telling users to install from it.
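For example, the pinned file could look like this (the version below is a placeholder, not a tested pin):

```
# requirements.txt for DeepSpeed inference benchmarking
# (version is illustrative; pin whatever version was actually tested)
deepspeed==0.8.3
```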
Below we include a script, YAMLs, and a README for benchmarking DeepSpeed inference.