mosaicml / examples

Fast and flexible reference benchmarks
Apache License 2.0

FasterTransformer model handler for the mpt series #340

Closed dskhudia closed 1 year ago

dskhudia commented 1 year ago

FT model handler for our mpt models.

Currently, it converts the model from an HF checkpoint to FT format on the fly. If model startup time turns out to be unacceptable, we may want to load a pre-converted model instead.
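A minimal sketch of the pre-converted option, assuming the conversion can be wrapped in a callable that writes FT-format weights into a directory. The names `resolve_ft_checkpoint`, `convert_fn`, and the `.conversion_complete` marker file are hypothetical and are not part of the handler or of FasterTransformer; the placeholder converter only stands in for the real HF-to-FT conversion script.

```python
import tempfile
from pathlib import Path
from typing import Callable


def resolve_ft_checkpoint(cache_dir: Path, convert_fn: Callable[[Path], None]) -> Path:
    """Return a directory holding FT-format weights, converting only if needed."""
    marker = cache_dir / ".conversion_complete"  # hypothetical marker file
    if marker.exists():
        # A previous run finished converting, so skip the expensive step.
        return cache_dir
    cache_dir.mkdir(parents=True, exist_ok=True)
    convert_fn(cache_dir)  # stand-in for the real HF -> FT conversion
    marker.touch()  # written last so a crashed conversion is never reused
    return cache_dir


if __name__ == "__main__":
    calls = []

    def fake_convert(out_dir: Path) -> None:
        # Placeholder: records the call and writes an empty dummy file.
        calls.append(out_dir)
        (out_dir / "weights.bin").write_bytes(b"")

    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "mpt-7b-instruct-ft"
        resolve_ft_checkpoint(target, fake_convert)  # converts
        resolve_ft_checkpoint(target, fake_convert)  # reuses the cached output
        print(f"conversions run: {len(calls)}")
```

With this shape, the first startup pays the conversion cost and later startups only check for the marker file. Writing the marker after the conversion returns is a deliberate choice so that a partially written directory is converted again rather than loaded.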

Command I used to run it (both FasterTransformer and the conversion script must be on the PYTHONPATH):

PYTHONPATH=/mnt/workdisk/daya/faster_transformer/FasterTransformer_fork:/mnt/workdisk/daya/llm-foundry python examples/inference-deployments/mpt/mpt_7b_ft_handler.py -i "mosaicml/mpt-7b-instruct" --ft_lib_path ../faster_transformer/FasterTransformer_fork/build/lib/libth_transformer.so

dskhudia commented 1 year ago

> Is this deployable? If so, could you include a deployment yaml?

@dakinggg: The YAML is coming once we have an image with FT.

dskhudia commented 1 year ago

@dakinggg: Let's merge this; if minor changes are needed, we can make them later.

dskhudia commented 1 year ago

Resolving some of the lint issues.