microsoft / DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Apache License 2.0

LoRA Support #527

Open bagelbig opened 2 months ago

bagelbig commented 2 months ago

I am attempting to use DeepSpeed-MII for inference. I am presently using the pipeline approach, which does not seem to support LoRA.

Is there a way I can use DeepSpeed-FastGen with LoRA?

I am hesitant to return to 'DeepSpeed-Inference', since the top of the documentation clearly states: https://www.deepspeed.ai/tutorials/inference-tutorial/

DeepSpeed-Inference v2 is here and it’s called DeepSpeed-FastGen! For the best performance, latest features, and newest model support ...

This makes me concerned that 'DeepSpeed-Inference' will be phased out and no longer supported.

Please advise how I can move forward with using a model and a LoRA at the same time (without pre-merging them).

Thank you.