Open bagelbig opened 2 months ago
I am attempting to use DeepSpeed-MII for inference. I am presently using the pipeline approach. This does not seem to support LoRA.
Is there a way I can use DeepSpeed-FastGen with LoRA?
I am hesitant to return to 'DeepSpeed-Inference', since the top of its tutorial page (https://www.deepspeed.ai/tutorials/inference-tutorial/) clearly states:
DeepSpeed-Inference v2 is here and it’s called DeepSpeed-FastGen! For the best performance, latest features, and newest model support ...
This makes me concerned that 'DeepSpeed-Inference' will be phased out and no longer supported.
Please advise how I can move forward with using a base model and a LoRA adapter at the same time (without pre-merging the adapter into the model weights).
Thank you.