tatiana-iazykova opened this issue 9 months ago
Can someone explain to me how to use InferenceEngineV2 instead of v1?
@tohtana
Hi @tatiana-iazykova, we do not support Mistral models with the old inference engine. To use the latest inference engine, please take a look at DeepSpeed-MII, which is now the front end for inference. Here's an example of running a Mistral model:
import mii

# Build a non-persistent inference pipeline for the Mistral model
pipe = mii.pipeline("mistralai/Mistral-7B-v0.1")
# Generate completions for a batch of prompts
response = pipe(["DeepSpeed is", "Seattle is"], max_new_tokens=128)
print(response)
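
If you need a long-running server rather than a one-off pipeline, MII also supports a persistent deployment. This is a minimal sketch based on the mii.serve API shown in the DeepSpeed-MII README; argument names may vary between releases:

import mii

# Start a persistent model server in the background
client = mii.serve("mistralai/Mistral-7B-v0.1")
# Send generation requests to the running server
response = client.generate(["DeepSpeed is", "Seattle is"], max_new_tokens=128)
print(response)
# Tear the server down when finished
client.terminate_server()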
Describe the bug
Mistral doesn't fully convert to the DeepSpeed format despite support in the v2 module.
To Reproduce
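(No steps were given in the report; the following is a hypothetical minimal sketch of a v1-style path that fails to fully convert Mistral, assuming the standard deepspeed.init_inference entry point with kernel injection. The model name is illustrative.)

import torch
import deepspeed
from transformers import AutoModelForCausalLM

# Load a Mistral checkpoint and ask the v1 engine to inject kernels
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.float16
)
engine = deepspeed.init_inference(
    model, dtype=torch.float16, replace_with_kernel_inject=True
)
# Inspecting the wrapped model shows the attention modules are not
# replaced by DeepSpeed modules, i.e. the conversion is incomplete
print(engine.module)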
Expected behavior
Conversion to DeepSpeed attention modules.
Docker context
No