pantDevesh opened this issue 3 weeks ago
You can do something like this to merge the LoRA adapter into the base model:
```python
import safetensors.torch

from mistral_inference.model import Transformer

# Load the base model and apply the LoRA adapter on top of it
model = Transformer.from_folder(args.model_path, device="cuda:0")
model.load_lora("/path/to/lora.safetensors")

# Save the merged weights as a single safetensors checkpoint
safetensors.torch.save_model(model, "/path/to/merged.safetensors")
```
How can I perform inference with a LoRA model using Python code if `save_adapters = True`?
Is there a script for this?