Open TheCodeWrangler opened 5 months ago
Thank you for pointing me to this! Here is what it helped clear up for me (it may help someone in the future).
Starting with the .safetensors adapter from Hugging Face, you first need to convert it to a .bin adapter:
import torch
from safetensors.torch import load_file

# Load the adapter tensors from .safetensors and re-save them as a PyTorch .bin checkpoint
torch.save(load_file("adapter_model.safetensors"), "adapter_model.bin")
Then you need to convert that into .npy format by using examples/hf_lora_convert.py.
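For anyone following along, here is a rough sketch of how that second step could be scripted. The flags `-i`, `-o`, and `--storage-type` are what I recall the script accepting, and I am assuming `-i` takes the adapter directory (the one holding `adapter_model.bin` and `adapter_config.json`), so please check `python examples/hf_lora_convert.py --help` in your checkout before relying on it. The directory names and the helper names `build_convert_command` and `run_convert` are just placeholders.

```python
import subprocess
import sys


def build_convert_command(adapter_dir, out_dir,
                          script="examples/hf_lora_convert.py",
                          storage_type="float16"):
    """Build the argv for the .bin -> .npy conversion.

    Assumption: the script accepts -i, -o and --storage-type.
    Verify with --help for your TensorRT-LLM version.
    """
    return [
        sys.executable, str(script),
        "-i", str(adapter_dir),
        "-o", str(out_dir),
        "--storage-type", storage_type,
    ]


def run_convert(adapter_dir, out_dir, **kwargs):
    """Run the conversion and raise if the script exits with an error."""
    cmd = build_convert_command(adapter_dir, out_dir, **kwargs)
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical directory names; only prints the command, does not run it.
    print(" ".join(build_convert_command("my_adapter", "my_adapter_npy")))
```

Calling `run_convert("my_adapter", "my_adapter_npy")` from the TensorRT-LLM repo root would then run the conversion itself.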
I would like to send LoRA weights through to a compiled TensorRT-LLM model, but I am unsure how to load the .bin weights and pass them to Triton. An example of loading them and passing in the weights would be very helpful.
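A minimal sketch of the loading side, under a few assumptions that are not confirmed in this thread: that the conversion step writes `model.lora_weights.npy` and `model.lora_config.npy` into the output directory, that those arrays are already shaped the way the server expects, and that the Triton model takes inputs named `lora_task_id`, `lora_weights`, and `lora_config`. Check the `config.pbtxt` of your deployed model for the real input names, dtypes, and shapes. The helper `load_lora_inputs` is a made-up name.

```python
import os
import tempfile

import numpy as np


def load_lora_inputs(lora_dir, task_id):
    """Load converted LoRA arrays and pair them with a task id.

    Assumptions: file names and input names follow what I believe the
    TensorRT-LLM Triton backend uses; verify against your config.pbtxt.
    """
    weights = np.load(os.path.join(lora_dir, "model.lora_weights.npy"))
    config = np.load(os.path.join(lora_dir, "model.lora_config.npy"))
    return {
        # One task id per request, with a leading batch dimension of 1
        "lora_task_id": np.array([[task_id]], dtype=np.uint64),
        "lora_weights": weights,
        "lora_config": config,
    }


if __name__ == "__main__":
    # Synthetic stand-in data so the sketch runs without a real adapter.
    with tempfile.TemporaryDirectory() as d:
        np.save(os.path.join(d, "model.lora_weights.npy"),
                np.zeros((1, 2, 8), dtype=np.float16))
        np.save(os.path.join(d, "model.lora_config.npy"),
                np.zeros((1, 2, 3), dtype=np.int32))
        for name, arr in load_lora_inputs(d, task_id=1).items():
            print(name, arr.dtype, arr.shape)
```

Each array could then be wrapped in a `tritonclient.grpc.InferInput` (with the dtype string from `tritonclient.utils.np_to_triton_dtype`), filled via `set_data_from_numpy`, and sent alongside the usual text or token inputs.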