EricLBuehler / mistral.rs

Blazingly fast LLM inference.
MIT License

Support for T5 Architecture #384

Open niranjanakella opened 1 month ago

niranjanakella commented 1 month ago

Hello @EricLBuehler, I am opening this issue to track support for the T5 Seq2Seq model architecture in mistral.rs, as discussed.

Relates to: #156

EricLBuehler commented 1 month ago

Hi @niranjanakella!

Thank you for opening this issue. Just to clarify, would this be a quantized or nonquantized implementation?

niranjanakella commented 1 month ago

@EricLBuehler A non-quantized (f16/f32) implementation takes priority for now, but if possible, I would also like to have a quantized implementation.

I would also like to know whether LoRA adapters can be loaded at runtime without merging them into the model. That would be a game changer for many applications, since many developers train multiple adapters, and it would be great to attach several of them at runtime.

EricLBuehler commented 1 month ago

> A non-quantized (f16/f32) implementation takes priority for now, but if possible, I would also like to have a quantized implementation.

Sounds great, I'll get started on an implementation.

> I would also like to know whether LoRA adapters can be loaded at runtime without merging them into the model. That would be a game changer for many applications, since many developers train multiple adapters, and it would be great to attach several of them at runtime.

We actually have this feature already! There are two ways to do this:

1) Activate adapters at runtime: preload some adapters, then send requests to activate them.
2) Use per-request adapter specification for granular control.

Docs: https://github.com/EricLBuehler/mistral.rs/blob/master/docs/ADAPTER_MODELS.md#adapter-model-dynamic-adapter-activation
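
To illustrate what "without merging" means in general, here is a minimal conceptual sketch in plain Python. It is not mistral.rs code and does not use its API: every name in it (`LoraAdapter`, `AdapterLinear`, `preload`, `activate`, the `adapters` argument) is hypothetical. It only shows the standard LoRA idea that the base weight stays frozen while each adapter adds a low-rank term `scale * B(Ax)` at inference time, so the active set can be changed globally or overridden per request.

```python
# Conceptual sketch only: unmerged, switchable LoRA adapters in plain Python.
# All class, method, and adapter names are hypothetical, not the mistral.rs API.

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


class LoraAdapter:
    def __init__(self, a, b, scale=1.0):
        self.a = a          # down-projection, shape (rank, in_features)
        self.b = b          # up-projection, shape (out_features, rank)
        self.scale = scale

    def delta(self, x):
        # Low-rank update: scale * B @ (A @ x)
        return [self.scale * y for y in matvec(self.b, matvec(self.a, x))]


class AdapterLinear:
    def __init__(self, weight):
        self.weight = weight   # frozen base weight, never modified
        self.adapters = {}     # preloaded adapters, keyed by name
        self.active = []       # adapter names applied by default

    def preload(self, name, adapter):
        self.adapters[name] = adapter

    def activate(self, names):
        self.active = list(names)

    def forward(self, x, adapters=None):
        # A per-request list overrides the globally activated set.
        names = self.active if adapters is None else adapters
        out = matvec(self.weight, x)
        for name in names:
            out = [o + d for o, d in zip(out, self.adapters[name].delta(x))]
        return out


layer = AdapterLinear([[1.0, 0.0], [0.0, 1.0]])
layer.preload("adapter_a", LoraAdapter(a=[[1.0, 1.0]], b=[[1.0], [0.0]]))
layer.preload("adapter_b", LoraAdapter(a=[[1.0, -1.0]], b=[[0.0], [2.0]]))

x = [3.0, 1.0]
base_out = layer.forward(x)                             # base model only
layer.activate(["adapter_a"])
global_out = layer.forward(x)                           # globally activated set
request_out = layer.forward(x, adapters=["adapter_b"])  # per-request override
```

Because the adapter terms are added on the fly instead of being folded into `weight`, switching adapters costs only a change to the list of names. The actual request format for mistral.rs is described in the docs linked above.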

EricLBuehler commented 3 weeks ago

Hi @niranjanakella! Sorry for the delay; I have been busy with the Idefics 2 implementation (#309). I should have a prototype ready tonight, though!

niranjanakella commented 3 weeks ago

@EricLBuehler No problem, sounds good. I am looking forward to trying it out soon.

EricLBuehler commented 3 weeks ago

See: #432.