roboflow / inference

A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.
https://inference.roboflow.com

Add PaliGemma LoRA #464

Closed · probicheaux closed this 1 week ago

probicheaux commented 3 weeks ago

Description

Adds a class that can perform inference using LoRA adapters.

How has this change been tested? Please provide a test case or example of how you tested the change.

Locally

Any specific deployment considerations

n/a

Docs

probicheaux commented 3 weeks ago
  1. PaliGemma needs transformers>=4.41.1, but requirements.cogvlm.txt and requirements.groundingdino.txt pin transformers low. Can we relax those pins now?
  2. This change doesn't work with get_model because we don't know a priori whether a PaliGemma model is a LoRA. How should I handle that? Put something in the model bucket and check for that? Right now, there's a file, adapter_config.json, that exists if and only if the model is a LoRA. Should I use that file to decide which class to load in get_model? (See the sketch after this list.)
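
For reference, the file-based check from question 2 is cheap to implement; a minimal sketch (the helper name and directory layout are assumptions, not the actual inference API):

```python
import os

# Per the discussion above: this file exists if and only if the checkpoint is a LoRA.
ADAPTER_CONFIG = "adapter_config.json"

def is_lora_checkpoint(weights_dir: str) -> bool:
    """Detect a LoRA checkpoint by the presence of its adapter config file."""
    return os.path.isfile(os.path.join(weights_dir, ADAPTER_CONFIG))
```

get_model could call a check like this and dispatch to the LoRA-aware class only when it returns True.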
capjamesg commented 3 weeks ago

I have tested this implementation and successfully trained a model with LoRA.
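
For anyone reproducing that test, here is a rough sketch of attaching a LoRA adapter to PaliGemma with transformers and peft; the checkpoint id, rank, and target modules are illustrative, not the exact training setup used:

```python
from peft import LoraConfig, get_peft_model
from transformers import PaliGemmaForConditionalGeneration

# Illustrative base checkpoint; requires transformers>=4.41.1 for PaliGemma support.
base = PaliGemmaForConditionalGeneration.from_pretrained("google/paligemma-3b-pt-224")
lora_config = LoraConfig(
    r=8,                                  # low-rank dimension of the adapter
    lora_alpha=16,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Training then proceeds as usual; saving the adapter writes the adapter_config.json file that the detection question above hinges on.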

PawelPeczek-Roboflow commented 2 weeks ago

Fine; as long as we resolve this (https://github.com/roboflow/inference/pull/464#discussion_r1634304370), we are free to merge. I believe that would only take testing CogVLM and probably setting transformers>=4.41.1.

PawelPeczek-Roboflow commented 2 weeks ago

Regarding question 2 from here: get_model(...) internally calls get_model_type(...). It would be best if we could have the API return that information at that level. If that's not feasible, relying on adapter_config.json is OK, but that would probably mean having a single class for the LoRA and non-LoRA versions?
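
A sketch of that single-class idea, attaching the adapter only when adapter_config.json is present (the class name, base checkpoint id, and constructor shape are assumptions, not the actual codebase):

```python
import os

from peft import PeftModel
from transformers import PaliGemmaForConditionalGeneration

class PaliGemmaModel:
    """One class serving both base PaliGemma weights and LoRA fine-tunes."""

    def __init__(self, weights_dir: str, base_model_id: str = "google/paligemma-3b-pt-224"):
        self.model = PaliGemmaForConditionalGeneration.from_pretrained(base_model_id)
        adapter_config = os.path.join(weights_dir, "adapter_config.json")
        if os.path.isfile(adapter_config):
            # LoRA checkpoint: wrap the base model with the trained adapter weights.
            self.model = PeftModel.from_pretrained(self.model, weights_dir)
```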

PawelPeczek-Roboflow commented 1 week ago

@probicheaux - how do we plan to move forward with this?

probicheaux commented 1 week ago

@PawelPeczek-Roboflow sorry, I've been super busy. I just fixed the get_model issue by pushing a new model_conversion param that adds peft to LoRA models. I also tested CogVLM in the new Docker container (verifying transformers==4.41.2) and it works fine.
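
With the model-type lookup reporting a conversion marker, get_model can branch on it instead of probing files on disk; a hypothetical sketch (the field name and registry keys are assumptions, not the real implementation):

```python
from typing import Optional

# Hypothetical registry keyed on (model_type, model_conversion).
MODEL_REGISTRY = {
    ("paligemma", None): "PaliGemma",        # base weights / full fine-tune
    ("paligemma", "peft"): "PaliGemmaLoRA",  # LoRA adapter checkpoint
}

def resolve_class_name(model_type: str, model_conversion: Optional[str]) -> str:
    return MODEL_REGISTRY[(model_type, model_conversion)]
```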