roboflow / multimodal-maestro

streamline the fine-tuning process for multimodal models: PaliGemma, Florence-2, Phi-3.5 Vision
Apache License 2.0

Local LLM #17

Closed. itsPreto closed this 3 months ago.

itsPreto commented 9 months ago

Adding local LLM support.

CLAassistant commented 9 months ago

CLA assistant check
All committers have signed the CLA.

SkalskiP commented 9 months ago

Hi @itsPreto 👋🏻 Thanks for your interest in Maestro. Going local is what we want to do!

itsPreto commented 9 months ago

I used what I'm familiar with, which is the llama.cpp server with ShareGPT4V-7B, but it should work with any other backend/model.

@SkalskiP, you just need to define a custom payload with the parameters you want, then call the prompt_image_local function with that payload and the localhost URL.

You can see this in examples/image.py:


# Custom payload function for the llama.cpp server's /completion endpoint.
# The [img-12] tag in the prompt is matched to the image_data entry with id 12.
def custom_payload_func(image_base64, prompt, system_prompt):
    return {
        "prompt": f"{system_prompt}. USER:[img-12]{prompt}\nASSISTANT:",
        "image_data": [{"data": image_base64, "id": 12}],
        "n_predict": 256,      # maximum number of tokens to generate
        "top_p": 0.5,
        "temperature": 0.2     # llama.cpp expects "temperature", not "temp"
    }

# prompt_image_local converts the image to base64 and sends the request
# to the local server.
response = prompt_image_local(
    marked_image,              # annotated image produced earlier in examples/image.py
    "Find the crowbar",
    "http://localhost:8080/completion",
    custom_payload_func
)
print(response)
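
For anyone reading along, here is a minimal sketch of what a helper like prompt_image_local could do internally, assuming it accepts a numpy image, base64-encodes it, POSTs the backend-specific payload to the given URL, and returns the "content" field that llama.cpp's /completion endpoint produces. The actual implementation in this PR may differ.

import base64
import io

import requests
from PIL import Image


def prompt_image_local(image, prompt, url, payload_func,
                       system_prompt="You are a helpful assistant."):
    # Encode the image as base64 so it can travel inside the JSON payload.
    buffer = io.BytesIO()
    Image.fromarray(image).save(buffer, format="JPEG")
    image_base64 = base64.b64encode(buffer.getvalue()).decode("utf-8")

    # Build the backend-specific payload and POST it to the local server.
    payload = payload_func(image_base64, prompt, system_prompt)
    response = requests.post(url, json=payload, timeout=120)
    response.raise_for_status()

    # llama.cpp's /completion endpoint returns the generated text under
    # "content"; other backends may use a different key.
    return response.json()["content"]

Because the payload function is the only backend-specific piece, swapping in another local server should mostly be a matter of writing a new payload function (and adjusting how the response is parsed).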