huggingface / api-inference-community


Image-to-text InferenceApi discards params if image is bytes #151

Closed jbdel closed 2 years ago

jbdel commented 2 years ago

Hello,

I'm trying the Inference API with an image-to-text model: ViLMedic/rrg_baseline.

This code works well:

from huggingface_hub.inference_api import InferenceApi
inference = InferenceApi(repo_id="ViLMedic/rrg_baseline", token="xxxx")
print(inference(
    "https://i.ibb.co/rp8pXv5/3761aae0-255c0808-86d2121b-88ae172f-b7625d50.jpg",
    params={"generate_kwargs": {"num_beams": 2, "num_return_sequences": 2, "max_length": 60}},
))

I receive two captions: perfect!

If I work with bytes (i.e. a file on my computer):

from huggingface_hub.inference_api import InferenceApi

inference = InferenceApi(repo_id="ViLMedic/rrg_baseline", token="x")
with open("file/on/computer.jpeg", "rb") as f:
    im = f.read()
print(inference(
    params={"generate_kwargs": {"num_beams": 2, "num_return_sequences": 2, "max_length": 60}},
    data=im,
))

then I only get one sentence; it's as if params is discarded in this case. How can I make it work?

Thanks,

Narsil commented 2 years ago

You can probably pass the image as a base64-encoded version of the binary:

import base64

im = base64.b64encode(im).decode("utf-8")
inference(im, params={...})

The reason is that when sending raw bytes, you cannot send parameters along with them (since the payload is binary, it isn't JSON).
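For reference, a minimal end-to-end sketch combining this with your original snippet (assuming the same model, token placeholder, and generate_kwargs as above; untested):

import base64
from huggingface_hub.inference_api import InferenceApi

inference = InferenceApi(repo_id="ViLMedic/rrg_baseline", token="xxxx")

# Read the local image and base64-encode it so it can travel inside the JSON payload
with open("file/on/computer.jpeg", "rb") as f:
    im = base64.b64encode(f.read()).decode("utf-8")

# Since the payload is now JSON (string input plus params), the parameters are kept
print(inference(
    im,
    params={"generate_kwargs": {"num_beams": 2, "num_return_sequences": 2, "max_length": 60}},
))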

@osanseviero maybe for confirmation?

jbdel commented 2 years ago

@Narsil's solution worked.

Thanks for helping,

JB