huggingface / notebooks

Notebooks using the Hugging Face libraries 🤗
Apache License 2.0

Incomplete Output from IDEFICS Inference Code #444

Open Yi-Qi638 opened 10 months ago

Yi-Qi638 commented 10 months ago

Hello, I've encountered unusual behavior while running the inference code from the IDEFICS project. Specifically, I was using inference.py, and I got the following output:

0: User: Describe this image. Assistant: An image of two kittens in grass. User: Describe this image. Assistant:

Interestingly, when I modified the code by removing `.to(device)` from these lines:

From

```python
model = IdeficsForVisionText2Text.from_pretrained(checkpoint, torch_dtype=torch.bfloat16).to(device)
```

to

```python
model = IdeficsForVisionText2Text.from_pretrained(checkpoint, torch_dtype=torch.bfloat16)
```

and from

```python
inputs = processor(prompts, return_tensors="pt").to(device)
```

to

```python
inputs = processor(prompts, return_tensors="pt")
```

I then received complete responses, like:

```
0: User: Describe this image. Assistant: An image of two kittens in grass. User: Describe this image. Assistant: An image of a dog wearing glasses. User: Describe this image. Assistant: An image of a dog wearing glasses. User: Describe this image. Assistant: An image of a dog wearing glasses. User: Describe this image. Assistant: An image of a dog wearing glasses.
```

Could you please help me understand why this change resolves the issue? Any insights or guidance would be greatly appreciated.
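One plausible mechanism for the truncated output (a hypothesis, not a confirmed diagnosis): greedy generation stops the moment the end-of-sequence token wins the argmax, so a tiny numerical difference introduced by device placement or dtype rounding can flip which token scores highest and end generation early. Below is a toy, pure-Python greedy-decoding loop illustrating this; `greedy_decode`, `scores_a`, and `scores_b` are invented names for illustration and have nothing to do with the actual IDEFICS `generate()` implementation.

```python
# Toy greedy decoder: generation halts as soon as EOS wins the argmax.
EOS = 0  # hypothetical end-of-sequence token id

def greedy_decode(score_fn, max_new_tokens=10):
    """score_fn(tokens) -> per-token scores; returns the generated token ids."""
    tokens = []
    for _ in range(max_new_tokens):
        scores = score_fn(tokens)
        next_id = max(range(len(scores)), key=scores.__getitem__)
        if next_id == EOS:
            break  # EOS selected: stop generating immediately
        tokens.append(next_id)
    return tokens

# Two score functions differing only by a tiny perturbation, of the kind
# that low-precision arithmetic or a device change could introduce:
def scores_a(tokens):
    return [0.499, 0.5, 0.3]  # token 1 barely beats EOS -> keeps generating

def scores_b(tokens):
    return [0.501, 0.5, 0.3]  # EOS barely wins -> output is cut short
```

With `scores_a` the loop produces tokens up to the limit; with `scores_b` it returns an empty continuation, mirroring how an answer can be truncated after the first turn.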

hugomalard commented 10 months ago

Hello, I have the same issue. Did you solve it? I noticed that loading the model in float16 instead of bfloat16 produces complete output; however, loading it in float32 reproduces the same issue. Does anyone know a way to make it work using bfloat16?
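For context on why float16 and bfloat16 can behave differently here: both are 16-bit, but float16 keeps 10 mantissa bits (more precision, narrow range) while bfloat16 keeps only 7 mantissa bits with float32's 8-bit exponent (wide range, coarse precision). A minimal pure-Python sketch, no torch needed; note that `to_bfloat16` below uses simple truncation of the low 16 bits as a stand-in for the real round-to-nearest conversion:

```python
import struct

def to_bfloat16(x: float) -> float:
    """Approximate bfloat16 by truncating the low 16 bits of the float32 encoding.

    bfloat16 keeps the sign bit, the full 8-bit exponent, and 7 mantissa bits.
    (Real conversions round to nearest; truncation is a simplification.)
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

def to_float16(x: float) -> float:
    """Round-trip through IEEE half precision (10 mantissa bits) via struct's 'e' format."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# bfloat16's 7 mantissa bits cannot represent a change smaller than 2**-7
# near 1.0, so small perturbations vanish entirely:
#   to_bfloat16(1.003) -> 1.0
# while float16's 10 mantissa bits preserve most of it:
#   to_float16(1.003)  -> 1.0029296875
```

This precision gap is consistent with the observation above: logits computed in bfloat16 are coarser than in float16, which can nudge the generation down a different path. It does not explain why float32 also misbehaves, though, so dtype precision alone is likely not the whole story.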