Open hypernovas opened 3 months ago
Haven't seen it before, looks like it's coming from the transformers library. Can you share the image/prompt so I can try to reproduce?
FYI we're also very close to shipping llama.cpp-based inference code that will run a lot faster on CPU than the PyTorch implementation. Development on that is happening in the moondream-ggml branch here: https://github.com/vikhyat/moondream/tree/moondream-ggml
Sure, thanks!
from PIL import Image

def resize_image(img, max_dimension=300):
    # Calculate the ratio to resize by
    ratio = max_dimension / max(img.size)
    new_size = (int(img.size[0] * ratio), int(img.size[1] * ratio))
    # Resize the image using LANCZOS resampling, recommended for downsampling
    return img.resize(new_size, Image.Resampling.LANCZOS)

def run_model(img_path, prompt, scale=4.2):
    # Open, convert to RGB, and resize the image
    with Image.open(img_path) as img:
        rgb_img = img.convert('RGB')  # Convert to RGB
        resized_img = resize_image(rgb_img)
    # Encode and get answer from the model
    # (moondream and tokenizer are loaded separately)
    answer = moondream.answer_question(
        moondream.encode_image(resized_img), prompt, tokenizer
    )
    return answer

# Usage example
img_path = "./test.jpg"
prompt = "Describe any defects"
answer = run_model(img_path, prompt)
display(answer)
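For anyone trying to reproduce this, the size arithmetic in resize_image can be checked on its own without loading an image. The following is only a sketch: target_size is a hypothetical helper that is not part of the snippet above, and the clamp to 1 pixel is an added assumption to guard against very elongated inputs, where int() truncation would otherwise give a zero-sized side.

```python
def target_size(width, height, max_dimension=300):
    """Return (new_width, new_height) with the longer side scaled to max_dimension.

    Hypothetical helper mirroring the ratio logic in resize_image above.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    ratio = max_dimension / max(width, height)
    # Assumption: clamp each side to at least 1 pixel, since int() truncates
    # and a very thin image could otherwise end up with a side of 0.
    return (max(1, int(width * ratio)), max(1, int(height * ratio)))


print(target_size(1200, 800))  # longer side scaled down to 300
print(target_size(4800, 4))    # thin image: short side clamped to 1
print(target_size(150, 100))   # smaller images are scaled up, as in the original
```

Note that, like the original function, this scales small images up to max_dimension as well as scaling large ones down.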
The lib versions:
!pip install accelerate==0.32.1 huggingface-hub==0.24.0 Pillow==10.4.0 torch==2.3.1 torchvision==0.18.1 transformers==4.42.2 einops==0.8.0 gradio==4.38.1
!pip install flash-attn==2.6.2 datasets==2.20.0
Hi Vik,
Thanks for all the help! It works perfectly with the cuda option. Wondering if you have seen this before while using cpu?
The model is loaded by:
Error: