Paligemma doesn't use a chat_template. Just pass the text string as is to generate.
https://github.com/Blaizzy/mlx-vlm/issues/33#issuecomment-2135392535
Thanks for the quick reply. I updated the code as suggested.
import mlx.core as mx
from mlx_vlm import load, generate

model_path = "google/paligemma-3b-mix-448"
model, processor = load(model_path)

output = generate(model, processor, "/Users/namanjain/app-data/local-recall/screenshots/1717766288971.png", prompt="describe this screenshot")
print(output)
It takes forever to generate any output; this is not the case with much larger models on my M2 chip. Also, the max_tokens param is not configurable, and somehow the model generates very few tokens.
Could you share your setup specs?
Also, the max_tokens param is not configurable, and somehow the model generates very few tokens.
It is configurable. By default it's set to 100, but you can increase it by passing the max_tokens argument to the generate function.
Here's my generate call:
output = generate(model, processor, "/Users/namanjain/app-data/local-recall/screenshots/1717766288971.png", prompt="elaborately describe this screenshot. what app or website url is this on?", max_tokens=500)
But the model's response is a single word.
My specs: M2 Air with 8 GB RAM. However, the GPU isn't fully utilised during inference, and there's enough capacity to run the model.
A few things to note about Paligemma:
Recommended reading: https://huggingface.co/blog/paligemma
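For context on the one-word replies above: the blog post explains that the PaliGemma mix checkpoints are trained on task prefixes rather than free-form instructions, so open-ended prompts tend to produce terse output. Here is a minimal sketch of prefix-style prompting, reusing the generate signature from the snippets above (the image path and max_tokens value are placeholders):

from mlx_vlm import load, generate

model, processor = load("google/paligemma-3b-mix-448")

# The mix checkpoints respond to task prefixes, for example:
#   "caption en"            - short caption
#   "describe en"           - longer description
#   "answer en <question>"  - visual question answering
#   "detect <object>"       - bounding boxes as <locXXXX> tokens
for prompt in ["caption en", "describe en", "detect cat"]:
    output = generate(model, processor, "image.png", prompt=prompt, max_tokens=200)
    print(prompt, "->", output)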
@Blaizzy, were you able to get these models to output bounding boxes? If I use something like detect cat as the prompt for either of them (on a sample image with two cats), it either gives
Yes, the model works well for captions and counting objects.
However, there is indeed still a bug when it comes to object detection and segmentation.
Sometimes it works and other times it doesn't; I haven't managed to pinpoint the issue.
Segmentation seems to work better with lower temperatures.
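For anyone hitting this later: per the blog post linked above, a detect prompt emits four <locXXXX> tokens per box (ymin, xmin, ymax, xmax on a 0-1023 grid) followed by the label, with detections separated by ";". A rough parsing sketch, assuming that output format (the helper name and the 448x448 image size are illustrative, not part of mlx-vlm):

import re

def parse_detections(text, img_w, img_h):
    """Parse '<loc....><loc....><loc....><loc....> label ; ...' into pixel-space boxes."""
    boxes = []
    # Four consecutive location tokens, then the label up to the next ';' or '<'.
    pattern = r"((?:<loc\d{4}>){4})\s*([^;<]+)"
    for locs, label in re.findall(pattern, text):
        ymin, xmin, ymax, xmax = [int(v) for v in re.findall(r"<loc(\d{4})>", locs)]
        # Coordinates are on a 0-1023 grid; rescale to pixel space.
        boxes.append({
            "label": label.strip(),
            "xmin": xmin / 1024 * img_w,
            "ymin": ymin / 1024 * img_h,
            "xmax": xmax / 1024 * img_w,
            "ymax": ymax / 1024 * img_h,
        })
    return boxes

print(parse_detections("<loc0052><loc0121><loc0997><loc0530> cat ; <loc0060><loc0521><loc0998><loc0961> cat", 448, 448))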
I am trying to run the following code, but it is giving an error. Please assist!