google-research / big_vision

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
Apache License 2.0

[QUESTION] How to perform inference on trained model? #106

Open sleepingcat4 opened 2 months ago

sleepingcat4 commented 2 months ago

I have followed a tutorial posted by Roboflow. While it was clear and helpful for learning PaliGemma, I have been struggling to figure out how to run inference on the trained model (.npz file). Can someone provide some pointers?

https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/how-to-finetune-paligemma-on-detection-dataset.ipynb?ref=blog.roboflow.com#scrollTo=TGDFTYVnY4zn

Naziyashaik09 commented 1 month ago

@sleepingcat4 I'm also looking for the same thing. Have you found anything on this? Please let me know if you have.

sleepingcat4 commented 1 month ago

@Naziyashaik09 I've actually solved this problem.

Naziyashaik09 commented 1 month ago

@sleepingcat4 Can you please share the code or the notebook link?

akolesnikoff commented 1 month ago

The main colab has examples of how to tune the model and make predictions (aka inference): https://colab.research.google.com/github/google-research/big_vision/blob/main/big_vision/configs/proj/paligemma/finetune_paligemma.ipynb.

Is there something specific that is unclear?

gadhane commented 1 month ago

> @Naziyashaik09 I've actually solved this problem.

Can you please share how you addressed the issue? I tried the one provided in the original code, but it is not working. We would appreciate it if you could give some ideas or, if possible, share the code. Thank you.

sleepingcat4 commented 1 month ago

It's not difficult. Change the model and tokenizer paths at the beginning to your trained model and tokenizer paths, and you're done. If you want to keep just the inference code, that's slightly trickier: PaliGemma uses JAX sharding, which needs to be kept unless you can load the entire model on your GPU or in Colab. @gadhane @Naziyashaik09
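Roughly, a minimal sketch of the loading side, adapted from the setup cells of the official finetune_paligemma.ipynb colab linked below; `MODEL_PATH` and `TOKENIZER_PATH` are placeholders for your own fine-tuned checkpoint and tokenizer, and the model config assumes the PaliGemma-3B/224 setup used in that notebook:

```python
import functools

import jax
import ml_collections
import sentencepiece

# Assumes the big_vision repo is cloned and on the Python path,
# as in the official colab.
from big_vision.models.proj.paligemma import paligemma
from big_vision.trainers.proj.paligemma import predict_fns

# Placeholders: point these at your fine-tuned checkpoint (.npz) and tokenizer.
MODEL_PATH = "./my_finetuned_paligemma.npz"
TOKENIZER_PATH = "./paligemma_tokenizer.model"

# Same model config as the fine-tuning colab (PaliGemma-3B, 224px images).
model_config = ml_collections.FrozenConfigDict({
    "llm": {"vocab_size": 257_152},
    "img": {"variant": "So400m/14", "pool_type": "none", "scan": True,
            "dtype_mm": "float16"},
})
model = paligemma.Model(**model_config)
tokenizer = sentencepiece.SentencePieceProcessor(TOKENIZER_PATH)

# Load the fine-tuned parameters instead of the base checkpoint.
params = paligemma.load(None, MODEL_PATH, model_config)

# Build the decode (inference) function the same way the notebook does.
decode_fn = predict_fns.get_all(model)["decode"]
decode = functools.partial(decode_fn, devices=jax.devices(),
                           eos_token=tokenizer.eos_id())
```

From there, preprocessing the image and prompt and calling `decode` works the same as in the notebook's prediction cells, and the sharding setup from the notebook still applies if the params don't fit on a single device.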

sleepingcat4 commented 1 month ago

@akolesnikoff actually no. Unfortunately, out of the box it is not apparent that we just need to change the model and tokenizer paths LOL!

EtremelyKeep commented 1 month ago

> The main colab has examples of how to tune the model and make predictions (aka inference): https://colab.research.google.com/github/google-research/big_vision/blob/main/big_vision/configs/proj/paligemma/finetune_paligemma.ipynb.
>
> Is there something specific that is unclear?

The fine-tuning dataset for PaliGemma is missing, and the longcap100 dataset is now inaccessible on Kaggle.