johko / computer-vision-course

This repo is the home base of a community-driven course on Computer Vision with Neural Networks. Feel free to join us on the Hugging Face Discord: hf.co/join/discord
MIT License
365 stars, 121 forks

add notebook examples #294

Open mohammad-gh009 opened 1 month ago

mohammad-gh009 commented 1 month ago

https://github.com/johko/computer-vision-course/blob/main/chapters/en/unit4/multimodal-models/transfer_learning.mdx

| Task | Description | Model | Notebook |
|------|-------------|-------|----------|
| Fine-tune CLIP | Fine-tuning CLIP on a custom dataset | openai/clip-vit-base-patch32 | CLIP notebook |
| VQA | Answering a question in natural language based on an image | dandelin/vilt-b32-finetuned-vqa | VQA notebook |
| Image-to-Text | Describing an image in natural language | Salesforce/blip-image-captioning-large | Text 2 Image notebook |
| Open-set object detection | Detect objects by natural language input | Grounding DINO | Grounding DINO notebook |
| Assistant (GPT-4V like) | Instruction tuning in the multimodal field | LLaVA | LLaVA notebook |
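
For the two rows above that name a concrete checkpoint for inference (VQA and Image-to-Text), a notebook cell could look roughly like the sketch below. This is only an illustration, not taken from the linked notebooks: it assumes the `transformers` and `Pillow` packages are installed, that the checkpoints load through the `visual-question-answering` and `image-to-text` pipeline tasks, and the names `NotebookExample`, `EXAMPLES`, `load_example`, and `demo` are hypothetical helpers invented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NotebookExample:
    """One row of the table: a task, its pipeline task name, and a checkpoint."""
    task: str
    pipeline_task: str  # assumption: the transformers pipeline task used for this row
    checkpoint: str     # model ID as listed in the table


# Only the rows with a concrete checkpoint for inference are included.
EXAMPLES = {
    "vqa": NotebookExample(
        task="VQA",
        pipeline_task="visual-question-answering",
        checkpoint="dandelin/vilt-b32-finetuned-vqa",
    ),
    "image-to-text": NotebookExample(
        task="Image-to-Text",
        pipeline_task="image-to-text",
        checkpoint="Salesforce/blip-image-captioning-large",
    ),
}


def load_example(name: str):
    """Build a transformers pipeline for one table row (downloads the model)."""
    # Imported lazily so the table above can be inspected without transformers.
    from transformers import pipeline

    example = EXAMPLES[name]
    return pipeline(example.pipeline_task, model=example.checkpoint)


def demo(image_path: str, question: str) -> None:
    """Run both examples on a local image; image_path is a hypothetical file."""
    from PIL import Image

    image = Image.open(image_path)

    vqa = load_example("vqa")
    # Returns a list of {"score": ..., "answer": ...} dicts.
    print(vqa(image=image, question=question, top_k=1))

    captioner = load_example("image-to-text")
    # Returns a list of {"generated_text": ...} dicts.
    print(captioner(image))
```

Calling `demo("example.jpg", "What is in the picture?")` would download both checkpoints on first use. Pipeline task names can change between `transformers` releases, so treat them as something to verify against the installed version.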