johko / computer-vision-course

This repo is the home base of a community-driven course on Computer Vision with Neural Networks. Feel free to join us on the Hugging Face Discord: hf.co/join/discord
MIT License

Unit 4: Fusion Text and Vision - Tasks and Models for Image and Text. #151

Closed: SuryaKrishna02 closed this 6 months ago

SuryaKrishna02 commented 6 months ago

Hey everyone!

This PR adds Part 1 of the second section, on Fusion of Text and Vision, for Unit 4: Multimodal Models. It introduces the multimodal tasks and models involving image and text. Related to issue: https://github.com/johko/computer-vision-course/issues/54

Best, Fusion of Text and Vision Team.

snehilsanyal commented 6 months ago

@SuryaKrishna02 amazing read, nice introduction and well written 🤗 Waiting for the next sections.

SuryaKrishna02 commented 6 months ago

@merveenoyan Thanks for your comments. I have made those changes and completed the rest of the section. Looking forward to your review.

merveenoyan commented 6 months ago

@SuryaKrishna02 can you fix the merge conflicts so we can merge?

SuryaKrishna02 commented 6 months ago

@merveenoyan Fixed the merge conflicts.