johko / computer-vision-course

This repo is the home base of a community-driven course on Computer Vision with Neural Networks. Feel free to join us on the Hugging Face Discord: hf.co/join/discord
MIT License
372 stars, 123 forks

Unit 4, Introduction: Fusion of Text and Vision #126

Closed snehilsanyal closed 6 months ago

snehilsanyal commented 6 months ago

Hey everyone 🤗

This PR adds the Introduction chapter on Fusion of Text and Vision for Unit 4: Multimodal Models. Related to Issue: #54
Already reviewed by: @suryakrishna02, @charchit7

Best, Fusion of Text and Vision Team.

charchit7 commented 6 months ago

One last small edit: add the LAION dataset: https://laion.ai/blog/laion-5b/ This was one of the major contributions that led to Stable Diffusion.

snehilsanyal commented 6 months ago

One last small edit: add the LAION dataset: https://laion.ai/blog/laion-5b/ This was one of the major contributions that led to Stable Diffusion.

Sure @charchit7, I will add this. I also came across some more models like Kosmos-2. It might be good to have a demo/Space for this.

charchit7 commented 6 months ago

Yup, @snehilsanyal nice!

ratan commented 6 months ago

Looking good, @snehilsanyal.

snehilsanyal commented 6 months ago

Super cool and comprehensive. I only left formatting suggestions!

Done @merveenoyan 🤗

snehilsanyal commented 6 months ago

I think this is almost ready to merge!

Done @merveenoyan 🤗 I removed the previous example and added a simpler one 😄