gaasher / I-JEPA

Implementation of I-JEPA from "Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture"
MIT License
249 stars · 24 forks

how much resource do you need to train I-JEPA from scratch? #12

Closed Uljibuh closed 5 months ago

Uljibuh commented 5 months ago

Thank you for your implementation — your code is clean and easy to read.

I am wondering how much data and what GPU resources are needed to train a small but accurate I-JEPA?

gaasher commented 5 months ago

I'd need more information on the desired encoder size and the number of images in your dataset. Generally speaking, transformer-based methods like I-JEPA are data-intensive, so they require a lot of resources. Furthermore, I-JEPA relies on a self-supervised pretraining stage, which makes it even more data- and compute-hungry.
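
Since the answer hinges on encoder size, a quick way to gauge resource needs is to estimate the parameter count of the ViT encoder you have in mind. The sketch below is a rough back-of-the-envelope estimator, not code from this repo; the function name and defaults (patch size 16, 224×224 images, MLP ratio 4) are assumptions, and it ignores small terms like the class token and final LayerNorm.

```python
def vit_param_count(depth: int, dim: int, mlp_ratio: int = 4,
                    patch: int = 16, img: int = 224, in_ch: int = 3) -> int:
    """Rough parameter count for a plain ViT encoder (hypothetical helper)."""
    # Patch embedding: a patch x patch conv projecting in_ch -> dim, plus bias.
    embed = patch * patch * in_ch * dim + dim
    # Per transformer block:
    #   attention = QKV projection (3*dim*dim) + output projection (dim*dim) + biases
    attn = 4 * dim * dim + 4 * dim
    #   MLP = two linear layers dim <-> dim*mlp_ratio, with biases
    hidden = dim * mlp_ratio
    mlp = dim * hidden + hidden + hidden * dim + dim
    #   two LayerNorms, each with weight + bias of size dim
    ln = 2 * 2 * dim
    blocks = depth * (attn + mlp + ln)
    # Learned positional embeddings, one per patch.
    pos = (img // patch) ** 2 * dim
    return embed + blocks + pos

# ViT-S (depth=12, dim=384) lands around 22M parameters,
# ViT-B (depth=12, dim=768) around 86M — the usual published figures.
print(f"ViT-S ~{vit_param_count(12, 384) / 1e6:.1f}M params")
print(f"ViT-B ~{vit_param_count(12, 768) / 1e6:.1f}M params")
```

As a rough rule of thumb, a ViT-S-scale encoder is about the smallest that self-supervised ViT methods are typically reported with; anything larger multiplies both the GPU memory and the amount of unlabeled data you'd want for pretraining.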