gaasher / I-JEPA

Implementation of I-JEPA from "Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture"
MIT License

Problems with downstream tasks #10

Closed · hendredlorentz closed this issue 4 months ago

hendredlorentz commented 5 months ago

Hello, I am very inspired by your work. Since this is a self-supervised model, it is highly relevant to my work on human activity recognition. I adapted the model to a human activity recognition dataset, but when I trained downstream tasks with a small amount of labeled data, the results were very poor. Can you offer any suggestions? I am also wondering whether the encoder can really receive gradients through backpropagation in this architecture. Since the reconstruction happens at the feature level, it feels as if only the decoder is being trained. What is the relationship between the encoder and the decoder?
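
For anyone hitting the same downstream problem: with very little labeled data, a common evaluation is a linear probe, i.e. freezing the pretrained encoder and training only a classification head. Below is a minimal sketch under the assumption that the encoder maps a batch to `(B, N, D)` token embeddings; `build_linear_probe` and the `nn.Identity` stand-in are illustrative, not this repo's actual API.

```python
import torch
import torch.nn as nn

def build_linear_probe(encoder: nn.Module, embed_dim: int, num_classes: int) -> nn.Module:
    """Freeze a pretrained encoder and train only a linear head on top."""
    for p in encoder.parameters():
        p.requires_grad = False
    encoder.eval()

    class Probe(nn.Module):
        def __init__(self):
            super().__init__()
            self.encoder = encoder
            self.head = nn.Linear(embed_dim, num_classes)

        def forward(self, x):
            with torch.no_grad():
                tokens = self.encoder(x)   # assumed shape: (B, N, D)
            pooled = tokens.mean(dim=1)    # average-pool tokens -> (B, D)
            return self.head(pooled)

    return Probe()

# Toy usage with an identity "encoder" on random (B, N, D) inputs:
probe = build_linear_probe(nn.Identity(), embed_dim=192, num_classes=10)
logits = probe(torch.randn(4, 64, 192))   # -> (4, 10)
```

Training only the head keeps a handful of labels from overfitting the whole network; if even the probe performs poorly, the pretrained features themselves are likely the bottleneck rather than the downstream setup.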

gaasher commented 4 months ago

I would use more data: this method is very data-intensive since it relies on self-supervised learning. As for your second question, the encoder and decoder are trained together, and gradients backpropagate through both.
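
To make the gradient flow concrete, here is a minimal sketch of one I-JEPA-style training step. The names `context_encoder` and `predictor` and the toy linear layers are placeholders, not this repo's actual classes. The point: the feature-level loss backpropagates through both the predictor and the context encoder, while the target encoder that produces the regression targets receives no gradients and is updated by an exponential moving average (EMA).

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins; in this repo these would be ViT-based modules.
context_encoder = nn.Linear(16, 16)
predictor = nn.Linear(16, 16)

# The target encoder is a momentum copy of the context encoder:
# it never receives gradients.
target_encoder = copy.deepcopy(context_encoder)
for p in target_encoder.parameters():
    p.requires_grad = False

optimizer = torch.optim.AdamW(
    list(context_encoder.parameters()) + list(predictor.parameters()), lr=1e-3
)

def train_step(x_context, x_target, momentum=0.996):
    # Targets are computed without gradients, so the loss cannot
    # backpropagate into the target encoder.
    with torch.no_grad():
        target_feats = target_encoder(x_target)

    pred_feats = predictor(context_encoder(x_context))
    loss = F.mse_loss(pred_feats, target_feats)  # feature-level loss

    optimizer.zero_grad()
    loss.backward()   # gradients reach BOTH the predictor and the context encoder
    optimizer.step()

    # EMA update of the target encoder (no gradients involved).
    with torch.no_grad():
        for p_t, p_c in zip(target_encoder.parameters(), context_encoder.parameters()):
            p_t.mul_(momentum).add_(p_c, alpha=1 - momentum)

    return loss.item()

print(train_step(torch.randn(8, 16), torch.randn(8, 16)))
```

So "the decoder is trained all the time" is true, but the context encoder is trained right along with it; only the target branch sits outside the gradient path.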