Closed ben2002chou closed 7 months ago
hi there,
thanks so much for the question.
I can't identify where the contrastive loss code is
You most likely cannot find it because, for contrastive learning, the loss is defined inside the model rather than in the training pipeline. This was an implementation choice.
and how the positive and negative samples are defined
Audio and video (A-V) are naturally paired data. For a batch of, say, 64 samples, you have 64 audios and 64 videos. For each audio there is only 1 positive A-V pair (the video from the same sample), and the other 63 videos in the batch are negatives.
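To make the pairing concrete, here is a minimal sketch of an in-batch contrastive (InfoNCE-style) loss. It is written in plain Python for readability and is not the repository's actual code: the function name `in_batch_contrastive_loss`, the `temperature` value, the use of cosine similarity, and the audio-to-video direction are all assumptions made for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]


def in_batch_contrastive_loss(audio_emb, video_emb, temperature=0.05):
    """Audio-to-video contrastive loss over one batch (hypothetical helper).

    audio_emb[i] and video_emb[i] come from the same sample, so they form
    the single positive pair for row i. Every video_emb[j] with j != i is
    treated as a negative for audio_emb[i].
    """
    if len(audio_emb) != len(video_emb):
        raise ValueError("audio and video batches must have the same size")

    audio = [l2_normalize(a) for a in audio_emb]
    video = [l2_normalize(v) for v in video_emb]
    n = len(audio)

    total = 0.0
    for i in range(n):
        # Similarity of audio i to every video in the batch.
        logits = [dot(audio[i], video[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over the whole row.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Cross-entropy with the diagonal entry (the positive) as the target.
        total += log_denominator - logits[i]
    return total / n
```

With a batch of 64, each row of the similarity matrix has 1 positive (the diagonal entry) and 63 negatives, which matches the description above. The loss is small when each audio is most similar to its own video and grows when it is closer to another sample's video.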
-Yuan
Thank you so much for your answer!
I have a question regarding the code and the paper. I can't identify where the contrastive loss code is or how the positive and negative samples are defined. Reading your SSAST paper has given me a vague idea of how the contrastive loss might have been implemented (seemingly by matching masked patches), but I would like to look further into the code and understand your implementation. Could you give me some more explanation of where and how this is implemented?