Closed lihao0374 closed 4 years ago
As in the paper, it helps improve from 63.6% to 65.9% with 100 epochs of training with a linear projection head.
No, those hyper-parameters are NOT tuned in the InfoMin paper. The color jittering was tuned in SimCLR; the crop_size of 0.2 was used in MoCo, where they say they followed InstDis, and I don't know whether they have tuned it.
So in short, I just plot what is happening behind these parameter choices provided by previous papers.
I would close it for now, but please feel free to reopen it if you have further questions.
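For readers unfamiliar with the crop parameter, here is a minimal sketch, assuming that "crop_size of 0.2" means the lower bound on the crop's area as a fraction of the image area in a random resized crop (like the `scale` argument of torchvision's `RandomResizedCrop`). The function name `sample_crop_box`, the aspect-ratio range of 3/4 to 4/3, and the whole-image fallback are illustrative assumptions and are not taken from the InfoMin, MoCo, or InstDis code.

```python
import math
import random


def sample_crop_box(width, height, min_scale=0.2, max_scale=1.0,
                    ratio=(3 / 4, 4 / 3), rng=None, max_tries=10):
    """Sample (left, top, crop_w, crop_h) for a random resized crop.

    min_scale is the assumed meaning of crop_size: the smallest allowed
    crop area, as a fraction of the full image area.
    """
    rng = rng or random.Random()
    area = width * height
    log_lo, log_hi = math.log(ratio[0]), math.log(ratio[1])
    for _ in range(max_tries):
        # Area fraction is drawn uniformly from [min_scale, max_scale].
        target_area = area * rng.uniform(min_scale, max_scale)
        # Aspect ratio is log-uniform, so r and 1/r are equally likely.
        aspect = math.exp(rng.uniform(log_lo, log_hi))
        crop_w = round(math.sqrt(target_area * aspect))
        crop_h = round(math.sqrt(target_area / aspect))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            left = rng.randint(0, width - crop_w)
            top = rng.randint(0, height - crop_h)
            return left, top, crop_w, crop_h
    # Simplified fallback (an assumption): use the whole image.
    return 0, 0, width, height
```

With `min_scale=0.2`, every sampled crop covers roughly 20% to 100% of the image, so a smaller value produces more aggressive crops and views that share less content.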
Thanks for your reply ~
No worry at all :) It’s actually a reasonable question!
Jigsaw improves it from 63.6 to 65.9, but I haven’t ablated it in the full model.
Sorry to bother you again, could you tell me how you got the optimal parameters N=2 and M=10 in RA?
Please refer to here: #3
Many thanks!
Good job! I was wondering how much gain Jigsaw can bring?