loeweX / Greedy_InfoMax

Code for the paper: Putting An End to End-to-End: Gradient-Isolated Learning of Representations
https://arxiv.org/abs/1905.11786
MIT License

fixes #6 #7

Closed · kmalhotra30 closed this 4 years ago

kmalhotra30 commented 4 years ago

Fix for the sampling issue (issue #6)

loeweX commented 4 years ago

Thanks for the pull request!

I was wondering whether you have compared the run-time and downstream performance of your method against my code.

Generally speaking, the sampling strategy is quite a bottleneck for contrastive learning. You have to trade off the "right" sampling strategy, which comes at the cost of longer training times (see, for example, args.sampling_method 0), against a more approximate strategy that may introduce bias but shortens training substantially (e.g. args.sampling_method 1). In my experiments, an approximate sampling strategy never hurt performance, so I set it as the default.
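To make that trade-off concrete, here is a minimal sketch of the two regimes. The helper `sample_negatives`, its signature, and the (batch, time, dim) shapes are hypothetical illustrations under stated assumptions, not the repository's actual code:

```python
import torch

def sample_negatives(z, n_negatives, exact=False):
    """Draw negatives for an InfoNCE-style loss (hypothetical helper,
    not the actual Greedy_InfoMax code).

    z: (batch, time, dim) tensor of encodings.
    """
    b, t, d = z.shape
    pool = z.reshape(b * t, d)  # flatten batch and time into one pool

    if not exact:
        # Approximate strategy (in the spirit of sampling_method 1):
        # one vectorized draw for all anchors; an anchor may occasionally
        # draw itself, which introduces a small bias but is fast.
        idx = torch.randint(0, b * t, (b * t, n_negatives))
        return pool[idx].reshape(b, t, n_negatives, d)

    # "Right" strategy (in the spirit of sampling_method 0): per-anchor
    # draws that exclude the anchor's own index; unbiased but slower.
    negs = z.new_empty(b * t, n_negatives, d)
    for anchor in range(b * t):
        idx = torch.randint(0, b * t - 1, (n_negatives,))
        idx[idx >= anchor] += 1  # shift indices so the anchor is never drawn
        negs[anchor] = pool[idx]
    return negs.reshape(b, t, n_negatives, d)
```

The approximate branch is a single vectorized gather, while the exact branch loops over anchors, which is where the longer training time comes from.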

I didn't experiment with args.sampling_method 2 (i.e. sampling within the same sequence) that much, so I cannot say what exactly the trade-off is in this setting. But given my experience with methods 0 and 1, I am a bit hesitant to move the default setting towards the "right" sampling strategy with longer training times. If you have results proving my intuition wrong, I'd be happy to see them and change the code.
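For concreteness, a similar sketch of what sampling within the same sequence (in the spirit of sampling_method 2) could look like, with and without excluding the anchor's own time step. Again, the helper and shapes are hypothetical, not the code under discussion:

```python
import torch

def sample_negatives_within_sequence(z, n_negatives, exclude_positive=True):
    """Draw negatives from the *same* sequence as each anchor
    (hypothetical helper illustrating within-sequence sampling).

    z: (batch, time, dim) tensor of encodings.
    """
    b, t, d = z.shape
    if not exclude_positive:
        # Approximate: any time step may be drawn, including the anchor's own.
        idx = torch.randint(0, t, (b, t, n_negatives))
    else:
        # Exact: draw from t - 1 steps, then shift indices past the anchor
        # so its own time step is never selected.
        idx = torch.randint(0, t - 1, (b, t, n_negatives))
        anchors = torch.arange(t).view(1, t, 1)
        idx = idx + (idx >= anchors).long()
    batch_idx = torch.arange(b).view(b, 1, 1)
    return z[batch_idx, idx]  # (batch, time, n_negatives, dim)
```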

kmalhotra30 commented 4 years ago

Hello,

No, I did not compare the runtime and downstream performance, since the suggested changes are trivial and such a comparison would require compute resources. I would expect the runtime to be more or less similar.

I do understand the trade-off between sampling strategies 0 and 1, but I don't think it is comparable to the trade-off between my method and yours for sampling option 2.

I am working on contrastive learning as well and have taken inspiration from your codebase. Since I found the current sampling code slightly off, I thought I would send a pull request with these small changes.