kekmodel / FixMatch-pytorch

Unofficial PyTorch implementation of "FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence"

Purpose of Interleave #36

Closed · feras-oughali closed 3 years ago

feras-oughali commented 3 years ago

Just wanted to know the intuition behind the interleave and deinterleave operations. How does this help?

Bekci commented 3 years ago

In this comment the question is forwarded to an issue in the original repository of the paper. There it is explained that the purpose of the operation is to give every GPU the same mix of labeled and unlabeled examples. Without interleaving, the batch norm layers on each GPU would operate on different distributions, leading to inconsistent batch statistics (moments).
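For reference, interleave and de_interleave in this repo are essentially reshape-and-transpose permutations along the batch dimension. Below is a minimal sketch of the idea; the toy batch sizes, the mu value, and the sum used as a stand-in for model(inputs) are made up for illustration:

```python
import torch


def interleave(x, size):
    # Reshape [N, ...] -> [N/size, size, ...], swap the two leading dims,
    # then flatten again, so labeled and unlabeled samples end up mixed along dim 0.
    s = list(x.shape)
    return x.reshape([-1, size] + s[1:]).transpose(0, 1).reshape([-1] + s[1:])


def de_interleave(x, size):
    # Inverse permutation: restores the original labeled/unlabeled ordering.
    s = list(x.shape)
    return x.reshape([size, -1] + s[1:]).transpose(0, 1).reshape([-1] + s[1:])


# Toy shapes: batch_size=4 labeled images, mu=2 unlabeled ratio, so the
# combined batch has 4 * (2*mu + 1) = 20 samples.
batch_size, mu = 4, 2
inputs_x = torch.zeros(batch_size, 3, 32, 32)            # labeled
inputs_u_w = torch.ones(batch_size * mu, 3, 32, 32)      # weakly augmented unlabeled
inputs_u_s = 2 * torch.ones(batch_size * mu, 3, 32, 32)  # strongly augmented unlabeled

inputs = interleave(torch.cat((inputs_x, inputs_u_w, inputs_u_s)), 2 * mu + 1)
# If this batch were now split into per-GPU chunks (as DataParallel does),
# every chunk would contain a mix of labeled and unlabeled samples, so each
# replica's BatchNorm would see a similar distribution.

logits = inputs.sum(dim=(1, 2, 3))           # stand-in for model(inputs)
logits = de_interleave(logits, 2 * mu + 1)   # undo the permutation
logits_x = logits[:batch_size]               # labeled logits, back in original order
logits_u_w, logits_u_s = logits[batch_size:].chunk(2)
```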

feras-oughali commented 3 years ago

Thanks

Pexure commented 2 years ago

I can see the point of using interleave for multi-GPU training. But here, since no DataParallel is involved, the input is not scattered across GPUs in the forward pass. As for DistributedDataParallel, each process's batch is already assigned by the DistributedSampler. So interleaving labeled+unlabeled on a single GPU seems redundant?
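To illustrate the single-GPU case being asked about, a quick self-contained check (same reshape/transpose helpers as in the sketch above; the toy sizes are arbitrary):

```python
import torch


def interleave(x, size):
    s = list(x.shape)
    return x.reshape([-1, size] + s[1:]).transpose(0, 1).reshape([-1] + s[1:])


def de_interleave(x, size):
    s = list(x.shape)
    return x.reshape([size, -1] + s[1:]).transpose(0, 1).reshape([-1] + s[1:])


# The round trip is the identity permutation, so per-sample outputs are
# unaffected by interleaving itself.
x = torch.arange(20.0)   # stands for a combined batch of 20 samples
size = 5                 # 2*mu + 1 with mu = 2
assert torch.equal(de_interleave(interleave(x, size), size), x)
# On a single GPU the whole permuted batch is normalized together by BatchNorm,
# so the batch statistics are the same with or without interleaving; the
# reordering only matters once the batch is split across replicas.
```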