Hello, I thought I clarified the question in another issue you posted earlier. Could you specify which part you are still confused about?
The only reason we employed RandomSampler is to make the dataloaders for the unlabelled and labelled data the same length.
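As a minimal sketch of the idea (the toy datasets and sizes below are illustrative, not the repo's actual values): sampling with replacement and a fixed `num_samples` makes both loaders yield the same number of batches per epoch, no matter how many images each dataset contains.

```python
import torch
from torch.utils.data import DataLoader, RandomSampler, TensorDataset

batch_size = 8
num_samples = batch_size * 200  # every epoch is exactly 200 batches

# Toy stand-ins for the labelled and unlabelled datasets (different sizes).
train_l_dataset = TensorDataset(torch.randn(500, 3))
train_u_dataset = TensorDataset(torch.randn(20000, 3))

def make_loader(dataset):
    # Sampling with replacement draws exactly num_samples indices per epoch,
    # decoupling the loader length from the underlying dataset size.
    s = RandomSampler(dataset, replacement=True, num_samples=num_samples)
    return DataLoader(dataset, batch_size=batch_size, sampler=s, drop_last=True)

train_l_loader = make_loader(train_l_dataset)
train_u_loader = make_loader(train_u_dataset)
assert len(train_l_loader) == len(train_u_loader) == 200
```

With equal lengths, the two loaders can be zipped together during training without one being exhausted before the other.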
I'm referring to the supervised setting, where we have only labeled data and no unlabeled data. Why do we also use a random sampler there?
I see. In this case, we use the same dataloader purely for a fair comparison (to isolate the effect of the unlabelled data given the exact same labelled data).
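To make that concrete, here is a hypothetical skeleton (reusing `train_l_loader` and `train_u_loader` from the sketch above; this is not the repo's actual training code): the supervised baseline consumes exactly the same labelled batches per epoch, so any performance gap versus the semi-supervised run reflects the unlabelled data alone.

```python
# Semi-supervised run: labelled and unlabelled batches arrive in lockstep.
for (x_l,), (x_u,) in zip(train_l_loader, train_u_loader):
    pass  # supervised loss on x_l + unsupervised loss on x_u

# Supervised baseline: the identical labelled loader, hence the identical
# number of gradient steps, just without the unlabelled term.
for (x_l,) in train_l_loader:
    pass  # supervised loss on x_l only
```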
Hello,
When creating the dataloader for the labeled data in the supervised setting, you use a random sampler that draws a fixed number of labeled images from the total available labeled data:
```python
import torch
from torch.utils.data import sampler

num_samples = self.batch_size * 200  # 200 iterations per epoch x 200 epochs = 40k iterations total

train_l_loader = torch.utils.data.DataLoader(
    train_l_dataset,
    batch_size=self.batch_size,
    sampler=sampler.RandomSampler(
        data_source=train_l_dataset,
        replacement=True,
        num_samples=num_samples,
    ),
    drop_last=True,
)
```
Why are you using this sampler and forward-passing a subset of the labeled images in each epoch instead of iterating over the whole labeled set? Is this an effective training approach?
Thanks