AmritaBh / ConDA-gen-text-detection

Code for the paper: ConDA: Contrastive Domain Adaptation for AI-generated Text Detection
MIT License

Different Batch Sizes for src_loader and tgt_loader #2

Open gymbeijing opened 9 months ago

gymbeijing commented 9 months ago

Hi Amrita, thank you for the great work!

I was trying to apply the model in this repo to my own dataset, but while running the training code I encountered an issue:

The source training data contains 29080 items, while the target training data contains 3832 items, and I set batch_size to 256. Since neither dataset size is a multiple of 256, the final batch of each loader is smaller than 256, and the two final batches have different sizes (for tgt_loader, the remainder is 248). The negatives_mask in SimCLRContrastiveLoss is built for a fixed batch size, so it is incompatible with batches of different sizes (e.g. 256 in a source batch vs. 248 in a target batch), and this causes an error in denominator = self.negatives_mask * torch.exp(similarity_matrix / self.temperature). Can I ask how you tackled this issue? Thanks!
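For reference, the mismatch follows directly from the dataset sizes quoted above. A quick sketch of the arithmetic (pure Python, no repo code involved):

```python
import math

batch_size = 256
n_src, n_tgt = 29080, 3832  # dataset sizes from this issue

# Batches per epoch for each loader, and the size of each final partial batch.
src_batches = math.ceil(n_src / batch_size)
tgt_batches = math.ceil(n_tgt / batch_size)
src_last = n_src % batch_size or batch_size
tgt_last = n_tgt % batch_size or batch_size

print(src_batches, tgt_batches)  # 114 15
print(src_last, tgt_last)        # 152 248
```

So the loaders disagree both on the number of batches per epoch (114 vs. 15) and on the size of the final partial batch (152 vs. 248), which is why a negatives_mask pre-built for a fixed batch size of 256 cannot be applied uniformly.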

AmritaBh commented 9 months ago

Hi, thank you for using our work! Lines 263 to 269 in contrast_training_with_da.py handle this issue: https://github.com/AmritaBh/ConDA-gen-text-detection/blob/676aaf313ea9aec756a2474f6c33a22f4f1f2c1f/contrast_training_with_da.py#L263

Your case would be handled by this line: https://github.com/AmritaBh/ConDA-gen-text-detection/blob/676aaf313ea9aec756a2474f6c33a22f4f1f2c1f/contrast_training_with_da.py#L269

Basically, we re-iterate over the dataset with the smaller size. Let me know if this helps or if you have any other issues.
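In case it helps other readers: the re-iteration pattern described above can be sketched as a small generator that restarts the smaller loader whenever it runs out. This is an illustrative helper (the name `paired_batches` is hypothetical, and plain iterables stand in for DataLoaders); the actual implementation is at the linked lines in contrast_training_with_da.py.

```python
def paired_batches(src_loader, tgt_loader):
    """Yield (src_batch, tgt_batch) pairs for every source batch,
    re-iterating over the (smaller) target loader when it is exhausted.
    Hypothetical sketch; see contrast_training_with_da.py for the real logic."""
    tgt_iter = iter(tgt_loader)
    for src_batch in src_loader:
        try:
            tgt_batch = next(tgt_iter)
        except StopIteration:
            # Target loader exhausted: start a fresh pass over it.
            tgt_iter = iter(tgt_loader)
            tgt_batch = next(tgt_iter)
        yield src_batch, tgt_batch

# Toy usage: 5 source batches, 2 target batches.
pairs = list(paired_batches([0, 1, 2, 3, 4], ["a", "b"]))
print(pairs)  # [(0, 'a'), (1, 'b'), (2, 'a'), (3, 'b'), (4, 'a')]
```

Note that re-iterating keeps the two streams in lockstep per step, but on its own it does not force the two batches at a given step to have equal sizes; with drop_last=True on the DataLoaders (or by skipping/truncating mismatched final batches) every step sees matching batch sizes, which is what the fixed-size negatives_mask requires.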