gymbeijing opened this issue 9 months ago
Hi, thank you for using our work! Lines 263 to 269 in contrast_training_with_da.py handle this issue: https://github.com/AmritaBh/ConDA-gen-text-detection/blob/676aaf313ea9aec756a2474f6c33a22f4f1f2c1f/contrast_training_with_da.py#L263
Your case would be handled by this line: https://github.com/AmritaBh/ConDA-gen-text-detection/blob/676aaf313ea9aec756a2474f6c33a22f4f1f2c1f/contrast_training_with_da.py#L269
Basically, we re-iterate over the dataset with the smaller size. Let me know if this helps or if you have any other issues.
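The idea can be sketched like this (a minimal illustration, not the repo's exact code; the dataset sizes and the `batches` helper are stand-ins for the real loaders): when the smaller (target) iterator runs out, restart it, and skip ragged final batches so both sides always have the same size.

```python
# Stand-ins for the real source/target datasets (29080 and 3832 items).
src_data = list(range(29080))
tgt_data = list(range(3832))

def batches(data, batch_size):
    """Yield consecutive slices of `data`; the last slice may be ragged."""
    for i in range(0, len(data), batch_size):
        yield data[i:i + batch_size]

batch_size = 256
tgt_iter = iter(batches(tgt_data, batch_size))
pairs = []
for src_batch in batches(src_data, batch_size):
    try:
        tgt_batch = next(tgt_iter)
    except StopIteration:
        # Target loader exhausted: re-iterate over the smaller dataset.
        tgt_iter = iter(batches(tgt_data, batch_size))
        tgt_batch = next(tgt_iter)
    # Skip ragged final batches so src and tgt sizes always match.
    if len(src_batch) == len(tgt_batch) == batch_size:
        pairs.append((len(src_batch), len(tgt_batch)))
```

With PyTorch's `DataLoader` you can get the same effect by passing `drop_last=True`, which discards the undersized final batch on each loader.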
Hi Amrita, thank you for the great work!
I was trying to apply your model in the repo to my own dataset. However, while I was running the training code, I encountered an issue:
The source training data contains 29080 items, while the target training data contains 3832 items. I set `batch_size` to 256. Therefore, the final batch from both `src_loader` and `tgt_loader` has fewer than 256 items, and the two leftover sizes differ (for `tgt_loader`, the remaining batch has 248 items). The size of `negatives_mask` in `SimCLRContrastiveLoss` is fixed, so it is incompatible with batches of different sizes (e.g. 256 in a source batch, 248 in a target batch). This causes an error in `denominator = self.negatives_mask * torch.exp(similarity_matrix / self.temperature)`. Can I ask how you tackled this issue? Thanks!
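For concreteness, here is a minimal illustration of the mismatch (NumPy standing in for torch, and a simple diagonal mask standing in for the repo's actual `negatives_mask`): the mask is precomputed for `2 * batch_size` rows, so elementwise multiplication fails when the similarity matrix comes from a smaller batch.

```python
import numpy as np

batch_size = 256
# A SimCLR-style mask is precomputed with shape (2N, 2N), since each batch
# contributes N embeddings plus N augmented views. Simplified here: the real
# mask also zeroes out positive-pair entries, not just the diagonal.
negatives_mask = 1.0 - np.eye(2 * batch_size)  # (512, 512)

sim_full = np.random.randn(2 * 256, 2 * 256)   # full batch: shapes match
sim_short = np.random.randn(2 * 248, 2 * 248)  # ragged batch of 248 items

denominator = negatives_mask * np.exp(sim_full / 0.5)  # OK: (512, 512)
try:
    negatives_mask * np.exp(sim_short / 0.5)   # (512,512) * (496,496)
    mismatch_raised = False
except ValueError:
    # Broadcasting fails: neither dimension is 1, so shapes must be equal.
    mismatch_raised = True
```

This is exactly why the denominator line errors out on the last target batch, and why dropping or re-iterating over ragged batches fixes it.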