aitorzip / PyTorch-CycleGAN

A clean and readable PyTorch implementation of CycleGAN
https://arxiv.org/abs/1703.10593
GNU General Public License v3.0

Reason for copying the tensors from the dataloader to input tensors #6

Open sshkhr opened 6 years ago

sshkhr commented 6 years ago

Hi Aitor. Thanks for the excellent code.

I was trying to make some modifications and use CycleGAN for a project of mine. I often run into memory issues, which I think are caused by copying the tensors from the dataloader outputs into the pre-allocated input tensors in train.py:

```python
real_A = Variable(input_A.copy_(batch['A']))
real_B = Variable(input_B.copy_(batch['B']))
```

Watching GPU usage with smaller batch sizes, I notice that usage spikes momentarily and then drops back down (probably during the forward and backward passes), which could be the reason behind the memory issues with larger batch sizes.

Is there a particular reason behind this copying, or can I use the batches from the dataloader directly?

zhangluustb commented 5 years ago

I have the same question. Did you find an answer?

Ar-Kareem commented 1 year ago

I think it's because real_A and real_B are modified in place later in the code, and you don't want those modifications to change your actual data. Otherwise, on the next epoch you would see the modified values when you iterate over those tensors from your dataset.
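The aliasing hazard described above can be shown with a toy in-memory dataset. This is an illustrative sketch, not code from the repo: if the dataset hands back its stored tensors directly and you mutate them in place, the corruption persists into the next epoch, whereas copying into a separate buffer first protects the original data.

```python
import torch

# Toy stand-in for an in-memory dataset that returns its stored
# tensors directly, without copying:
data = [torch.ones(2), torch.ones(2)]

def get_batch(i):
    return data[i]  # the stored tensor itself, not a copy

# Without a defensive copy, an in-place edit leaks back into the dataset:
x = get_batch(0)
x.mul_(0.5)  # e.g. some in-place normalization step
assert torch.equal(data[0], torch.tensor([0.5, 0.5]))  # dataset corrupted!

# With the copy_ pattern, the stored data stays untouched:
buf = torch.empty(2)
y = buf.copy_(get_batch(1))
y.mul_(0.5)
assert torch.equal(data[1], torch.ones(2))  # original still intact
```

Whether this matters in practice depends on the dataset: pipelines that decode images from disk each iteration produce fresh tensors per batch, so for those the copy is mainly about reusing a fixed GPU buffer rather than protecting the data.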