dmburd / S-DCNet

Unofficial PyTorch implementation of S-DCNet and SS-DCNet

Error encountered when changing the training batch size. #9

Open alexsun009 opened 4 years ago

alexsun009 commented 4 years ago

Hi, I encounter an error when I increase the batch size (for example, from 1 to 4), but training with batch size = 1 seems OK at the moment.

What might be wrong here? Thank you.

Traceback (most recent call last):
  File "train.py", line 359, in <module>
    main()
  File "/home/alex/.local/lib/python2.7/site-packages/hydra/main.py", line 24, in decorated_main
    strict=strict,
  File "/home/alex/.local/lib/python2.7/site-packages/hydra/_internal/utils.py", line 174, in run_hydra
    overrides=args.overrides,
  File "/home/alex/.local/lib/python2.7/site-packages/hydra/_internal/hydra.py", line 86, in run
    job_subdir_key=None,
  File "/home/alex/.local/lib/python2.7/site-packages/hydra/plugins/common/utils.py", line 109, in run_job
    ret.return_value = task_function(task_cfg)
  File "train.py", line 355, in main
    trainer.train()
  File "train.py", line 241, in train
    upsampling_loss = loss.upsampling_loss(sample_counts_gt, U1, U2)
  File "/home/alex/crowdcount/SDCNET-withtrain/S-DCNet/loss.py", line 31, in upsampling_loss
    U1_gt = count1_gt / F.conv_transpose2d(count0_gt, krn, stride=2)
RuntimeError: Given transposed=1, weight of size 4 1 2 2, expected input[4, 1, 6, 8] to have 4 channels, but got 1 channels instead
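The shape rule behind this message can be reproduced in isolation. `F.conv_transpose2d` expects a weight of shape `[in_channels, out_channels / groups, kH, kW]`, so a weight of size 4 1 2 2 requires a 4-channel input, while the input here has 1 channel. The sketch below is not the code from `loss.py`: the tensors are hypothetical stand-ins, and the guess that the kernel's first dimension followed the batch size is an assumption based only on the numbers in the error message.

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-in for count0_gt from the traceback: [N, C_in=1, H, W].
batch, h, w = 4, 6, 8
count0_gt = torch.ones(batch, 1, h, w)

# Assumption: a kernel whose first dimension follows the batch size.
# conv_transpose2d reads that dimension as in_channels (4), not as N.
bad_krn = torch.ones(batch, 1, 2, 2)
try:
    F.conv_transpose2d(count0_gt, bad_krn, stride=2)
    raised = False
except RuntimeError as err:
    raised = True
    print("RuntimeError:", err)

# A kernel with in_channels = 1 matches the input for any batch size.
# With a 2x2 kernel of ones and stride 2, each cell is copied into a 2x2 block.
good_krn = torch.ones(1, 1, 2, 2)
up = F.conv_transpose2d(count0_gt, good_krn, stride=2)
print(up.shape)
```

This only illustrates why the channel check fails; it does not by itself make larger batches usable, since all samples in a batch must also share one resolution.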

alexsun009 commented 4 years ago

As an addition: it looks like this issue only happens with supervised: True (SS-DCNet). Training the S-DCNet version does not have this issue.

supervised: False
# True for the Supervised S-DCNet (SS-DCNet)
# False for the older version (unsupervised, ordinary S-DCNet)

Thanks

dmburd commented 4 years ago

Hi, essentially, only batch size == 1 is supported. The reasons for that are: 1) The input images have different resolutions; 2) After the augmentations that change the input sample resolution (QuasiRandomCrop and PadToMultipleOf64 in ShanghaiTech_dataset.py) the samples still have different resolutions. No preprocessing is done in order to make the samples have the same resolution. It would complicate preprocessing and significantly reduce the meaningful part of a sample area.

Consider a case where you have two input images: the 1st one has landscape orientation, and the 2nd one has portrait orientation. How can we make them have the same resolution in order to put them into the same batch? Simple resizing that changes the aspect ratio is not allowed (because of the nature of the problem -- the network should always see the same natural aspect ratio of people's bodies and heads). The only option would be to choose an excessive frame size that covers both input images simultaneously and pad the images with zeros to fit that frame. In that case, the meaningful parts of the images become relatively small and the network forward pass time increases. I decided not to deal with such negative effects. This issue can be even more pronounced if there are, say, 4 images that you want to put into the same batch.
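To put a rough number on the padding cost described above, here is a minimal sketch that computes the common frame for a set of image sizes and the fraction of the padded batch that holds real pixels. The helper names and the example sizes are hypothetical, and rounding the frame up to a multiple of 64 is an assumption that merely mirrors the name PadToMultipleOf64; it is not taken from the repository code.

```python
import math

def common_frame(sizes, multiple=64):
    """Smallest (H, W) frame, rounded up to `multiple`, that covers every image."""
    max_h = max(h for h, _ in sizes)
    max_w = max(w for _, w in sizes)
    return (math.ceil(max_h / multiple) * multiple,
            math.ceil(max_w / multiple) * multiple)

def meaningful_fraction(sizes, multiple=64):
    """Share of the zero-padded batch area occupied by real image pixels."""
    frame_h, frame_w = common_frame(sizes, multiple)
    used = sum(h * w for h, w in sizes)
    return used / (len(sizes) * frame_h * frame_w)

# Hypothetical example: one landscape and one portrait image, as (H, W).
sizes = [(768, 1024), (1024, 768)]
frame = common_frame(sizes)
fraction = meaningful_fraction(sizes)
print(frame, fraction)  # (1024, 1024) 0.75
```

In this example a quarter of every padded sample is zeros, and the forward pass runs on a 1024x1024 frame instead of 768x1024. Adding more images with more varied sizes can only keep the frame the same or enlarge it.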