yzxing87 / Invertible-ISP

[CVPR2021] Invertible Image Signal Processing
MIT License

Using a target size that is different to the input size #14

Closed xunmeibuyue closed 2 years ago

xunmeibuyue commented 2 years ago

I'm trying to train the model on another dataset. But I have encountered the following problem:

Parsed arguments: Namespace(aug=True, batch_size=1, camera='Canon1DsMkIII', data_path='/data/lly/inv_isp_data/', debug_mode=False, gamma=True, loss='L1', lr=0.0001, out_path='/data/lly/inv_isp_data/Canon1DsMkIII/', resume=False, rgb_weight=1, task='debug')
[INFO] Start data loading and preprocessing
[INFO] Start to train
task: debug Epoch: 0 Step: 0 || loss: 0.46242 raw_loss: 0.10383 rgb_loss: 0.35858 || lr: 0.000100 time: 0.316538
task: debug Epoch: 0 Step: 1 || loss: 0.24662 raw_loss: 0.01957 rgb_loss: 0.22705 || lr: 0.000100 time: 0.270781
task: debug Epoch: 0 Step: 2 || loss: 0.05458 raw_loss: 0.00540 rgb_loss: 0.04919 || lr: 0.000100 time: 0.269678
task: debug Epoch: 0 Step: 3 || loss: 0.12149 raw_loss: 0.00757 rgb_loss: 0.11392 || lr: 0.000100 time: 0.269641
task: debug Epoch: 0 Step: 4 || loss: 0.17164 raw_loss: 0.00870 rgb_loss: 0.16295 || lr: 0.000100 time: 0.282781
task: debug Epoch: 0 Step: 5 || loss: 0.09719 raw_loss: 0.00595 rgb_loss: 0.09124 || lr: 0.000100 time: 0.277356
task: debug Epoch: 0 Step: 6 || loss: 0.08278 raw_loss: 0.00824 rgb_loss: 0.07454 || lr: 0.000100 time: 0.276587
task: debug Epoch: 0 Step: 7 || loss: 0.08254 raw_loss: 0.00801 rgb_loss: 0.07453 || lr: 0.000100 time: 0.279638
task: debug Epoch: 0 Step: 8 || loss: 0.11994 raw_loss: 0.01274 rgb_loss: 0.10720 || lr: 0.000100 time: 0.270859
task: debug Epoch: 0 Step: 9 || loss: 0.07166 raw_loss: 0.00605 rgb_loss: 0.06562 || lr: 0.000100 time: 0.287317
task: debug Epoch: 0 Step: 10 || loss: 0.19911 raw_loss: 0.00554 rgb_loss: 0.19357 || lr: 0.000100 time: 0.272710
task: debug Epoch: 0 Step: 11 || loss: 0.14320 raw_loss: 0.00622 rgb_loss: 0.13698 || lr: 0.000100 time: 0.279719
task: debug Epoch: 0 Step: 12 || loss: 0.05994 raw_loss: 0.00999 rgb_loss: 0.04996 || lr: 0.000100 time: 0.282813
task: debug Epoch: 0 Step: 13 || loss: 0.04691 raw_loss: 0.00428 rgb_loss: 0.04263 || lr: 0.000100 time: 0.269908
task: debug Epoch: 0 Step: 14 || loss: 0.09645 raw_loss: 0.00515 rgb_loss: 0.09129 || lr: 0.000100 time: 0.287600
task: debug Epoch: 0 Step: 15 || loss: 0.08834 raw_loss: 0.00427 rgb_loss: 0.08407 || lr: 0.000100 time: 0.288736
train.py:69: UserWarning: Using a target size (torch.Size([1, 3, 0, 256])) that is different to the input size (torch.Size([1, 3, 256, 256])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  rgb_loss = F.l1_loss(reconstruct_rgb, target_rgb)
Traceback (most recent call last):
  File "train.py", line 98, in <module>
    main(args)
  File "train.py", line 69, in main
    rgb_loss = F.l1_loss(reconstruct_rgb, target_rgb)
  File "/home/amax/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2633, in l1_loss
    expanded_input, expanded_target = torch.broadcast_tensors(input, target)
  File "/home/amax/anaconda3/lib/python3.8/site-packages/torch/functional.py", line 71, in broadcast_tensors
    return _VF.broadcast_tensors(tensors)  # type: ignore
RuntimeError: The size of tensor a (256) must match the size of tensor b (0) at non-singleton dimension 2

I have searched for this problem on Google and Stack Overflow, but the answers only suggest that it may be caused by a wrong output dimension in certain layers.

So is the image size fixed anywhere in this code? Would you mind having a look and pointing out the problem? Thanks!
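For reference, the warning's `torch.Size([1, 3, 0, 256])` means the target tensor has a zero-sized height dimension, so the crop was empty before the loss was ever computed. A minimal sketch (the `check_pair` helper is hypothetical, not part of the repo) that reproduces the failure and could flag bad pairs early:

```python
import torch
import torch.nn.functional as F

def check_pair(reconstruct_rgb: torch.Tensor, target_rgb: torch.Tensor) -> None:
    # A zero-sized dimension (e.g. torch.Size([1, 3, 0, 256])) means the
    # target crop was empty; broadcasting then fails inside F.l1_loss.
    assert target_rgb.numel() > 0, f"empty target tensor: {target_rgb.shape}"
    assert reconstruct_rgb.shape == target_rgb.shape, (
        f"shape mismatch: {reconstruct_rgb.shape} vs {target_rgb.shape}"
    )

# Reproduces the reported failure mode:
a = torch.zeros(1, 3, 256, 256)
b = torch.zeros(1, 3, 0, 256)
try:
    F.l1_loss(a, b)
except RuntimeError as e:
    print("l1_loss failed:", e)
```

Running `check_pair` on each batch before the loss would turn the cryptic broadcast error into a message that names the offending image.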

yzxing87 commented 2 years ago

Hi, I think the issue may be caused by some broken images in your dataset, so I recommend checking your target dataset first.

xunmeibuyue commented 2 years ago

Thanks. Perhaps it is because I used the camera's JPEG outputs as the rendered sRGB images.

I will report back soon.

xunmeibuyue commented 2 years ago

You are right. I found it was caused by an incorrect flip of some images. When training the model on other datasets, the flip operation in this line: https://github.com/yzxing87/Invertible-ISP/blob/344dd333dd2a075f6a9e4ffc445dc387ca3014c4/data/data_preprocess.py#L53 should closely follow the flip applied when reading the raw sensor data, i.e. this line: https://github.com/yzxing87/Invertible-ISP/blob/344dd333dd2a075f6a9e4ffc445dc387ca3014c4/data/data_preprocess.py#L49
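In other words, whatever flip is applied to the raw sensor data must also be applied to the rendered sRGB target so the two stay pixel-aligned. A minimal sketch of that idea (the `flip_pair` helper is illustrative, not the repo's actual API):

```python
import numpy as np

def flip_pair(raw: np.ndarray, rgb: np.ndarray,
              flip_ud: bool, flip_lr: bool):
    """Apply the SAME flip to both the raw image and the rendered sRGB
    target; flipping only one of them silently misaligns the pair."""
    if flip_ud:
        raw, rgb = raw[::-1, ...], rgb[::-1, ...]
    if flip_lr:
        raw, rgb = raw[:, ::-1, ...], rgb[:, ::-1, ...]
    # Copy so downstream torch.from_numpy sees contiguous arrays.
    return np.ascontiguousarray(raw), np.ascontiguousarray(rgb)
```

Using one function for both tensors makes it impossible for the raw and RGB flips to drift apart when adapting the preprocessing to a new dataset.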

Anyway, thanks!