NVIDIA / flownet2-pytorch

Pytorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks

Error when training Flownet2S #283

Closed: Isabel-jin closed this issue 1 year ago

Isabel-jin commented 1 year ago

I am running:

python main.py --batch_size 4 \
  --model FlowNet2S \
  --loss=MultiScale --loss_norm=L1 \
  --optimizer=Adam --optimizer_lr=1e-4 \
  --training_dataset ImagesFromFolder --training_dataset_root ./datasets/MPI-Sintel/training/final/alley_1 \
  --resume ./pre_train/FlowNet2-S_checkpoint.pth.tar \
  --total_epochs=10 --skip_validation

but I get:

Parsing Arguments
  [0.021s] batch_size: 4
  [0.021s] crop_size: [256, 256]
  [0.021s] fp16: False
  [0.021s] fp16_scale: 1024.0
  [0.021s] gradient_clip: None
  [0.021s] inference: False
  [0.021s] inference_batch_size: 1
  [0.021s] inference_dataset: MpiSintelClean
  [0.021s] inference_dataset_replicates: 1
  [0.021s] inference_dataset_root: ./MPI-Sintel/flow/training
  [0.021s] inference_n_batches: -1
  [0.021s] inference_size: [-1, -1]
  [0.021s] inference_visualize: False
  [0.021s] log_frequency: 1
  [0.021s] loss: MultiScale
  [0.021s] loss_l_weight: 0.32
  [0.021s] loss_norm: L1
  [0.021s] loss_numScales: 5
  [0.021s] loss_startScale: 4
  [0.021s] model: FlowNet2S
  [0.021s] model_batchNorm: False
  [0.022s] model_div_flow: 20
  [0.022s] name: run
  [0.022s] no_cuda: False
  [0.022s] number_gpus: 1
  [0.022s] number_workers: 8
  [0.022s] optimizer: Adam
  [0.022s] optimizer_amsgrad: False
  [0.022s] optimizer_betas: (0.9, 0.999)
  [0.022s] optimizer_eps: 1e-08
  [0.022s] optimizer_lr: 0.0001
  [0.022s] optimizer_weight_decay: 0
  [0.022s] render_validation: False
  [0.022s] resume: ./pre_train/FlowNet2-S_checkpoint.pth.tar
  [0.022s] rgb_max: 255.0
  [0.022s] save: ./work
  [0.022s] save_flow: False
  [0.022s] schedule_lr_fraction: 10
  [0.022s] schedule_lr_frequency: 0
  [0.022s] seed: 1
  [0.022s] skip_training: False
  [0.022s] skip_validation: True
  [0.022s] start_epoch: 1
  [0.022s] total_epochs: 10
  [0.022s] train_n_batches: -1
  [0.022s] training_dataset: ImagesFromFolder
  [0.022s] training_dataset_iext: png
  [0.022s] training_dataset_replicates: 1
  [0.022s] training_dataset_root: ./datasets/MPI-Sintel/training/final/alley_1
  [0.022s] validation_dataset: MpiSintelClean
  [0.022s] validation_dataset_replicates: 1
  [0.022s] validation_dataset_root: ./MPI-Sintel/flow/training
  [0.023s] validation_frequency: 5
  [0.023s] validation_n_batches: -1
  [0.025s] Operation finished

Source Code
  Current Git Hash: b'3c0569b70d9bc068a48d0d8658a2178057825aa4'

Initializing Datasets
  [0.033s] Training Dataset: ImagesFromFolder
  [0.081s] Training Input: [3, 2, 256, 256]
  [0.133s] Training Targets: [3, 2, 256, 256]
training
  [0.134s] Operation finished

Building FlowNet2S model
  [0.419s] Effective Batch Size: 4
  [0.420s] Number of parameters: 38676506
  [0.420s] Initializing CUDA
  [2.100s] Parallelizing
  [2.102s] Loading checkpoint './pre_train/FlowNet2-S_checkpoint.pth.tar'
  [2.208s] Loaded checkpoint './pre_train/FlowNet2-S_checkpoint.pth.tar' (at epoch 0)
  [2.208s] Initializing save directory: ./work
  [2.218s] Operation finished

Initializing Adam Optimizer
  [0.000s] amsgrad = False (<class 'bool'>)
  [0.000s] weight_decay = 0 (<class 'int'>)
  [0.000s] eps = 1e-08 (<class 'float'>)
  [0.001s] betas = (0.9, 0.999) (<class 'tuple'>)
  [0.001s] lr = 0.0001 (<class 'float'>)
  [0.001s] Operation finished

Training Epoch 0:   0%|          | 0/12.0 [00:00<?, ?it/s]
Overall Progress:   0%|          | 0/11 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "main.py", line 442, in <module>
    train_loss, iterations = train(args=args, epoch=epoch, start_iteration=global_iteration, data_loader=train_loader, model=model_and_loss, optimizer=optimizer, logger=train_logger, offset=offset)
  File "main.py", line 272, in train
    losses = model(data[0], target[0])
  File "/home/isabel/miniconda3/envs/SmogDetect/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/isabel/miniconda3/envs/SmogDetect/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 153, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/isabel/miniconda3/envs/SmogDetect/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "main.py", line 177, in forward
    loss_values = self.loss(output, target)
  File "/home/isabel/miniconda3/envs/SmogDetect/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/d/../project/SmogDetect/flownet2-pytorch/losses.py", line 79, in forward
    target_ = self.multiScales[i](target)
  File "/home/isabel/miniconda3/envs/SmogDetect/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/isabel/miniconda3/envs/SmogDetect/lib/python3.6/site-packages/torch/nn/modules/pooling.py", line 554, in forward
    self.padding, self.ceil_mode, self.count_include_pad, self.divisor_override)
RuntimeError: non-empty 3D or 4D (batch mode) tensor expected for input

Thanks in advance!

Isabel-jin commented 1 year ago

Solved. The 'ImagesFromFolder' dataloader has a problem loading the flows, so the targets passed to the loss are not valid flow fields.
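
For anyone hitting the same error, the shapes in the log above hint at the cause. The log reports "Training Targets: [3, 2, 256, 256]", the same shape as the image input, and the pooling layer in the traceback only accepts 3D or 4D tensors. Below is a minimal sketch of a shape sanity check in plain Python. The helper names `batched_shape` and `is_valid_flow_target` are hypothetical and not part of the repository, and the (2, H, W) flow layout is an assumption about what a ground-truth flow target would look like.

```python
def batched_shape(sample_shape, batch_size):
    """Shape obtained when a data loader stacks `batch_size` samples."""
    return (batch_size,) + tuple(sample_shape)


def is_valid_flow_target(shape):
    """True if `shape` looks like a batched flow field [B, 2, H, W].

    Assumption: a flow target has two channels (u, v), and 2D pooling
    needs a 3D or 4D tensor, as the RuntimeError in the traceback says.
    """
    return len(shape) == 4 and shape[1] == 2


# Per-sample target shape reported in the log ("Training Targets").
logged_target = (3, 2, 256, 256)
# Assumed per-sample shape of a real ground-truth flow field.
expected_flow = (2, 256, 256)

bad = batched_shape(logged_target, 4)    # batch_size 4, as in the command
good = batched_shape(expected_flow, 4)

print(bad, is_valid_flow_target(bad))    # 5D, rejected by 2D pooling
print(good, is_valid_flow_target(good))  # 4D with 2 channels
```

With --batch_size 4, the logged target becomes a 5D tensor, which would explain why the pooling step in losses.py fails. Training with the MultiScale loss probably needs a dataset class that returns ground-truth flows (the log lists MpiSintelClean as the default validation dataset) rather than one that only reads image frames.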