mli0603 / stereo-transformer

Revisiting Stereo Depth Estimation From a Sequence-to-Sequence Perspective with Transformers. (ICCV 2021 Oral)
Apache License 2.0

Image size #64

Closed rebecca0011 closed 2 years ago

rebecca0011 commented 2 years ago

Does the code have any requirements on image size? Does it have to be a multiple of 32 or 64?
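For context, a common workaround when a network has fixed downsampling strides is to pad the input up to the nearest multiple before feeding it in. A minimal sketch (the stride value of 32 here is an assumption for illustration, not taken from this repo):

```python
import torch
import torch.nn.functional as F

def pad_to_multiple(img: torch.Tensor, multiple: int = 32) -> torch.Tensor:
    """Zero-pad an NCHW image tensor so H and W become multiples of `multiple`.

    `multiple=32` is a placeholder; use whatever stride the model requires.
    """
    _, _, h, w = img.size()
    pad_h = (multiple - h % multiple) % multiple
    pad_w = (multiple - w % multiple) % multiple
    # F.pad pads the last two dims in the order (left, right, top, bottom)
    return F.pad(img, (0, pad_w, 0, pad_h))

left = torch.randn(1, 3, 600, 960)   # e.g. a 960x600 image
print(pad_to_multiple(left).shape)   # torch.Size([1, 3, 608, 960])
```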

rebecca0011 commented 2 years ago

I used my own data to eval and finetune but got this error; image size is 960×600:

```
(tf) rc@rc:~/StereoMatching/stereo-transformer$ CUDA_VISIBLE_DEVICES=0 python main.py --epochs 100 --batch_size 1 --checkpoint UAV_ft --num_workers 2 --dataset UAV --dataset_directory /home/rc/StereoMatching/Dataset/UAV/training-960600/ --ft --resume /home/rc/StereoMatching/stereo-transformer/run/sceneflow/pretrain/experiment_10/epoch_9_model.pth.tar
number of params in backbone: 1,050,800
number of params in transformer: 797,440
number of params in tokenizer: 503,728
number of params in regression: 161,843
Pre-trained model successfully loaded.
Start training
Epoch: 0
  0%|          | 0/289 [00:00<?, ?it/s]/home/rc/anaconda3/envs/tf/lib/python3.9/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)
  return torch.floor_divide(self, other)
  0%|          | 0/289 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/rc/StereoMatching/stereo-transformer/main.py", line 265, in <module>
    main(args)
  File "/home/rc/StereoMatching/stereo-transformer/main.py", line 235, in main
    train_one_epoch(model, data_loader_train, optimizer, criterion, device, epoch, summary_writer,
  File "/home/rc/StereoMatching/stereo-transformer/utilities/train.py", line 32, in train_one_epoch
    _, losses, sampled_disp = forward_pass(model, data, device, criterion, train_stats)
  File "/home/rc/StereoMatching/stereo-transformer/utilities/foward_pass.py", line 55, in forward_pass
    outputs = model(inputs)
  File "/home/rc/anaconda3/envs/tf/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/rc/StereoMatching/stereo-transformer/module/sttr.py", line 103, in forward
    output = self.regression_head(attn_weight, x)
  File "/home/rc/anaconda3/envs/tf/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/rc/StereoMatching/stereo-transformer/module/regression_head.py", line 252, in forward
    output['gt_response'], target = self._compute_gt_location(scale, x.sampled_cols, x.sampled_rows,
  File "/home/rc/StereoMatching/stereo-transformer/module/regression_head.py", line 90, in _compute_gt_location
    _, _, w = disp.size()
ValueError: too many values to unpack (expected 3)
```
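For what it's worth, the final unpack error indicates the disparity tensor reaching `_compute_gt_location` has more than three dimensions, where the code expects a 3-D (batch, height, width) tensor. A hypothetical reproduction outside the repo's code, useful for checking what a custom dataloader actually returns:

```python
import torch

# The regression head unpacks disparity as (batch, height, width).
disp_ok  = torch.zeros(1, 600, 960)      # 3-D: unpacks fine
disp_bad = torch.zeros(1, 1, 600, 960)   # 4-D: extra channel dim from a loader

_, _, w = disp_ok.size()                 # works, w == 960
try:
    _, _, w = disp_bad.size()
except ValueError as e:
    print(e)                             # too many values to unpack (expected 3)

# If a custom dataset yields disparity with a channel axis,
# squeezing it restores the expected rank:
disp_fixed = disp_bad.squeeze(1)         # -> shape (1, 600, 960)
```

If that matches your case, the fix belongs in the dataset code that loads the ground-truth disparity, not in the model itself.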