alexklwong / calibrated-backprojection-network

PyTorch Implementation of Unsupervised Depth Completion with Calibrated Backprojection Layers (ORAL, ICCV 2021)

Error while training #14

Closed rakshith95 closed 2 years ago

rakshith95 commented 2 years ago

Hello. While running the training script train_kbnet_void1500.sh, I consistently get this error after 5000+ steps:

Begin training...
Step=  1000/67350  Loss=1.33181  Time Elapsed=0.15h  Time Remaining=10.22h
Step=  2000/67350  Loss=1.39467  Time Elapsed=0.31h  Time Remaining=9.97h
Step=  3000/67350  Loss=1.27011  Time Elapsed=0.46h  Time Remaining=9.78h
Step=  4000/67350  Loss=1.13345  Time Elapsed=0.61h  Time Remaining=9.62h
Step=  5000/67350  Loss=1.04204  Time Elapsed=0.76h  Time Remaining=9.49h
Traceback (most recent call last):
  File "src/train_kbnet.py", line 251, in <module>
    n_thread=args.n_thread)
  File "/home/madharak/ws/calibrated-backprojection-network/src/kbnet.py", line 524, in train
    log_path=log_path)
  File "/home/madharak/ws/calibrated-backprojection-network/src/kbnet.py", line 571, in validate
    for idx, (inputs, ground_truth) in enumerate(zip(dataloader, ground_truths)):
  File "/home/madharak/anaconda3/envs/depth_completion/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "/home/madharak/anaconda3/envs/depth_completion/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
    data.reraise()
  File "/home/madharak/anaconda3/envs/depth_completion/lib/python3.7/site-packages/torch/_utils.py", line 385, in reraise
    raise self.exc_type(msg)
TypeError: function takes exactly 5 arguments (1 given)

Have you encountered this error before, and do you know why it might occur? I assume it has something to do with my data, but I don't know what. I'm running torch 1.3.0 and torchvision 0.4.1.

rakshith95 commented 2 years ago

This seems to have been an issue with loading the intrinsics npy file: np.loadtxt() sometimes fails on it. In datasets.py, I wrapped the call in a try/except that falls back to np.load() when np.loadtxt() raises, and that appears to have fixed the problem.

I'm using numpy version 1.21.5.
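
For anyone hitting the same traceback, the fallback described above might look roughly like the sketch below. The function name `load_intrinsics` is hypothetical and is not taken from the repository's datasets.py; it only illustrates trying np.loadtxt() first and falling back to np.load(). One possible explanation for the confusing TypeError, not confirmed in this thread: UnicodeDecodeError's constructor requires five arguments, so when a DataLoader worker re-raises such an error with only a message string, the original exception gets replaced by "function takes exactly 5 arguments (1 given)".

```python
import numpy as np


def load_intrinsics(path):
    """Load a camera intrinsics matrix stored as either text or binary .npy.

    Hypothetical helper for illustration; not the repository's actual code.
    """
    try:
        # Plain-text matrix, e.g. one written by np.savetxt
        return np.loadtxt(path)
    except ValueError:
        # Binary .npy content cannot be parsed as text. UnicodeDecodeError is
        # a subclass of ValueError, so decoding failures are caught here too.
        return np.load(path)
```

Catching ValueError rather than using a bare except keeps unrelated problems, such as a missing file or a keyboard interrupt, visible instead of silently retrying with np.load().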