Closed 11 months ago
Hi, could you provide the loss curve (or a screenshot) and the hyperparameters you set?
Thank you very much for your reply!
python execute.py --exe train --model_name distdepth-distilled --frame_ids 0 -1 1 --log_dir='./tmp' --data_path D:\sim --dataset SimSIN --batch_size 4 --width 256 --height 256 --max_depth 10.0 --num_epochs 10 --scheduler_step_size 8 --learning_rate 0.0001 --thre 0.95 --num_layers 152 --log_frequency 25
I tried adjusting the learning rate, but the loss still fluctuates slightly.
Due to GPU memory limitations, my batch size can only go up to 6; will this affect training?
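If GPU memory is the only thing capping the batch size, one generic workaround is gradient accumulation, which steps the optimizer only every few batches to mimic a larger effective batch. The sketch below is illustrative PyTorch only (toy model, made-up names), not part of the DistDepth training code:

```python
import torch
import torch.nn as nn

# Toy gradient-accumulation sketch (not DistDepth code): a physical batch of 4
# stepped every 4 iterations behaves roughly like an effective batch of 16.
model = nn.Linear(16, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
accum_steps = 4

optimizer.zero_grad()
for i in range(100):
    x = torch.randn(4, 16)           # physical batch of 4
    y = torch.randn(4, 1)
    loss = nn.functional.mse_loss(model(x), y)
    (loss / accum_steps).backward()  # scale so gradients average over the effective batch
    if (i + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```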
Below are the results of my previous training, and the loss has been oscillating in the range of 0.5-0.6.
tensor(0.5789, device='cuda:0', grad_fn=
epoch 0 | batch 0 | examples/s: 0.6 | loss: 0.51127 | time elapsed: 00h00m07s | time left: 00h00m00s
epoch 0 | batch 25 | examples/s: 8.4 | loss: 0.56632 | time elapsed: 00h00m36s | time left: 302h15m08s
epoch 0 | batch 50 | examples/s: 10.1 | loss: 0.64960 | time elapsed: 00h01m03s | time left: 267h42m16s
epoch 0 | batch 75 | examples/s: 10.0 | loss: 0.53189 | time elapsed: 00h01m31s | time left: 253h52m24s
epoch 0 | batch 100 | examples/s: 10.0 | loss: 0.59236 | time elapsed: 00h01m57s | time left: 246h08m43s
epoch 0 | batch 125 | examples/s: 9.8 | loss: 0.57938 | time elapsed: 00h02m24s | time left: 241h46m11s
epoch 0 | batch 150 | examples/s: 9.8 | loss: 0.58601 | time elapsed: 00h02m50s | time left: 238h07m57s
epoch 0 | batch 175 | examples/s: 9.6 | loss: 0.56136 | time elapsed: 00h03m17s | time left: 236h24m06s
epoch 0 | batch 200 | examples/s: 8.7 | loss: 0.57973 | time elapsed: 00h03m44s | time left: 234h45m47s
epoch 0 | batch 225 | examples/s: 9.9 | loss: 0.62872 | time elapsed: 00h04m10s | time left: 233h01m10s
epoch 0 | batch 250 | examples/s: 9.6 | loss: 0.51845 | time elapsed: 00h04m36s | time left: 231h22m07s
epoch 0 | batch 275 | examples/s: 9.7 | loss: 0.60010 | time elapsed: 00h05m01s | time left: 229h37m30s
epoch 0 | batch 300 | examples/s: 9.7 | loss: 0.60359 | time elapsed: 00h05m28s | time left: 228h42m39s
epoch 0 | batch 325 | examples/s: 9.7 | loss: 0.60311 | time elapsed: 00h05m54s | time left: 227h47m06s
epoch 0 | batch 350 | examples/s: 9.7 | loss: 0.61501 | time elapsed: 00h06m19s | time left: 226h44m33s
epoch 0 | batch 375 | examples/s: 9.6 | loss: 0.60289 | time elapsed: 00h06m45s | time left: 226h12m44s
epoch 0 | batch 400 | examples/s: 9.6 | loss: 0.59890 | time elapsed: 00h07m11s | time left: 225h39m54s
epoch 0 | batch 425 | examples/s: 9.9 | loss: 0.58350 | time elapsed: 00h07m37s | time left: 225h11m11s
epoch 0 | batch 450 | examples/s: 9.8 | loss: 0.63595 | time elapsed: 00h08m03s | time left: 224h47m30s
epoch 0 | batch 475 | examples/s: 9.7 | loss: 0.60160 | time elapsed: 00h08m29s | time left: 224h20m13s
epoch 0 | batch 500 | examples/s: 9.6 | loss: 0.60722 | time elapsed: 00h08m55s | time left: 223h55m25s
epoch 0 | batch 525 | examples/s: 9.1 | loss: 0.57381 | time elapsed: 00h09m22s | time left: 223h53m39s
epoch 0 | batch 550 | examples/s: 10.0 | loss: 0.61983 | time elapsed: 00h09m48s | time left: 223h49m04s
epoch 0 | batch 575 | examples/s: 9.4 | loss: 0.54653 | time elapsed: 00h10m14s | time left: 223h24m32s
epoch 0 | batch 600 | examples/s: 9.9 | loss: 0.61790 | time elapsed: 00h10m41s | time left: 223h24m00s
epoch 0 | batch 625 | examples/s: 9.8 | loss: 0.55278 | time elapsed: 00h11m07s | time left: 223h07m26s
epoch 0 | batch 650 | examples/s: 9.9 | loss: 0.62031 | time elapsed: 00h11m34s | time left: 223h12m23s
epoch 0 | batch 675 | examples/s: 9.7 | loss: 0.53466 | time elapsed: 00h11m59s | time left: 222h46m30s
epoch 0 | batch 700 | examples/s: 9.8 | loss: 0.56861 | time elapsed: 00h12m25s | time left: 222h31m33s
epoch 0 | batch 725 | examples/s: 9.9 | loss: 0.55777 | time elapsed: 00h12m51s | time left: 222h23m08s
epoch 0 | batch 750 | examples/s: 9.9 | loss: 0.59033 | time elapsed: 00h13m17s | time left: 222h16m51s
epoch 0 | batch 775 | examples/s: 9.8 | loss: 0.59776 | time elapsed: 00h13m44s | time left: 222h20m22s
epoch 0 | batch 800 | examples/s: 9.8 | loss: 0.57714 | time elapsed: 00h14m09s | time left: 221h53m25s
epoch 0 | batch 825 | examples/s: 9.9 | loss: 0.57735 | time elapsed: 00h14m34s | time left: 221h36m53s
epoch 0 | batch 850 | examples/s: 9.9 | loss: 0.55686 | time elapsed: 00h15m00s | time left: 221h21m58s
epoch 0 | batch 875 | examples/s: 9.7 | loss: 0.58286 | time elapsed: 00h15m27s | time left: 221h27m06s
epoch 0 | batch 900 | examples/s: 9.9 | loss: 0.57009 | time elapsed: 00h15m54s | time left: 221h32m48s
epoch 0 | batch 925 | examples/s: 9.9 | loss: 0.62446 | time elapsed: 00h16m19s | time left: 221h18m11s
epoch 0 | batch 950 | examples/s: 9.8 | loss: 0.57562 | time elapsed: 00h16m45s | time left: 221h04m40s
epoch 0 | batch 975 | examples/s: 9.8 | loss: 0.59325 | time elapsed: 00h17m11s | time left: 220h56m19s
epoch 0 | batch 1000 | examples/s: 9.7 | loss: 0.57069 | time elapsed: 00h17m37s | time left: 221h01m29s
epoch 0 | batch 1025 | examples/s: 10.0 | loss: 0.56339 | time elapsed: 00h18m03s | time left: 220h54m44s
epoch 0 | batch 1050 | examples/s: 9.7 | loss: 0.51971 | time elapsed: 00h18m29s | time left: 220h50m31s
epoch 0 | batch 1075 | examples/s: 9.7 | loss: 0.58701 | time elapsed: 00h18m56s | time left: 220h55m42s
epoch 0 | batch 1100 | examples/s: 9.7 | loss: 0.62540 | time elapsed: 00h19m23s | time left: 220h53m58s
epoch 0 | batch 1125 | examples/s: 9.9 | loss: 0.56879 | time elapsed: 00h19m49s | time left: 220h46m35s
epoch 0 | batch 1150 | examples/s: 9.9 | loss: 0.64508 | time elapsed: 00h20m15s | time left: 220h51m45s
epoch 0 | batch 1175 | examples/s: 9.7 | loss: 0.62032 | time elapsed: 00h20m42s | time left: 220h47m26s
epoch 0 | batch 1200 | examples/s: 9.8 | loss: 0.61268 | time elapsed: 00h21m07s | time left: 220h35m09s
epoch 0 | batch 1225 | examples/s: 9.7 | loss: 0.54779 | time elapsed: 00h21m32s | time left: 220h21m37s
epoch 0 | batch 1250 | examples/s: 9.8 | loss: 0.58781 | time elapsed: 00h21m58s | time left: 220h15m46s
epoch 0 | batch 1275 | examples/s: 9.8 | loss: 0.53676 | time elapsed: 00h22m24s | time left: 220h09m56s
epoch 0 | batch 1300 | examples/s: 9.7 | loss: 0.58683 | time elapsed: 00h22m49s | time left: 220h00m25s
epoch 0 | batch 1325 | examples/s: 9.8 | loss: 0.57288 | time elapsed: 00h23m14s | time left: 219h45m54s
epoch 0 | batch 1350 | examples/s: 9.9 | loss: 0.58358 | time elapsed: 00h23m39s | time left: 219h33m02s
epoch 0 | batch 1375 | examples/s: 9.8 | loss: 0.55579 | time elapsed: 00h24m05s | time left: 219h28m57s
epoch 0 | batch 1400 | examples/s: 9.8 | loss: 0.56870 | time elapsed: 00h24m31s | time left: 219h25m50s
epoch 0 | batch 1425 | examples/s: 9.9 | loss: 0.57267 | time elapsed: 00h24m56s | time left: 219h21m08s
epoch 0 | batch 1450 | examples/s: 9.8 | loss: 0.55499 | time elapsed: 00h25m22s | time left: 219h11m23s
epoch 0 | batch 1475 | examples/s: 9.8 | loss: 0.59987 | time elapsed: 00h25m46s | time left: 218h54m29s
epoch 0 | batch 1500 | examples/s: 9.8 | loss: 0.60035 | time elapsed: 00h26m11s | time left: 218h45m07s
epoch 0 | batch 1525 | examples/s: 9.9 | loss: 0.54744 | time elapsed: 00h26m37s | time left: 218h42m20s
epoch 0 | batch 1550 | examples/s: 10.1 | loss: 0.57234 | time elapsed: 00h27m03s | time left: 218h43m45s
epoch 0 | batch 1575 | examples/s: 9.8 | loss: 0.63796 | time elapsed: 00h27m29s | time left: 218h38m49s
epoch 0 | batch 1600 | examples/s: 9.9 | loss: 0.57050 | time elapsed: 00h27m56s | time left: 218h44m15s
epoch 0 | batch 1625 | examples/s: 9.7 | loss: 0.57792 | time elapsed: 00h28m22s | time left: 218h38m28s
epoch 0 | batch 1650 | examples/s: 9.8 | loss: 0.56862 | time elapsed: 00h28m48s | time left: 218h43m03s
epoch 0 | batch 1675 | examples/s: 9.7 | loss: 0.59931 | time elapsed: 00h29m15s | time left: 218h43m42s
epoch 0 | batch 1700 | examples/s: 9.9 | loss: 0.61965 | time elapsed: 00h29m42s | time left: 218h48m06s
epoch 0 | batch 1725 | examples/s: 10.0 | loss: 0.61029 | time elapsed: 00h30m08s | time left: 218h46m26s
epoch 0 | batch 1750 | examples/s: 9.7 | loss: 0.57544 | time elapsed: 00h30m34s | time left: 218h48m04s
epoch 0 | batch 1775 | examples/s: 9.9 | loss: 0.55535 | time elapsed: 00h31m00s | time left: 218h47m14s
epoch 0 | batch 1800 | examples/s: 9.9 | loss: 0.59221 | time elapsed: 00h31m28s | time left: 218h57m40s
epoch 0 | batch 1825 | examples/s: 9.8 | loss: 0.56692 | time elapsed: 00h31m55s | time left: 219h05m25s
epoch 0 | batch 1850 | examples/s: 9.7 | loss: 0.58602 | time elapsed: 00h32m22s | time left: 219h09m53s
epoch 0 | batch 1875 | examples/s: 10.0 | loss: 0.60834 | time elapsed: 00h32m49s | time left: 219h09m43s
epoch 0 | batch 1900 | examples/s: 10.1 | loss: 0.54720 | time elapsed: 00h33m15s | time left: 219h10m36s
epoch 0 | batch 1925 | examples/s: 9.7 | loss: 0.60159 | time elapsed: 00h33m42s | time left: 219h10m48s
epoch 0 | batch 1950 | examples/s: 9.8 | loss: 0.55886 | time elapsed: 00h34m08s | time left: 219h09m56s
epoch 0 | batch 1975 | examples/s: 9.7 | loss: 0.58505 | time elapsed: 00h34m34s | time left: 219h10m22s
epoch 0 | batch 2000 | examples/s: 9.8 | loss: 0.57191 | time elapsed: 00h35m01s | time left: 219h13m20s
epoch 0 | batch 4000 | examples/s: 9.8 | loss: 0.60531 | time elapsed: 01h06m13s | time left: 206h43m55s
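To turn this console output into the loss curve requested above, a minimal sketch would be the following; it assumes the log has been saved to a plain-text file, here called train_log.txt, which is just an illustrative name:

```python
import re
import matplotlib.pyplot as plt

# Parse "batch N | ... | loss: X" entries from the pasted console log and plot
# loss against batch index, so the oscillation is easier to judge at a glance.
pattern = re.compile(r"batch (\d+) \| examples/s: [\d.]+ \| loss: ([\d.]+)")
batches, losses = [], []
with open("train_log.txt") as f:
    for batch, loss in pattern.findall(f.read()):
        batches.append(int(batch))
        losses.append(float(loss))

plt.plot(batches, losses, marker="o", markersize=2)
plt.xlabel("batch")
plt.ylabel("training loss")
plt.title("epoch 0 loss")
plt.savefig("loss_curve.png", dpi=150)
```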
Thank you for reporting this issue. It was a stabilization issue in the distillation loss. I have fixed it and updated the code. You should be able to see some false-colored depth from the model even at the first epoch (just as a sanity check). It should work with batch size = 4, but I found the loss may fluctuate more with a lower batch size.
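For the sanity check mentioned above, a minimal way to false-color a predicted depth map is sketched below; the depth array is a random placeholder standing in for the network output, so this is not the repository's own visualization code:

```python
import numpy as np
import matplotlib.pyplot as plt

# Save a false-colored image of a 2-D depth map for a quick visual sanity check.
# `depth` is a placeholder here; in practice it would be the model's prediction
# (e.g. a 256x256 array within the 0-10 m range used in training).
depth = np.random.uniform(0.1, 10.0, size=(256, 256))

plt.imshow(depth, cmap="magma")   # any perceptual colormap works
plt.colorbar(label="depth [m]")
plt.axis("off")
plt.savefig("depth_epoch0_sanity.png", bbox_inches="tight")
```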
Thank you very much, the model has now converged.
Thanks again.
Batch size = 6, thanks.