Closed rnlee1998 closed 1 year ago
Hi, thank you for your interest in our work.
It seems your training converged to a local minimum. I could reproduce results similar to those reported in the paper using both a TITAN Xp (CUDA 11.0, PyTorch 1.7.1) and an NVIDIA A40 (CUDA 11.3, PyTorch 1.12.1).
You can also try setting `--lr` to `[0.0001, 5e-6, 16, 0.0001, 1e-5, 16]`; the learning rate will then be reset at the 16th epoch, which may allow the network to jump out of the local minimum. If you do this, the best result might be achieved at epoch 16 or 17, so please check all the epochs.
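As an illustration of the reset behaviour described above, here is a minimal sketch of how a six-value schedule like `[0.0001, 5e-6, 16, 0.0001, 1e-5, 16]` could be interpreted: two cosine-decay cycles of `(max_lr, min_lr, period)` each, with the learning rate jumping back up at epoch 16. This is an assumption for illustration only; check the repository's `options.py` for the actual meaning of the `--lr` values.

```python
import math

# Hypothetical interpretation of --lr [1e-4, 5e-6, 16, 1e-4, 1e-5, 16]:
# cycle 1 decays 1e-4 -> 5e-6 over epochs 0..15, then the LR is reset
# and cycle 2 decays 1e-4 -> 1e-5 over epochs 16..31 (cosine decay).
def lr_at(epoch, schedule=(1e-4, 5e-6, 16, 1e-4, 1e-5, 16)):
    max1, min1, p1, max2, min2, p2 = schedule
    if epoch < p1:
        max_lr, min_lr, t, period = max1, min1, epoch, p1
    else:
        max_lr, min_lr, t, period = max2, min2, epoch - p1, p2
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * t / period))
```

With this sketch, `lr_at(15)` is near the first cycle's minimum, while `lr_at(16)` jumps back to `1e-4` — the "reset" that can kick training out of a local minimum.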
Thanks for your reply. I wonder: should I use Python 3.8 or 3.9? I got this warning:
UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
dim_t = self.temperature ** (2 * (dim_t//2) / self.hidden_dim)
Should I ignore the warning or fix it?
I used Python 3.7.4 and 3.8.13, respectively. This is a warning from PyTorch, and you can ignore it if you use 1.7.1 or 1.12.1.
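For context, the warning is about the difference between truncating division (round toward zero) and floor division (round toward negative infinity), which only matters for negative operands. A small plain-Python sketch of the two behaviours:

```python
import math

# The deprecated Tensor.__floordiv__ rounds toward zero ('trunc'),
# while true floor division rounds toward negative infinity;
# the two only differ for negative operands.
def trunc_div(a, b):
    return int(a / b)           # old behaviour: round toward 0

def floor_div(a, b):
    return math.floor(a / b)    # true floor division
```

For example, `trunc_div(-3, 2)` is `-1` while `floor_div(-3, 2)` is `-2`, but for non-negative values like `dim_t` in the positional-encoding line both agree, which is why the warning is safe to ignore here. To silence it, you can follow the warning's own suggestion and write `dim_t = torch.div(dim_t, 2, rounding_mode='floor')`.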
I'm now closing this issue due to no response. Please feel free to reopen it if you still have questions.
I appreciate your work very much, and I have reproduced your code. The Lite-Mono model I trained gives results similar to those reported, but my Lite-Mono-8M is worse than reported. What might be the reason? If possible, could you share your training environment?
my litemono-8m result:

| abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
|---------|--------|------|----------|----|----|----|
| 0.104 | 0.769 | 4.533 | 0.181 | 0.893 | 0.964 | 0.983 |
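For readers comparing numbers, here is a minimal sketch of the standard monocular-depth error metrics used in tables like the one above (the Eigen et al. evaluation protocol commonly used on KITTI); `gt` and `pred` are assumed to be flat lists of valid per-pixel depths in metres, without the crop/scale handling a full evaluation script would include.

```python
import math

# Standard depth metrics: lower is better for the first four,
# higher is better for the delta accuracies a1/a2/a3.
def depth_metrics(gt, pred):
    n = len(gt)
    abs_rel  = sum(abs(g - p) / g for g, p in zip(gt, pred)) / n
    sq_rel   = sum((g - p) ** 2 / g for g, p in zip(gt, pred)) / n
    rmse     = math.sqrt(sum((g - p) ** 2 for g, p in zip(gt, pred)) / n)
    rmse_log = math.sqrt(sum((math.log(g) - math.log(p)) ** 2
                             for g, p in zip(gt, pred)) / n)
    # a_k: fraction of pixels where max(gt/pred, pred/gt) < 1.25**k
    def delta(k):
        return sum(max(g / p, p / g) < 1.25 ** k
                   for g, p in zip(gt, pred)) / n
    return abs_rel, sq_rel, rmse, rmse_log, delta(1), delta(2), delta(3)
```

A perfect prediction yields zeros for the error terms and 1.0 for a1/a2/a3.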