WoojuLee24 opened this issue 4 months ago
I tried using the SIBCL Kitti weights from SIBCL's repository and copied the encoder, decoder, and adaptation weights (roughly as in the sketch below), but did not get good results after the first epoch. I checked the contents of the provided weights, and it seems the model was trained for 1170365 steps. Does it really take that many steps to converge?
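For reference, this is roughly how I copied the weights (a minimal PyTorch sketch; the checkpoint filename, the key prefixes `encoder.` / `decoder.` / `adaptation.`, and the `build_pureacl_model` helper are my assumptions, not necessarily the exact names used in this repository):

```python
import torch

# All names below are assumptions; adjust to the actual SIBCL checkpoint and PureACL model.
SIBCL_CKPT = "sibcl_kitti.pth"                       # checkpoint from the SIBCL repository
PREFIXES = ("encoder.", "decoder.", "adaptation.")   # assumed prefixes of the shared sub-modules

ckpt = torch.load(SIBCL_CKPT, map_location="cpu")
state = ckpt.get("model", ckpt)                      # some checkpoints nest weights under "model"

# Keep only the encoder / decoder / adaptation parameters.
shared = {k: v for k, v in state.items() if k.startswith(PREFIXES)}

model = build_pureacl_model()   # placeholder: however the PureACL network is constructed
missing, unexpected = model.load_state_dict(shared, strict=False)
print(f"copied {len(shared)} tensors; {len(missing)} missing, {len(unexpected)} unexpected")
```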
Yes, the training process can be quite slow, and sometimes there is an unexpected performance drop at a particular step. In such cases, I stop the training and restart from the last saved good weights, which helps to avoid this problem.
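A minimal sketch of that restart pattern, assuming a standard PyTorch training loop where the model, optimizer, and global step are saved together (the key names here are assumptions, not this repository's exact checkpoint format):

```python
import torch

def save_checkpoint(path, model, optimizer, step):
    # Save everything needed to continue training from this exact point.
    torch.save({"model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
                "step": step}, path)

def resume_from(path, model, optimizer):
    # Restart from the last good checkpoint after an unexpected performance drop.
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    return ckpt["step"]
```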
Hello, thank you for your work. I am reproducing PureACL and I have some questions.
GPU memory: I experimented on an RTX A5000 (24 GB) with the default settings: batch size 3, Adam optimizer with a learning rate of 10^(-4). However, GPU memory is exceeded even with a batch size of 2; usage fluctuates from 14 GB to over 24 GB. I am curious how the batch size was set in practice (for example, whether Distributed Data Parallel (DDP) was used, e.g. 3 GPUs with a batch size of 1 per GPU).
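(As a side note, one generic way to fit the effective batch size into 24 GB is gradient accumulation, i.e. accumulating gradients over 3 per-step batches of size 1 before each optimizer step. The sketch below is a standard PyTorch pattern, not code from this repository; `model`, `loader`, and `training_step` are placeholders.)

```python
import torch

ACCUM_STEPS = 3   # emulate an effective batch size of 3 with per-step batch size 1

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # `model` / `loader` set up elsewhere
optimizer.zero_grad()
for i, batch in enumerate(loader):
    loss = model.training_step(batch)   # placeholder for the actual loss computation
    (loss / ACCUM_STEPS).backward()     # scale so the accumulated gradient matches batch size 3
    if (i + 1) % ACCUM_STEPS == 0:
        optimizer.step()
        optimizer.zero_grad()
```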
In the paper, the feature-extractor weights of PureACL were initialized with the pre-trained weights from SIBCL. Could you provide the experimental results of PureACL without the SIBCL pre-trained weights?
Thank you!