Closed. YLM432423 closed this issue 8 months ago.
Why is GPU memory usage normal during training, but it blows up during validation?
Hello @YLM432423, how did you solve that? I'm running a test with the "dist_train.sh" script and getting the same error.
The cause is a data augmentation step in the validation process, which increases GPU memory usage. You can remove that augmentation step. Doing so does not affect the validity of the model's validation results.
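To illustrate the idea, here is a minimal sketch of a validation loop where augmentation can be switched off. It is not this repository's code: `augment_with_flip`, `validate`, and the `use_augmentation` flag are hypothetical names, and horizontal flipping is only an assumed example of an augmentation that enlarges the batch. The `torch.no_grad()` wrapper is a common extra memory saver for validation that was not mentioned above.

```python
import torch
from torch import nn


def augment_with_flip(images):
    # Hypothetical test-time augmentation: append horizontally flipped
    # copies, which doubles the batch size and the activation memory.
    return torch.cat([images, torch.flip(images, dims=[-1])], dim=0)


@torch.no_grad()  # no autograd graph is kept during validation
def validate(model, batches, use_augmentation=False):
    model.eval()
    outputs = []
    for images in batches:
        if use_augmentation:
            images = augment_with_flip(images)
        preds = model(images)
        # Move predictions off the GPU so they do not accumulate there.
        outputs.append(preds.cpu())
    return outputs


# Tiny stand-in model and random data, only to make the sketch runnable.
model = nn.Conv2d(3, 1, kernel_size=3, padding=1)
batches = [torch.randn(2, 3, 8, 8) for _ in range(2)]

plain = validate(model, batches, use_augmentation=False)
augmented = validate(model, batches, use_augmentation=True)
print(plain[0].shape, augmented[0].shape)
```

With the augmentation enabled, each forward pass sees twice as many images, so peak memory during validation can be higher than in training even when the training batch size fits.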
Hello, did you train the model on a single 4090 with 24 GB of video memory? Can depth estimation be trained on a single card? If so, please reply.
Hello, I ran the depth estimation training on the 4090 and hit torch.cuda.OutOfMemoryError: CUDA out of memory.
When I use
model = torch.nn.parallel.DataParallel(model, device_ids=[args.gpu])
the model can be trained on a 4090 (batch_size=3). However, GPU memory is still exceeded during validation.