While reading your code, I found a couple of issues.
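First, in train and val, I think it would be better to call model.train() and model.eval() respectively. It may not make a difference for VGG networks, but it is necessary when the network contains BN or Dropout layers, because those layers behave differently in training mode and evaluation mode.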
Second, also in main.py, in test, I think you should wrap the forward pass in a with torch.no_grad(): block. Without it, autograd keeps building the computation graph and holds on to intermediate activations for a backward pass that never happens, so the test runs out of CUDA memory even on an 8 GB GPU with a batch size of 1.
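To make both suggestions concrete, here is a minimal sketch of what I mean. The toy model, `train_step`, and `evaluate` are hypothetical stand-ins of my own, not the actual code in main.py; only the placement of `model.train()`, `model.eval()`, and `torch.no_grad()` matters.

```python
import torch
import torch.nn as nn

# Hypothetical toy network with BN and Dropout; it stands in for the real model.
model = nn.Sequential(
    nn.Linear(8, 16),
    nn.BatchNorm1d(16),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(16, 2),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()


def train_step(inputs, targets):
    # Training mode: BN uses batch statistics, Dropout is active.
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    return loss.item()


def evaluate(inputs):
    # Evaluation mode: BN uses running statistics, Dropout is disabled.
    model.eval()
    # No graph is recorded inside this block, so activations are not kept for backward.
    with torch.no_grad():
        return model(inputs)


x = torch.randn(4, 8)            # made-up batch of 4 samples
y = torch.randint(0, 2, (4,))    # made-up class labels
train_loss = train_step(x, y)
outputs = evaluate(x)
```

With this layout, calling `evaluate` twice on the same input should give identical outputs, since Dropout is off and BN no longer depends on the batch.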