codertimo / BERT-pytorch

Google AI 2018 BERT pytorch implementation
Apache License 2.0
6.09k stars 1.29k forks

Why not use torch.no_grad when evaluating test data? #92

Open EvanZ opened 3 years ago

EvanZ commented 3 years ago

The trainer uses essentially the same iteration loop for train and test; the only difference is that backpropagation runs during the train step. One other difference I typically see between test and train is that test batches are run under with torch.no_grad() so that, for example, dropout is not applied. Is there a reason this isn't used here?
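A small check may help separate the two mechanisms mentioned in the question. In PyTorch, torch.no_grad() only turns off gradient tracking; whether dropout is applied depends on the module's train/eval mode, which is set with model.train() and model.eval(). The sketch below uses a standalone nn.Dropout layer, not this repository's model, and the variable names are purely illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)   # illustrative layer, not taken from the repo
x = torch.ones(1000)

# Training mode under no_grad: gradients are not tracked,
# but dropout still zeroes out elements.
drop.train()
with torch.no_grad():
    y_train_mode = drop(x)

# Eval mode: dropout acts as the identity, with or without no_grad.
drop.eval()
y_eval_mode = drop(x)

zeros_in_train_mode = int((y_train_mode == 0).sum())
zeros_in_eval_mode = int((y_eval_mode == 0).sum())
print(zeros_in_train_mode, zeros_in_eval_mode)
```

So an evaluation pass usually wants both: model.eval() to disable dropout, and torch.no_grad() to skip building the autograd graph.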

Guo-Stone commented 1 year ago

I think it should use torch.no_grad(); otherwise evaluation can run out of GPU memory.
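For reference, here is a minimal sketch of what an evaluation pass wrapped in torch.no_grad() could look like. The evaluate function, the toy model, and the batches are all hypothetical and are not this repository's trainer; the point is only that under no_grad the forward pass does not keep activations for a backward pass, which typically lowers peak memory.

```python
import torch
import torch.nn as nn

def evaluate(model, batches, loss_fn):
    """Hypothetical helper: average loss over batches without autograd."""
    was_training = model.training
    model.eval()                       # disable dropout and similar layers
    total, count = 0.0, 0
    with torch.no_grad():              # no autograd graph is built here
        for inputs, targets in batches:
            loss = loss_fn(model(inputs), targets)
            total += loss.item() * inputs.size(0)
            count += inputs.size(0)
    model.train(was_training)          # restore the previous mode
    return total / max(count, 1)

# Toy stand-ins for a real model and data loader (assumptions for illustration).
model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(0.1), nn.Linear(8, 2))
batches = [(torch.randn(3, 4), torch.randint(0, 2, (3,))) for _ in range(2)]
avg_loss = evaluate(model, batches, nn.CrossEntropyLoss())
print(avg_loss)
```

Restoring the previous mode at the end keeps the helper safe to call in the middle of a training loop.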