An error encountered when using Trainer

fastnlp / fastNLP

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.

Apache License 2.0

With Python 3.9 and torch 1.11, using Trainer raises the following error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [32, 50, 711]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Custom training with DataSetIter does not raise this error. From what I found online, the error is caused by an in-place modification. Is it due to the torch version? With a newer torch version, how can I resolve this if I still want to use Trainer directly instead of writing a custom training loop?
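For reference, the same class of error can be reproduced outside fastNLP. The sketch below is a minimal, assumed reproduction and not the actual operation in the model from this issue: it modifies the output of a ReLU in place, which is one common way to trigger this message. The helper name `run_backward` is hypothetical and used only for illustration.

```python
import torch


def run_backward(inplace: bool):
    """Return (gradient, error_message); exactly one of them is None."""
    x = torch.randn(4, 3, requires_grad=True)
    h = torch.relu(x)  # ReLU keeps its output to compute the gradient later
    if inplace:
        h += 1  # in-place: overwrites the saved output and bumps its version
    else:
        h = h + 1  # out-of-place: allocates a new tensor, saved output untouched
    try:
        h.sum().backward()
    except RuntimeError as err:
        return None, str(err)
    return x.grad, None


grad_bad, err_bad = run_backward(inplace=True)
grad_ok, err_ok = run_backward(inplace=False)
print(err_bad)  # the "modified by an inplace operation" message
print(grad_ok)  # gradient from the out-of-place variant
```

If the cause in a real model is similar, replacing the in-place update (for example `+=`, or an activation created with `inplace=True`) with its out-of-place form would be the usual fix. Calling `torch.autograd.set_detect_anomaly(True)` before training, as the error hint suggests, should make the traceback point to the forward operation involved.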

An error encountered when using Trainer #404