fastnlp / fastNLP

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
https://gitee.com/fastnlp/fastNLP
Apache License 2.0

fastNLP.core.utils._CheckError #407

Open heng3366 opened 2 years ago

heng3366 commented 2 years ago

I'm reproducing FLAT with fastNLP 0.5.0, Python 3.8, torch 1.7, on Ubuntu, and I get the following error:

Epoch 1/100: 1%|▌ | 955/95600 [01:01<1:24:04, 18.76it/s, loss:56.88514]
/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:156: UserWarning: The epoch parameter in scheduler.step() was not necessary and is being deprecated where possible. Please use scheduler.step() to step the scheduler. During the deprecation, if epoch is different from None, the closed form is used instead of the new chainable form, where available. Please open an issue if you are unable to replicate your use case: https://github.com/pytorch/pytorch/issues/new/choose.
  warnings.warn(EPOCH_DEPRECATION_WARNING, UserWarning)
Traceback (most recent call last):
  File "flat_main.py", line 801, in <module>
    trainer.train()
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/trainer.py", line 613, in train
    self.callback_manager.on_exception(e)
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/callback.py", line 309, in wrapper
    returns.append(getattr(callback, func.__name__)(*arg))
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/callback.py", line 505, in on_exception
    raise exception  # raise the unrecognized Error
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/trainer.py", line 609, in train
    self._train()
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/trainer.py", line 668, in _train
    loss = self._compute_loss(prediction, batch_y).mean()
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/trainer.py", line 776, in _compute_loss
    return self.losser(predict, truth)
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/losses.py", line 339, in __call__
    loss = self.get_loss(**pred_dict)
  File "/home/ai998/.conda/envs/nlp/lib/python3.7/site-packages/fastNLP/core/losses.py", line 334, in get_loss
    raise _CheckError(check_res=check_res, func_signature=_get_func_signature(self.get_loss))
fastNLP.core.utils._CheckError: Problems occurred when calling `LossInForward.get_loss(self, **kwargs)`
    missing param: ['loss(assign to `loss` in `LossInForward`)']

I can't figure out why the loss went missing. How can I fix this?

yhcc commented 2 years ago

Judging from the error, the dict returned by the model does not contain a `loss` entry.
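One way to confirm this is to inspect what the model's forward pass returns for a single batch. The helper below is only a sketch of that check: `check_forward_output` and the two sample dicts are hypothetical names made up for illustration, not part of fastNLP.

```python
def check_forward_output(output, loss_key="loss"):
    """Return a list of problems with a forward() return value (empty if fine).

    Hypothetical debugging helper, not a fastNLP API.
    """
    problems = []
    if not isinstance(output, dict):
        problems.append(f"forward returned {type(output).__name__}, expected dict")
    elif loss_key not in output:
        problems.append(f"missing key {loss_key!r}; got keys {sorted(output)}")
    return problems


# Made-up examples of what a forward pass might return in each mode.
train_like = {"loss": 56.88, "pred": [1, 0, 2]}
eval_like = {"pred": [1, 0, 2]}

print(check_forward_output(train_like))  # no problems reported
print(check_forward_output(eval_like))   # reports the missing 'loss' key
```

If the second case is what the model returns in the middle of training, the loss component has nothing to read and raises the `_CheckError` shown above.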

JY-Ren commented 2 years ago

I hit this error too. Adding `self.model.train()` before the start of each epoch made it run.

yhcc commented 2 years ago

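In that case, my guess is that the `forward` code uses the `self.training` attribute to decide whether it is currently running inference, so it takes a different path depending on whether `self.training` is True or False. Manually calling `self.model.train()` would set `self.training` back to True.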
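The behaviour described here can be sketched without any dependencies. `TinyModule` below is a pure-Python stand-in that only mimics the `training` flag and the `train()`/`eval()` methods of `torch.nn.Module`; `TaggerSketch`, its toy loss, and the sample data are assumptions for illustration and are not the actual FLAT model.

```python
class TinyModule:
    """Stand-in for the train/eval flag handling of torch.nn.Module."""

    def __init__(self):
        self.training = True

    def train(self, mode=True):
        # Like torch.nn.Module.train: set the flag and return self.
        self.training = mode
        return self

    def eval(self):
        # Like torch.nn.Module.eval: equivalent to train(False).
        return self.train(False)


class TaggerSketch(TinyModule):
    """Hypothetical model whose forward branches on self.training."""

    def forward(self, scores, target):
        # Predicted label = index of the highest score in each row.
        pred = [row.index(max(row)) for row in scores]
        if self.training:
            # Toy loss: fraction of wrong predictions.
            loss = sum(p != t for p, t in zip(pred, target)) / len(target)
            return {"loss": loss}
        # Inference branch: no 'loss' key in the returned dict.
        return {"pred": pred}


model = TaggerSketch()
scores = [[0.1, 0.9], [0.8, 0.2]]
target = [1, 1]

model.eval()                      # e.g. left in eval mode by an earlier step
out_eval = model.forward(scores, target)

model.train()                     # the workaround from this thread
out_train = model.forward(scores, target)

print(out_eval)   # {'pred': [1, 0]}
print(out_train)  # {'loss': 0.5}
```

If the model is left in eval mode when the next training step runs (for instance after an evaluation pass, which is an assumption here), the returned dict has no `loss` key, which matches the error above; calling `train()` first restores the training branch.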