fastnlp / fastNLP

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
https://gitee.com/fastnlp/fastNLP
Apache License 2.0
3.05k stars 451 forks

Training logic #318

Closed laomagic closed 3 years ago

laomagic commented 3 years ago

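About the model training process: suppose the dataset, e.g. trainDataset, has 10,000 samples in total. Why does each epoch during training appear to contain only batch_size=16 worth of data, with validation on the dev set running once every batch_size (the parameter is set to validate once per epoch)?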

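I am using fastNLP=0.5.5. The validation code in _train() differs between v0.5.5 and the latest version. The value of len(self.data_iterator) seems to be the same as batch_size, and I do not understand this training logic (validating per epoch or per batch would both be fine, but I do not see why the amount of data in trainDataset turns into batch_size).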

Validation logic in v0.5.5:

`
if ((self.validate_every > 0 and self.step % self.validate_every == 0) or
        (self.validate_every < 0 and self.step % len(self.data_iterator) == 0)) \
        and self.dev_data is not None:
    eval_res = self._do_validation(epoch=epoch, step=self.step)
    eval_str = "Evaluation on dev at Epoch {}/{}. Step:{}/{}: ".format(epoch, self.n_epochs, self.step, self.n_steps)
    pbar.write(eval_str + '\n')
    self.logger.info(eval_str)
    self.logger.info(self.tester._format_eval_results(eval_res)+'\n')
`

yhcc commented 3 years ago

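If you want to validate once per epoch, set validate_every=-1. len(self.data_iterator) returns the number of batches, and step records the index of the current batch, so whenever step is divisible by len(self.data_iterator), an epoch has just finished.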
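
To make the arithmetic concrete, here is a minimal standalone sketch of that condition. It does not use fastNLP itself: `should_validate`, `num_batches`, and `has_dev_data` are hypothetical names standing in for the Trainer attributes in the quoted snippet, and it assumes the step counter starts at 1 and is incremented once per batch.

```python
import math


def should_validate(step, validate_every, num_batches, has_dev_data=True):
    """Plain-argument version of the condition quoted from _train().

    num_batches stands in for len(self.data_iterator) (an assumption of
    this sketch); has_dev_data stands in for `self.dev_data is not None`.
    """
    every_n_steps = validate_every > 0 and step % validate_every == 0
    every_epoch = validate_every < 0 and step % num_batches == 0
    return (every_n_steps or every_epoch) and has_dev_data


num_samples = 10000
batch_size = 16
n_epochs = 2

# Number of batches per epoch, not the number of samples.
num_batches = math.ceil(num_samples / batch_size)

# With validate_every=-1, validation fires only when an epoch ends.
validation_steps = [
    step
    for step in range(1, n_epochs * num_batches + 1)
    if should_validate(step, validate_every=-1, num_batches=num_batches)
]

print(num_batches)       # 625
print(validation_steps)  # [625, 1250]
```

With 10,000 samples and batch_size=16 there are 625 batches per epoch, so the length of the iterator counts batches rather than samples. A positive validate_every, by contrast, would trigger validation every that many steps regardless of epoch boundaries.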