On further debugging, it seems this happens when `self.config.backward_batch_size > batch_size`, which can occur when `mini_batch_size * gradient_accumulation_steps > batch_size`.
In that scenario, `mini_batch_start` in the for loop of the snippet below can exceed `batch_size`, so the slice `backward_batch_inds[mini_batch_start:mini_batch_end]` is empty. The resulting `mini_batch_inds` then produces a `mini_batch_dict` with empty entries, i.e., `mini_batch_dict['queries']` would be an empty tensor.
```python
for mini_batch_start in range(0, self.config.backward_batch_size, self.config.mini_batch_size):
    mini_batch_end = mini_batch_start + self.config.mini_batch_size
    mini_batch_inds = backward_batch_inds[mini_batch_start:mini_batch_end]
    mini_batch_dict = {
        "logprobs": batch_dict["logprobs"][mini_batch_inds],
        "values": batch_dict["values"][mini_batch_inds],
        "masks": batch_dict["masks"][mini_batch_inds],
        # ... (remaining entries truncated in the quoted snippet)
```
@vwxyzjn It seems like this was introduced in the refactoring done in #546.
Ah, thanks for the catch. In this case, we should add a check guarding against `self.config.backward_batch_size > batch_size`. Gonna prepare a PR for this tomorrow.
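A minimal sketch of what such a guard could look like; the placement, field names on `config`, and error message are assumptions for illustration, not the actual fix from the PR:

```python
# Hypothetical validation at PPOTrainer/PPOConfig setup time. The config fields
# mirror the ones discussed above; this is not the real TRL implementation.
backward_batch_size = config.mini_batch_size * config.gradient_accumulation_steps
if backward_batch_size > config.batch_size:
    raise ValueError(
        "mini_batch_size * gradient_accumulation_steps "
        f"({backward_batch_size}) must not exceed batch_size ({config.batch_size}); "
        "otherwise mini-batch index slices past batch_size are empty."
    )
```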
PPOTrainer throws the following error when passed the argument --gradient_accumulation_steps >= 2.
TRL version: 0.5.1.dev0
Related issue here: https://github.com/huggingface/trl/issues/614