Open Yang-bug-star opened 8 months ago
You can try increasing `gradient_accumulation_steps`; however, this was never tested. We recommend fine-tuning our published models.
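For anyone unsure what raising that setting does, here is a minimal sketch of the general idea in plain Python. It is not this repository's training loop: the toy model `y = w * x` and the helper names `grad_mse` and `accumulated_grad` are made up for illustration. The only point is that averaging gradients over several small micro-batches gives the same gradient as one large batch, so a larger effective batch size fits in less memory.

```python
def grad_mse(w, xs, ys):
    """Gradient of the mean squared error for the toy model y = w * x."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Average the gradient over equally sized micro-batches."""
    # `steps` plays the role of gradient_accumulation_steps (hypothetical mapping).
    steps = len(xs) // micro_batch_size
    total = 0.0
    for i in range(steps):
        lo, hi = i * micro_batch_size, (i + 1) * micro_batch_size
        # Scale each micro-batch gradient by 1/steps before accumulating.
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / steps
    return total


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 0.5

full = grad_mse(w, xs, ys)                                # one batch of 4
accum = accumulated_grad(w, xs, ys, micro_batch_size=2)   # 2 steps of 2
print(full, accum)
```

Note that this equivalence assumes equally sized micro-batches and a loss averaged per sample. Layers whose behavior depends on batch statistics (batch normalization, for example) will not match exactly, which may be part of why the setting is untested here.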