Muennighoff closed 1 year ago
Thanks for the fix! I think we could still have some cases where stop tokens are left in, since we compute the start length per batch, but this should improve things for datasets whose prompt sizes vary widely.