Closed Qubitium closed 4 months ago
Add notes to help model developers train/fine-tune Gemma more efficiently by reusing the reserved tokens rather than attempting to resize the BPE-based tokenizer.
ref: https://github.com/google/gemma_pytorch/issues/12
@pengchongjin @suryabhupa
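For illustration, here is a minimal sketch of the reuse idea: each custom marker is mapped onto a placeholder token that is already in the vocabulary, so no new token ids are created and the embedding matrix keeps its size. The `<unusedN>` naming pattern, the count of 99 reserved entries, and the helper names `assign_reserved_tokens` and `rewrite_text` are assumptions made for this example and are not taken from the notes in this PR. Check the actual tokenizer vocabulary before relying on them.

```python
# Sketch: map custom markers onto reserved placeholder tokens instead of
# adding new tokens to the tokenizer and resizing the embedding matrix.

# Assumption: the vocabulary reserves entries named "<unused0>", "<unused1>", ...
RESERVED_TEMPLATE = "<unused{}>"


def assign_reserved_tokens(custom_tokens, num_reserved=99):
    """Pair each custom marker with one reserved placeholder token.

    num_reserved is an assumed limit; verify it against the real vocabulary.
    """
    if len(set(custom_tokens)) != len(custom_tokens):
        raise ValueError("custom tokens must be unique")
    if len(custom_tokens) > num_reserved:
        raise ValueError("not enough reserved tokens for the requested markers")
    return {tok: RESERVED_TEMPLATE.format(i) for i, tok in enumerate(custom_tokens)}


def rewrite_text(text, mapping):
    """Replace every custom marker in text with its reserved token."""
    # Replace longer markers first so a marker that is a prefix of another
    # marker does not clobber it.
    for tok in sorted(mapping, key=len, reverse=True):
        text = text.replace(tok, mapping[tok])
    return text


# Hypothetical markers for a fine-tuning dataset.
mapping = assign_reserved_tokens(["<|tool_call|>", "<|tool_result|>"])
print(rewrite_text("<|tool_call|>search<|tool_result|>ok", mapping))
```

The rewritten text can then be fed to the unmodified tokenizer, and the same mapping has to be applied at inference time so the model sees the tokens it was trained on.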
Cool, thanks!