Open transmissions11 opened 1 year ago
https://github.com/jzhang38/TinyLlama/blob/0fcf9b61130f189b78747b0b013262c72f01286a/pretrain/tinyllama.py#L199C8-L208
Looks like Lightning actually makes resuming pretty easy? https://lightning.ai/docs/pytorch/stable/common/checkpointing_basic.html#resume-training-state
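For reference, a rough sketch of what that could look like. The linked docs page resumes full training state via `trainer.fit(model, ckpt_path=...)`. Everything else here is an assumption for illustration: the helper names `latest_checkpoint` and `fit_with_resume` are hypothetical, the `*.ckpt` naming is assumed, and this presumes a `Trainer`-based loop, which may not match how the linked script is structured.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(ckpt_dir: str) -> Optional[str]:
    """Return the most recently modified *.ckpt file in ckpt_dir, or None if there is none."""
    # Assumption: checkpoints are saved as *.ckpt files directly inside ckpt_dir.
    candidates = sorted(Path(ckpt_dir).glob("*.ckpt"), key=lambda p: p.stat().st_mtime)
    return str(candidates[-1]) if candidates else None


def fit_with_resume(trainer, model, ckpt_dir: str) -> Optional[str]:
    """Call trainer.fit, resuming from the newest checkpoint when one exists.

    `trainer` is expected to behave like a Lightning `Trainer`; passing
    ckpt_path=None simply starts training from scratch.
    """
    ckpt_path = latest_checkpoint(ckpt_dir)
    trainer.fit(model, ckpt_path=ckpt_path)
    return ckpt_path
```

Usage would be something like `fit_with_resume(trainer, model, "out/checkpoints")`, where the directory name is made up. The point is just that the checkpoint path is the only thing the caller has to track; the restore itself is left to Lightning.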