khatwanimohit opened 1 month ago
Wouldn't RoPE theta scaling also need to be implemented for Llama 3.1 to work correctly? As is done in HF: https://huggingface.co/meta-llama/Meta-Llama-3.1-8B/blob/48d6d0fc4e02fb1269b36940650a1b7233035cbb/config.json#L21.
Or am I missing something here?
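For reference, the Llama 3.1 frequency scaling applied in the HF config (`rope_scaling` with `rope_type: "llama3"`) can be sketched roughly as below. This is a hedged illustration, not MaxText code: the function name and defaults mirror the parameters in the linked config (`factor=8.0`, `low_freq_factor=1.0`, `high_freq_factor=4.0`, `original_max_position_embeddings=8192`), and the logic follows the piecewise scheme used in HF transformers: high-frequency components are left unchanged, low-frequency components are divided by the scaling factor, and a smooth interpolation is used in between.

```python
import math

def llama3_scale_inv_freq(inv_freq, factor=8.0, low_freq_factor=1.0,
                          high_freq_factor=4.0, original_max_position=8192):
    """Sketch of Llama 3.1-style RoPE frequency scaling (illustrative only).

    inv_freq: list of inverse frequencies (one per rotary dimension pair).
    Returns the scaled inverse frequencies.
    """
    low_freq_wavelen = original_max_position / low_freq_factor
    high_freq_wavelen = original_max_position / high_freq_factor
    scaled = []
    for f in inv_freq:
        wavelen = 2 * math.pi / f
        if wavelen < high_freq_wavelen:
            # High-frequency band: keep the original frequency.
            scaled.append(f)
        elif wavelen > low_freq_wavelen:
            # Low-frequency band: stretch by the scaling factor.
            scaled.append(f / factor)
        else:
            # Medium band: smoothly interpolate between scaled and unscaled.
            smooth = (original_max_position / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor)
            scaled.append((1 - smooth) * f / factor + smooth * f)
    return scaled
```

With the defaults above, a short wavelength (large `inv_freq`) passes through unchanged, while a very long wavelength gets divided by 8, which is what extends the usable context window.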
@khatwanimohit I am testing the script for converting the Meta checkpoints. Everything looks fine, except that for some reason the file scanned_chkpt/0/items/checkpoint
is not written. This seems to be just a state file; the model weights appear to be stored in the bucket.
UPDATE: This appears to be just a status file, and since it is at checkpoint 0, it does not seem to matter. I can manually copy the file from Llama 3 to fix this.
Hi, what's the ETA on the PR? I wanted to test the models on MaxText.
Tested: http://shortn/_TVtieLHb4u http://shortn/_iIa7Kkdcj7