Edward-Sun / easy-to-hard

Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
https://arxiv.org/abs/2403.09472
BSD 3-Clause "New" or "Revised" License
95 stars, 10 forks

How do I convert the PPO-trained model (.pt) into hf format? #10

Closed: supermancmk closed this issue 1 month ago

supermancmk commented 2 months ago

How do I convert the PPO-trained model (.pt) into Hugging Face (hf) format?

I tried to convert it with scripts/convert_checkpoint_to_hf.py, using the following command:


python scripts/convert_checkpoint_to_hf.py \
    --tp_ckpt_name xxxx/path_to_ppo_model \
    --tokenizer_name EleutherAI/llemma_7b \
    --pretrain_name EleutherAI/llemma_7b \
    --save_name_hf xxxx/path_to_save

But I got the following error:

ValueError: Invalid key: tok_embeddings.weight

Also, could you provide the command for evaluating on MATH?

Edward-Sun commented 1 month ago

On which line is the error reported?

Also, I'm not sure the checkpoint at path_to_ppo_model should contain tok_embeddings at all, since the embeddings are frozen in this framework.
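
One way to check whether the checkpoint really contains tok_embeddings is to list the parameter names stored in it. The snippet below is a minimal sketch, not part of the repository. It assumes the .pt file is a single file readable by torch.load, and that the weights sit either at the top level or under a "model" key. The helper names (summarize_keys, find_keys, load_state_dict) and the "model" key are hypothetical, so adjust them to the actual checkpoint layout.

```python
import sys
from collections import Counter
from typing import Iterable, List, Mapping


def summarize_keys(keys: Iterable[str]) -> Counter:
    """Count parameter names by their first dotted component."""
    return Counter(k.split(".", 1)[0] for k in keys)


def find_keys(keys: Iterable[str], needle: str) -> List[str]:
    """Return the sorted parameter names that contain `needle`."""
    return sorted(k for k in keys if needle in k)


def load_state_dict(path: str) -> Mapping:
    """Load a checkpoint on CPU and return its parameter mapping."""
    import torch  # imported lazily so the helpers above work without torch

    ckpt = torch.load(path, map_location="cpu")
    # Assumption: weights are either top-level or nested under "model".
    if isinstance(ckpt, dict) and isinstance(ckpt.get("model"), dict):
        return ckpt["model"]
    return ckpt


if __name__ == "__main__" and len(sys.argv) > 1:
    state = load_state_dict(sys.argv[1])
    print(summarize_keys(state.keys()))
    print(find_keys(state.keys(), "tok_embeddings"))
```

Running it with the checkpoint path as the only argument prints how many entries each top-level prefix has, followed by any names containing tok_embeddings. An empty list would mean the key comes from somewhere other than the PPO checkpoint.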