bytedance / ParaGen

ParaGen is a PyTorch deep learning framework for parallel sequence generation.

fix bug of save_last_k_models #17

Closed zjersey closed 2 years ago

zjersey commented 2 years ago

Bug before this fix: `save(state_dict)` was called before `pop(0)`, so the saved `_save_last` had length k+1 instead of k.
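
The ordering issue can be illustrated with a minimal sketch. This is not ParaGen's actual code: the function names, the `save_last` list, and the returned snapshot (standing in for what `save(state_dict)` would write) are all hypothetical, chosen only to show why saving before `pop(0)` records k+1 entries.

```python
def record_checkpoint_buggy(save_last, name, k):
    """Hypothetical pre-fix ordering: snapshot first, trim afterwards."""
    save_last.append(name)
    saved_snapshot = list(save_last)  # stands in for save(state_dict)
    if len(save_last) > k:
        save_last.pop(0)  # trimming happens too late for the snapshot
    return saved_snapshot


def record_checkpoint_fixed(save_last, name, k):
    """Hypothetical post-fix ordering: trim first, then snapshot."""
    save_last.append(name)
    if len(save_last) > k:
        save_last.pop(0)  # drop the oldest entry before saving
    saved_snapshot = list(save_last)  # stands in for save(state_dict)
    return saved_snapshot


def run(record, k, names):
    """Feed checkpoint names through `record` and return the last snapshot."""
    save_last = []
    snapshot = []
    for name in names:
        snapshot = record(save_last, name, k)
    return snapshot


buggy_snapshot = run(record_checkpoint_buggy, 2, ["ckpt_1", "ckpt_2", "ckpt_3"])
fixed_snapshot = run(record_checkpoint_fixed, 2, ["ckpt_1", "ckpt_2", "ckpt_3"])
print(len(buggy_snapshot), len(fixed_snapshot))
```

With k = 2 and three checkpoints, the snapshot taken before trimming holds three entries, while the one taken after trimming holds two.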