facebookresearch / metaseq

Repo for external large-scale work
MIT License
6.51k stars 726 forks

Andy/drop mseq req from reshard #715

Closed andrewPoulton closed 1 year ago

andrewPoulton commented 1 year ago

The tokenization_cache in the train_iterator state_dict can only be unpickled in an environment where metaseq is importable.

This adds a flag to remove it when resharding, so that consolidated checkpoints can be loaded outside a metaseq environment.

Also fixes a bug in a named argument.
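For illustration only, here is a minimal sketch of the idea, not the code in this PR. It assumes the checkpoint is a plain nested dict and that the cache lives at extra_state -> train_iterator -> tokenization_cache; that layout and the function name drop_tokenization_cache are hypothetical and may not match metaseq.

```python
def drop_tokenization_cache(checkpoint):
    """Return a copy of `checkpoint` without the tokenization cache.

    Assumed (hypothetical) layout:
        checkpoint["extra_state"]["train_iterator"]["tokenization_cache"]
    The input dict is not modified.
    """
    extra_state = checkpoint.get("extra_state")
    if not extra_state or "train_iterator" not in extra_state:
        # Nothing to strip; return the checkpoint unchanged.
        return checkpoint

    # Rebuild the iterator state without the cache entry.
    train_iterator = {
        key: value
        for key, value in extra_state["train_iterator"].items()
        if key != "tokenization_cache"
    }
    return {
        **checkpoint,
        "extra_state": {**extra_state, "train_iterator": train_iterator},
    }


# Toy checkpoint; object() stands in for a value that needs metaseq to unpickle.
ckpt = {
    "model": {"w": [1.0, 2.0]},
    "extra_state": {
        "train_iterator": {"epoch": 3, "tokenization_cache": object()},
    },
}
cleaned = drop_tokenization_cache(ckpt)
```

The sketch returns a new dict rather than mutating the input, so the original state is still available if the flag is off.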

suchenzang commented 1 year ago

I was testing whether rebasing on main would resolve the GPU test failures, which apparently adds me to all commits :( Nothing here should affect the GPU tests, so this is an odd failure...