allenai / allenact

An open source framework for research in Embodied-AI from AI2.
https://www.allenact.org

wandb support, callback func for PipelineStage, and cache handling #382

Closed KuoHaoZeng closed 1 month ago

KuoHaoZeng commented 1 month ago
  1. Track the gradient norm before clipping.
  2. Add sampler_select to remove the KV cache entries of finished processes during evaluation.
  3. Support uploading checkpoints to wandb.
  4. Support downloading checkpoints from wandb for evaluation and for resuming training.
  5. Add a callback function that lets engine attributes change between PipelineStages, for example to switch the optimizer.
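Items 1 and 2 can be illustrated with a minimal, framework-free sketch. The helper names `clip_grad_norm` and `select_cache` are hypothetical; this is not the PR's code or AllenAct's API. It only shows the idea: measure the global gradient norm first, return that pre-clipping value for logging, and keep only the cache rows of samplers that are still running. In PyTorch, `torch.nn.utils.clip_grad_norm_` similarly returns the total norm computed before clipping.

```python
import math
from typing import Any, List, Sequence


def clip_grad_norm(grads: List[List[float]], max_norm: float, eps: float = 1e-6) -> float:
    """Scale `grads` in place so their global L2 norm is at most `max_norm`.

    Returns the norm measured BEFORE clipping, which is the value to log.
    (Hypothetical helper; plain lists stand in for gradient tensors.)
    """
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    scale = max_norm / (total_norm + eps)
    if scale < 1.0:  # only shrink gradients, never enlarge them
        for grad in grads:
            for i in range(len(grad)):
                grad[i] *= scale
    return total_norm


def select_cache(cache: Sequence[Any], keep: Sequence[int]) -> List[Any]:
    """Keep only the cache rows whose sampler index is in `keep`.

    (Hypothetical helper; one cache row per sampler is an assumption.)
    """
    return [cache[i] for i in keep]


grads = [[3.0], [4.0]]
pre_clip_norm = clip_grad_norm(grads, max_norm=1.0)  # log this value
remaining = select_cache(["kv0", "kv1", "kv2"], keep=[0, 2])  # sampler 1 finished
```

Logging the pre-clipping norm matters because the post-clipping norm is capped at `max_norm` and so hides exploding gradients.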

KuoHaoZeng commented 1 month ago

A few updates:

  1. Fix pytest.yaml by downgrading torchvision to torchvision>=0.7.0,<=0.16.2 in setup.py.
  2. Add a black lint step to the actions.
  3. Make checkpoint saving on every host optional.
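
As a rough illustration of what the torchvision range in item 1 admits, here is a simplified check. The function names are hypothetical, and the parsing is deliberately naive: it is not a full PEP 440 implementation and would reject pre-release strings such as `0.17.0rc1`.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '0.16.2' into (0, 16, 2), dropping a local suffix such as '+cu118'."""
    core = version.split("+", 1)[0]
    return tuple(int(part) for part in core.split("."))


def satisfies_pin(version: str, low: str = "0.7.0", high: str = "0.16.2") -> bool:
    """Return True if low <= version <= high (both bounds inclusive)."""
    return parse_version(low) <= parse_version(version) <= parse_version(high)


allowed = satisfies_pin("0.16.2")   # upper bound is inclusive
too_new = satisfies_pin("0.17.0")   # above the pinned range
```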