facebookresearch / d2go

D2Go is a toolkit for efficient deep learning
Apache License 2.0
826 stars 197 forks

Add preemption checkpointing to lightning tasks #666

Closed frabu6 closed 2 weeks ago

frabu6 commented 2 weeks ago

Summary: While debugging elevated preemption wastage in d2go, I came across a few long-running Pinocchio jobs that neither checkpoint on preemption nor have checkpointing instrumented. This diff addresses both issues.
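For readers unfamiliar with the pattern, here is a minimal sketch of preemption checkpointing in general. It is not the code from this diff: `PreemptionCheckpointer`, `train`, and the JSON state format are hypothetical names chosen for illustration, and it assumes the scheduler signals preemption with SIGTERM. The handler only sets a flag, and the loop saves state at the next safe point.

```python
import json
import os
import signal
import tempfile


class PreemptionCheckpointer:
    """Hypothetical helper: notes a preemption signal, saves state on request."""

    def __init__(self, path):
        self.path = path
        self.preempted = False

    def install(self, signum=signal.SIGTERM):
        # Assumption: preemption arrives as SIGTERM. Call from the main thread.
        signal.signal(signum, self._on_signal)

    def _on_signal(self, signum, frame):
        # Only set a flag here; the training loop does the actual I/O.
        self.preempted = True

    def save(self, state):
        # Write to a temp file, then rename, so a partial file is never loaded.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
        os.replace(tmp, self.path)


def train(checkpointer, total_steps, start_step=0):
    """Toy loop standing in for a real training task."""
    step = start_step
    while step < total_steps:
        step += 1  # placeholder for one unit of training work
        if checkpointer.preempted:
            # Save at a step boundary, then stop so the job can be rescheduled.
            checkpointer.save({"step": step})
            break
    return step
```

A job would call `install()` once at startup and pass the saved `step` back as `start_step` when it resumes. Deferring the write to the loop keeps the signal handler trivial and avoids saving mid-step.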

Differential Revision: D58669254

facebook-github-bot commented 2 weeks ago

This pull request was exported from Phabricator. Differential Revision: D58669254

facebook-github-bot commented 2 weeks ago

This pull request has been merged in facebookresearch/d2go@8eab506b12436bb354be9f28820bdde913127fbb.