kakaobrain / torchgpipe

A GPipe implementation in PyTorch
https://torchgpipe.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
800 stars 98 forks

[Question] What is the purpose of "always" checkpointing mode? #24

Open pritamdamania87 opened 3 years ago

pritamdamania87 commented 3 years ago

The user guide mentions the following:

Usually, checkpointing at the last micro-batch may not be useful because the saved memory will be reconstructed immediately. That’s why we choose 'except_last' as the default option.

It seems that the "always" mode might not have any benefit. Are there cases where it makes sense to use "always" instead of "except_last"?

sublee commented 3 years ago

I agree that "always" might have no benefit in most cases. However, in the special case chunks=1, it is useful for comparing performance with and without checkpointing: when there is only one chunk, that chunk is also the last chunk, so "except_last" would checkpoint nothing.
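
To make the chunks=1 case concrete, here is a minimal sketch of which micro-batches each mode would checkpoint, based only on the behaviour described in this thread and the user guide. The helper `checkpointed_microbatches` is hypothetical and is not part of torchgpipe's API; the mode names 'always', 'except_last', and 'never' are the documented options.

```python
from typing import List


def checkpointed_microbatches(chunks: int, mode: str) -> List[int]:
    """Return the indices of micro-batches that would be checkpointed.

    Hypothetical helper for illustration only; it models the described
    semantics of the checkpoint modes, not torchgpipe's actual code.
    """
    if chunks < 1:
        raise ValueError("chunks must be a positive integer")

    # Micro-batches with index < stop are checkpointed.
    stop_by_mode = {
        "always": chunks,           # every micro-batch
        "except_last": chunks - 1,  # all but the last micro-batch
        "never": 0,                 # no micro-batch
    }
    if mode not in stop_by_mode:
        raise ValueError("unknown checkpoint mode: %r" % mode)

    return list(range(stop_by_mode[mode]))


if __name__ == "__main__":
    for mode in ("always", "except_last", "never"):
        print(mode, checkpointed_microbatches(1, mode))
```

Under this model, with chunks=1 the modes 'except_last' and 'never' select the same (empty) set of micro-batches, so 'always' is the only mode that turns checkpointing on for a with/without comparison.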