ohmeow / blurr

A library that integrates huggingface transformers with the world of fastai, giving fastai devs everything they need to train, evaluate, and deploy transformer-specific models.
https://ohmeow.github.io/blurr
Apache License 2.0

Add GradientCheckpointing callback #68

Closed · DanteOz closed 2 years ago

DanteOz commented 2 years ago

Created a fastai callback to enable gradient checkpointing for Huggingface models.
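A minimal sketch of what such a callback might look like, assuming blurr's model wrapper exposes the underlying transformers model as `hf_model` and a transformers version that provides `gradient_checkpointing_enable()`; the actual implementation lives in the PR's notebooks:

```python
from fastai.callback.core import Callback


class GradientCheckpointing(Callback):
    "Enable Hugging Face gradient checkpointing while the Learner is fitting"

    def before_fit(self):
        # Assumption: blurr's model wrapper holds the transformers model as `hf_model`;
        # fall back to the model itself if that attribute isn't present
        hf_model = getattr(self.learn.model, "hf_model", self.learn.model)
        if getattr(hf_model, "supports_gradient_checkpointing", False):
            hf_model.gradient_checkpointing_enable()

    def after_fit(self):
        # Restore the default behavior once training is finished
        hf_model = getattr(self.learn.model, "hf_model", self.learn.model)
        if getattr(hf_model, "is_gradient_checkpointing", False):
            hf_model.gradient_checkpointing_disable()
```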

Remaining Tasks:

review-notebook-app[bot] commented 2 years ago

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.



DanteOz commented 2 years ago

@ohmeow The callback is done, and I've added a unit test for memory consumption. I see memory consumption drop from 8.9 GB to 3.8 GB for roberta-large with bs=4.
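A rough sketch of how such a memory check could be expressed with PyTorch's peak-memory counters; `build_learner` is a hypothetical helper standing in for however the test constructs the Learner:

```python
import torch


def peak_gpu_mem_gb(learn):
    # Reset the CUDA peak-memory counter, train for one epoch, and report the high-water mark in GB
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
    learn.fit(1)
    return torch.cuda.max_memory_allocated() / 1e9


# Hypothetical comparison of the same Learner with and without the callback:
# baseline = peak_gpu_mem_gb(build_learner())                                    # ~8.9 GB reported above
# checkpointed = peak_gpu_mem_gb(build_learner(cbs=[GradientCheckpointing()]))   # ~3.8 GB reported above
# assert checkpointed < baseline
```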

I'm unfamiliar with nbdev testing and documentation best practices, so I would appreciate feedback on the structure of the unit tests.

ohmeow commented 2 years ago

Thanks @DanteOz. I'll review the PR and get back to you in the next couple of days. Thanks for the welcome addition to blurr.

DanteOz commented 2 years ago

I'll also update the usage example to add the callback to the Learner at initialization rather than passing it to the fit methods. This will aid notebook workflows, since the user won't have to remember to pass the callback to methods like lr_find().
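A hedged sketch of that usage pattern; `dls` and `model` are placeholders for a blurr DataLoaders and wrapped Hugging Face model, not names from this PR:

```python
from fastai.text.all import *  # brings in Learner, lr_find, fit_one_cycle, etc.

# Hypothetical setup: `dls` and `model` stand in for a blurr DataLoaders and wrapped HF model
learn = Learner(dls, model, cbs=[GradientCheckpointing()])

# Because the callback is attached to the Learner, it is active for every training entry point:
learn.lr_find()
learn.fit_one_cycle(3, 2e-5)
```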