Closed yang-su2000 closed 3 months ago
Thanks for the PR! I will try to take a look at it this week and provide comments.
Thank you, great that it could help!
Generally LGTM! Thanks again for the PR.
Not super necessary but have you considered sub-classing the original GradCache
class to reduce code duplication (in the name of software engineering best practice) ? If this is too convoluted / too much work, we can pull the code as is.
nit: pl_gradcache.py:360 the nullcontext
probably is not necessary
Thanks for the advice!
Not super necessary but have you considered sub-classing the original
GradCache
class to reduce code duplication (in the name of software engineering best practice) ? If this is too convoluted / too much work, we can pull the code as is.
Good suggestion, I did some work to make PLGradCache
a subclass of GradCache
now, and they passed sanity tests.
nit: pl_gradcache.py:360 the
nullcontext
probably is not necessary
I suppose you mean line 306, yes, it is no longer needed and has been removed.
Nice! If you don't feel like there's anything else to add, I'll go ahead and merge.
Yep, I think it is good to merge!
I developed an integration for GradCache with PyTorch Lightning, supporting Multi-GPU and Mixed-Precision training. The
readme.md
can be used as a reference to run the example code.The integration of GradCache with PyTorch Lightning is very non-trivial and requires quite a few changes to the original framework. I looked into this as one of my research projects relies heavily on PyTorch Lightning and memory-efficient contrastive training, and I hope that this can serve as a valuable reference for future users and researchers.
I'm open to making any necessary adjustments to this PR to align with the standards. I also extend my gratitude to the original authors of GradCache for their foundational work. Your feedback for improvement are welcome and appreciated!