Just wanted to thank you for your great work! I used GradCache to build state-of-the-art sentence embeddings (https://arxiv.org/abs/2202.08904). Thanks to GradCache, I could scale up batch sizes from 48 to 1024 for the model trained on NLI, improving its average performance on USEB by 4%~