luyug / GradCache

Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint
Apache License 2.0
327 stars 19 forks

Great work! Helped create SOTA embeddings #10

Closed by Muennighoff 2 years ago

Muennighoff commented 2 years ago

Just wanted to thank you for your great work! I used GradCache to build state-of-the-art sentence embeddings (https://arxiv.org/abs/2202.08904). Thanks to GradCache, I could scale the batch size from 48 to 1024 for the model trained on NLI, improving its average performance on USEB by about 4%.