The CLOOB paper mentions that it used cuML's logistic regression with the L-BFGS solver to take advantage of GPUs for efficiency. My implementation works fine on small datasets (e.g., CIFAR), but I hit a CUDA out-of-memory error when dealing with large-scale ImageNet.
I have been stuck on this for quite a while, and I cannot find useful guidance in the documentation or elsewhere online. Could you provide a few code examples showing how to fix this problem?
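For context, here is a minimal sketch of the kind of mini-batch pattern I am hoping to adapt, so the features never have to sit on the device all at once. This uses NumPy on CPU purely as a stand-in for cuML/CuPy arrays and plain SGD instead of L-BFGS; all function and parameter names here are illustrative, not from the paper or from cuML:

```python
import numpy as np

def train_logreg_minibatch(X, y, n_classes, lr=0.1, epochs=5,
                           batch_size=256, seed=0):
    """Multinomial logistic regression via mini-batch SGD.

    Processing one batch at a time keeps peak memory proportional to
    batch_size rather than to the full dataset; the same idea applies
    on GPU by copying each batch to the device just before use.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            xb, yb = X[idx], y[idx]
            logits = xb @ W + b
            logits -= logits.max(axis=1, keepdims=True)  # numerical stability
            p = np.exp(logits)
            p /= p.sum(axis=1, keepdims=True)            # softmax probabilities
            p[np.arange(len(idx)), yb] -= 1.0            # gradient of cross-entropy
            W -= lr * (xb.T @ p) / len(idx)
            b -= lr * p.mean(axis=0)
    return W, b
```

What I would like to know is whether something equivalent is possible with cuML's L-BFGS path (or whether I have to fall back to a batched solver like this one) when the feature matrix does not fit in GPU memory.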