PhilipMay opened 2 years ago
I guess all_embeddings = torch.stack(all_embeddings)
should be done on CPU and not on GPU?
Putting this before the "stack" might fix the bug: all_embeddings = [e.cpu() for e in all_embeddings]
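To make the suggestion concrete, here is a minimal, hypothetical sketch of the workaround (the randomly generated list below just stands in for the per-sentence embeddings that encode collects internally):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in for the per-sentence embeddings collected inside encode
# (hypothetical data; in the library each element is one embedding vector).
all_embeddings = [torch.randn(768, device=device) for _ in range(1000)]

# Workaround: move every tensor to CPU first, so torch.stack allocates
# the combined matrix in host RAM instead of on the GPU.
all_embeddings = [e.cpu() for e in all_embeddings]
all_embeddings = torch.stack(all_embeddings)  # shape (1000, 768), on CPU
```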
@nreimers the solution above works for me and fixes the issue. I am not 100% sure of the side effects. Is it ok to move all tensors in the list from GPU to CPU?
What do you think? Should I create a PR?
Many thanks Philip
Hi @PhilipMay, sadly it has side effects, and whether you want them depends on the use case.
If your GPU has enough memory, you want to keep the tensors on the GPU, because moving them between GPU and CPU adds transfer overhead, and any follow-up computation on the embeddings (e.g. cosine similarity) is much faster on the GPU.
So you only want this line if you run out of GPU memory. Maybe an option for this would be needed.
Also, torch.stack currently doubles the memory requirement, as at some point both the old tensors and the new stacked tensor exist at the same time.
Maybe a better solution would be to create the final matrix up-front in the encode method and write the generated embeddings into this result matrix? Then we wouldn't have the overhead of duplicating all embeddings.
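A rough sketch of that idea (not the library's actual implementation; encode_preallocated, embedding_dim and output_device are made up for illustration, and model.encode(..., convert_to_tensor=True) is assumed to return a (batch, dim) tensor as in sentence-transformers):

```python
import torch

def encode_preallocated(model, sentences, batch_size=32,
                        embedding_dim=768, output_device="cpu"):
    # Create the final matrix once, up-front, on the desired device ...
    out = torch.empty(len(sentences), embedding_dim, device=output_device)
    for start in range(0, len(sentences), batch_size):
        batch = sentences[start:start + batch_size]
        emb = model.encode(batch, convert_to_tensor=True)
        # ... and copy each encoded batch into its slice, so the embeddings
        # are never duplicated by a torch.stack at the end. output_device
        # would be the "option" mentioned above: keep results on the GPU
        # if they fit, or move them to CPU RAM if you run out of memory.
        out[start:start + len(batch)] = emb.to(output_device)
    return out
```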
Hey @PhilipMay! Thank you for providing the fix.
I was wondering whether you encountered an issue like this after applying the fix: the task finishes according to the progress bar, but the Jupyter cell keeps running (still showing an asterisk)?
No, I can't remember seeing anything like that.
Hi,
I am using util.paraphrase_mining on 3,463,703 sentences with a 16 GB GPU, and I am getting a CUDA out of memory error:
I am using a model based on xlm-r-distilroberta-base-paraphrase-v1 and the following packages: