Closed: JoaoGuibs closed this issue 2 months ago.
Hi @JoaoGuibs, thank you for reaching out. I think this is expected: deleting the index by itself can't reclaim the resources. You should also clear PyTorch's CUDA cache. Here is an example I ran, after which the memory usage was cleaned up:
import torch
import gc

gc.collect()              # run the garbage collector so the dropped index references are reclaimed
torch.cuda.empty_cache()  # release PyTorch's cached CUDA memory back to the driver
Before: (nvidia-smi screenshot showing GPU memory still allocated)
After: (nvidia-smi screenshot showing the memory cleaned up)
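For completeness, the same cleanup in the context of a GPU index would look like the sequence below. This is only a sketch with made-up sizes and names (d, res, index, xb): the point is that the Python references to the index and its resources object have to be dropped before gc.collect() and torch.cuda.empty_cache() can give the memory back.

import gc
import faiss
import torch

d = 64                                     # made-up dimensionality
res = faiss.StandardGpuResources()
index = faiss.GpuIndexFlatL2(res, d)       # GPU flat index, as in the issue
xb = torch.rand(100_000, d).numpy()        # some float32 vectors to occupy GPU memory
index.add(xb)

index.reset()              # drop the vectors stored inside the index
del index, res             # drop the Python references to the wrappers
gc.collect()               # reclaim the underlying C++/CUDA objects
torch.cuda.empty_cache()   # return PyTorch's cached blocks to the driver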
Thanks for the reply @junjieqi.
I realised I had not added the line import faiss.contrib.torch_utils; without it, the code fails with a different exception. Nonetheless, I cannot reproduce your results by adding the garbage collection or clearing the torch cache: some memory always remains allocated, and it is only released if I kill the Python process.
I also note from your second screenshot that you do not have a Python process running on the GPU. Are you able to reach zero memory usage while the Python process is still running as well?
@JoaoGuibs I think this is more related to Python's lifecycle management than to Faiss, since, as you can tell, the memory is cleared once you kill the Python process. I'm not sure whether the Python process keeps showing up in nvidia-smi or not; I thought it only shows while the process is running, so after my run finished it no longer appeared.
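As a side note on reading nvidia-smi here: it reports everything the process holds, including the CUDA context that lives for as long as the process does, while the PyTorch counters below only cover PyTorch's own allocator (allocations Faiss makes through StandardGpuResources are not counted there either). A small helper along these lines, with the made-up name report, can help tell live tensors, cached blocks, and process-level overhead apart:

import torch

def report(tag):
    # Memory held by live torch tensors vs. memory merely cached by the allocator.
    # Anything nvidia-smi shows beyond these numbers is the CUDA context or
    # allocations made outside PyTorch (e.g. by Faiss itself).
    allocated = torch.cuda.memory_allocated() / 2**20
    reserved = torch.cuda.memory_reserved() / 2**20
    print(f"{tag}: allocated={allocated:.1f} MiB, reserved={reserved:.1f} MiB")

report("after cleanup")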
This issue is stale because it has been open for 7 days with no activity.
This issue was closed because it has been inactive for 7 days since being marked as stale.
Summary
When using Faiss on the GPU (a flat index in this example), if a GPU OOM error occurs, some GPU memory remains in use even after resetting and deleting the index (nvidia-smi screenshot below). Is this expected? If so, what are the reasons for it?
Thanks in advance.
Platform
OS:
Faiss version: faiss-gpu, version 1.7.2
Installed from: pip
Faiss compilation options:
Running on: GPU
Interface: Python
Reproduction instructions
While running the following code, the memory usage at the breakpoint is shown in the screenshot below:
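The exact snippet is not reproduced here, but a minimal sketch of the kind of reproduction described (a GPU flat index driven to an OOM and then deleted, with all sizes and names made up) could look like this, assuming faiss-gpu and a CUDA device:

import gc
import faiss
import faiss.contrib.torch_utils  # lets Faiss accept torch CUDA tensors directly
import torch

d = 128
res = faiss.StandardGpuResources()
index = faiss.GpuIndexFlatL2(res, d)
xb = torch.rand(1_000_000, d, device="cuda")   # ~0.5 GiB of float32 vectors

try:
    for _ in range(1000):   # keep adding copies until the GPU runs out of memory
        index.add(xb)
except RuntimeError as e:
    print("GPU OOM:", e)

# Attempted cleanup, after which nvidia-smi still reports some usage.
index.reset()
del index, res, xb
gc.collect()
torch.cuda.empty_cache()

breakpoint()  # inspect nvidia-smi here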