Closed WouterBesse closed 2 years ago
Okay, I figured it out for now. I set the CPU as the inference device so it can use my normal RAM. Seems to do the trick.
For anybody who wants to do this as well, don't forget to add `map_location=device` to both of the `torch.load()` calls in the `predict()` function in `inference.py`.
Hello, I was trying to train my own model with this algorithm but I ran across a problem when trying to use inference with this self trained model:
RuntimeError: CUDA out of memory. Tried to allocate 2.32 GiB (GPU 0; 10.92 GiB total capacity; 8.45 GiB already allocated; 1.80 GiB free; 8.48 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CON
It seems to be a CUDA memory problem. Thinking at first that my personal GPU might simply not have enough memory, I tried my university's server, which hosts 8 GPUs (the log above is from that server), but it ran into the same problem. A Google Colab Pro GPU with 15 GB of VRAM also reported the same error.
I've tried setting different batch sizes in `params.py` to see if that would solve the problem, but I can't find any place in `model.py` or `inference.py` where the batch size is actually used, and changing it doesn't seem to affect memory usage.
I've also tried adding `torch.cuda.empty_cache()` in multiple places to see if that could help, but sadly it didn't. So far I can't find anything in the code that would cause this problem. Does anyone else experience the same issue, or is there a solution or setting I'm not seeing?
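For what it's worth, `empty_cache()` only releases cached blocks back to the driver; it can't free memory that live tensors still hold. A frequent cause of inference-time OOM is running the model with autograd enabled, so activations are kept for a backward pass that never happens. I don't know whether this repo's `predict()` already guards against that, but the general pattern looks like this (the `Linear` model here is just a stand-in):

```python
import torch

# Stand-in model; the real one would be loaded from a checkpoint.
model = torch.nn.Linear(64, 64)
x = torch.randn(8, 64)

# torch.no_grad() disables gradient tracking, so intermediate
# activations are freed immediately instead of being retained
# for backpropagation.
with torch.no_grad():
    y = model(x)

# With autograd disabled, the output carries no gradient history.
print(y.requires_grad)  # False
```

If `predict()` already runs under `torch.no_grad()` (or `torch.inference_mode()`), then the memory really is going to the model and its inputs, and a smaller input or CPU inference is the remaining option.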
Could this perhaps be a training problem as well, meaning I should train with a lower batch size to make inference less memory-hungry?
I'll add my parameters as well, just in case that's of any help: