jzhoulab / ddsm

Dirichlet Diffusion Score Model for Biological Sequence Generation.

How to control GPU allocations in high-dimensional pre-sampling? #3

Closed: mkarikom closed this issue 11 months ago

mkarikom commented 11 months ago

Hi, how can I control how much GPU RAM is allocated during pre-sampling? I've noticed that pre-sampling categorical data with more than 4-5 dimensions needs a lot of memory. For instance, although the 2- and 4-dimensional examples (promoter and bernoulli) run fine, I get the following error when running the sudoku (9-dim) presampling on a 24 GB GPU:

```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 5.96 GiB. GPU 0 has a total capacty of 23.65 GiB of which 5.27 GiB is free. Process 643229 has 18.34 GiB memory in use. Of the allocated memory 17.89 GiB is allocated by PyTorch, and 9.78 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
PavelAvdeyev commented 11 months ago

Hello,

You can reduce the number-of-samples parameter (-n 100000) from 100000 to 10000 (or 1000). After that, you can run presample_noise.py several times if you want more samples. Each run generates a file containing a torch tensor saved with torch.save. You can combine these files into one tensor with torch.cat and save it again as a single file. As a result, you will have plenty of samples without running into GPU memory problems.
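A minimal sketch of the merge step (the file names here are hypothetical, and I'm assuming samples are stacked along the first dimension; substitute whatever presample_noise.py actually wrote to disk):

```python
import torch

# Hypothetical output names from repeated presample_noise.py runs --
# replace with the actual file names on your machine.
paths = ["noise_run0.pth", "noise_run1.pth", "noise_run2.pth"]

# Load each chunk onto the CPU so the merge never touches GPU memory.
chunks = [torch.load(p, map_location="cpu") for p in paths]

# Concatenate along the sample (first) dimension and write one file.
combined = torch.cat(chunks, dim=0)
torch.save(combined, "noise_combined.pth")
```

Loading with map_location="cpu" and concatenating on the CPU keeps the GPU free during the merge, so only the individual presampling runs are bounded by GPU memory.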

I hope it helps.

Best, Pavel

mkarikom commented 11 months ago

Ahh, that makes sense, since we can still take the same number of steps in each batch.

thanks!