Thanks for pointing it out. If you are using the code within Tensorflow, this is not a problem, because the memory allocation is handled by Tensorflow. It is only a problem if you are using this as a standalone package. I will fix it when I get the time, or if you would like to help out, please submit a pull request.
Best,
Miguel
Hey Miguel, I'd like to help you. I will create a pull request in the next few days. I am using the Permutohedral Lattice for a PyTorch backend.
Greetings,
Sebastian
Thanks,
I don't know how PyTorch's GPU memory allocation works: does it have its own allocator, or does it rely on plain C++/CUDA allocation? In Tensorflow, the Tensorflow memory allocator grabs most of the GPU memory (around 95%) at the start, independently of whether it is actually needed. Because of this, when I used C++ allocation instead of the Tensorflow allocator I would always run out of memory. You should check whether the same happens in PyTorch.
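If PyTorch turns out to manage GPU memory itself, the fix would be to route the lattice's buffers through its allocator rather than through raw cudaMalloc. A minimal sketch, assuming the libtorch C++ API (allocate_scratch is a hypothetical helper, not part of this repository):

```cpp
#include <torch/torch.h>

// Allocate the lattice's scratch buffer through PyTorch's CUDA caching
// allocator, so it shares the same memory pool as the surrounding model
// instead of competing with it through direct cudaMalloc calls.
torch::Tensor allocate_scratch(int64_t n_elements) {
    return torch::empty(
        {n_elements},
        torch::TensorOptions().dtype(torch::kFloat32).device(torch::kCUDA));
}

// Usage inside the kernel wrapper: keep the Tensor alive for as long as
// the raw device pointer is in use.
//   torch::Tensor scratch = allocate_scratch(n);
//   float* d_ptr = scratch.data_ptr<float>();
```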
Best,
Miguel
Hello,
Inside the memory allocator, lines 96, 102 and 106 contain naked CUDA calls. They don't deliver any debug information: if, for example, the GPU runs out of memory, the program only exits later, inside the createLattice function, which makes the error hard to find. A simple solution would be to add a check after each CUDA call, as sketched below.
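A minimal sketch of such a check, assuming the CUDA runtime API (the macro name CUDA_CHECK is hypothetical):

```cpp
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Wrap every raw CUDA call so a failure reports the file, line and CUDA
// error string immediately, instead of surfacing later in createLattice.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err = (call);                                         \
        if (err != cudaSuccess) {                                         \
            fprintf(stderr, "CUDA error at %s:%d: %s\n",                  \
                    __FILE__, __LINE__, cudaGetErrorString(err));         \
            exit(EXIT_FAILURE);                                           \
        }                                                                 \
    } while (0)

// Example: CUDA_CHECK(cudaMalloc(&device_buffer, n_bytes));
```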
Greetings,
Sebastian