Closed. VolkerH closed this 5 years ago.
Thanks man! Sounds good to me and feel free to make that the default mode. That seems like a better choice since there was another issue or two from people hitting GPU memory issues immediately with so much padding.
Ok, this was more difficult than I thought ... I naively assumed I would be able to mix in some numpy. I've now implemented the rainbow table lookup with TensorFlow ops (though with a less efficient search) and added some tests as well.
Some comments:
The largest number in the precomputed table is 14224896000, with no fallback to calculate arbitrary numbers beyond it. Even for a 1-d array, that number of elements is likely to exceed GPU VRAM for a while. But then again, programmers in 1980 didn't think their programs would still be used in Y2K. However, a larger rainbow table is easily created.
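For reference, the table is just the sorted 7-smooth numbers (products of 2, 3, 5 and 7), and the lookup is a binary search for the next entry at or above the requested size. A minimal NumPy sketch with hypothetical names (the actual PR implements the search with TensorFlow ops, and ships a much larger table):

```python
import numpy as np

def smooth_numbers(limit):
    """All 7-smooth numbers (products of 2, 3, 5, 7) up to `limit`, including 1."""
    nums = []
    p2 = 1
    while p2 <= limit:
        p3 = p2
        while p3 <= limit:
            p5 = p3
            while p5 <= limit:
                p7 = p5
                while p7 <= limit:
                    nums.append(p7)
                    p7 *= 7
                p5 *= 5
            p3 *= 3
        p2 *= 2
    return np.array(sorted(nums))

# precomputed "rainbow table" (the PR's goes up to 14224896000)
TABLE = smooth_numbers(10**6)

def next_smooth(n):
    """Smallest table entry >= n; raises IndexError past the table (no fallback)."""
    return int(TABLE[np.searchsorted(TABLE, n, side="left")])
```

For example, `next_smooth(129)` returns 135 and `next_smooth(513)` returns 525, both far tighter than the next powers of two.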
I have not made it the default option yet.

LGTM -- I'm not sure why that old test is failing in the build, but I don't see how any of those changes would interfere with existing functionality, so I'll patch it up. I think as part of the fix for https://github.com/hammerlab/flowdec/issues/17, I'll want singleton dimensions to not get squeezed out (i.e. dropped), if I'm understanding you correctly, since that should make it easier to track spatial and batch dimensions.
Thanks again for putting this together!
Thanks for including the pull request
> I think as a part of the fix for #17, I'll want singleton dimensions to not get squeezed out (i.e. dropped), if I'm understanding you correctly, since that should make it easier to track spatial and batch dimensions.
Ok, if I understand you correctly, this might require a (simple) change to my code, which currently turns a singleton dimension into a "2":
(1, 128, 128) -> (2, 128, 128)
The simple fix is to include the number "1" as the first list element of the rainbow table. In addition, this will also require a change to this test (changing 2 to 1): https://github.com/hammerlab/flowdec/blob/6b92c32036ff6e3746c15268bb2902519fef0cc5/python/flowdec/fft_utils_test.py#L49
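The effect of that first list element can be sketched as follows (with a hypothetical `pad_shape` helper standing in for the actual padding logic):

```python
from bisect import bisect_left

# 7-smooth numbers up to 128 (exponent bounds chosen so nothing is missed)
table = sorted({2**a * 3**b * 5**c * 7**d
                for a in range(8) for b in range(5)
                for c in range(4) for d in range(3)
                if 2**a * 3**b * 5**c * 7**d <= 128})

def pad_shape(shape, table):
    """Round each dimension up to the next table entry (hypothetical helper)."""
    return tuple(table[bisect_left(table, s)] for s in shape)

without_one = [n for n in table if n > 1]
print(pad_shape((1, 128, 128), without_one))  # (2, 128, 128): singleton inflated
print(pad_shape((1, 128, 128), table))        # (1, 128, 128): singleton preserved
```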
This is a draft pull request to let you know I'm working on this, so we don't duplicate effort by accident.
I found the log2 padding to be very wasteful with VRAM in worst-case scenarios, e.g. an input array of shape (129, 513, 513) will end up as (256, 1024, 1024), which is almost 8 times as large. In addition to wasting valuable VRAM, this also slows down all the other computations that have to process the extra voxels.
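The worst-case blowup quoted above is easy to check with a quick plain-Python sketch:

```python
import math

shape = (129, 513, 513)

# log2 mode: round every dimension up to the next power of two
log2_shape = tuple(1 << math.ceil(math.log2(s)) for s in shape)

ratio = math.prod(log2_shape) / math.prod(shape)
print(log2_shape)        # (256, 1024, 1024)
print(round(ratio, 2))   # ~7.91x more voxels to store and process
```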
I'm adding an option, OPM_2357, using a precomputed rainbow table (whether this is a good approach is open to debate, but it is fast and has no dependencies).
Time to go to bed, will hopefully test tomorrow.