yousefmoazzam closed this issue 6 months ago
The second option may be possible in `DezingingWrapper`. However, naively attempting to transfer the block to CPU after the projections have been processed (in order to transfer the projections back to CPU before the darks are processed) via `block.to_cpu()` causes the block as a whole to be seen as "on the CPU", because the `block.is_gpu` getter is defined in terms of the `block.is_cpu` getter:

https://github.com/DiamondLightSource/httomo/blob/d9afc1f04e29a375cee17d33033d75f47a99da34/httomo/runner/dataset.py#L108-L110

which in turn checks whether `self._data` has a `device` attribute or not:

https://github.com/DiamondLightSource/httomo/blob/d9afc1f04e29a375cee17d33033d75f47a99da34/httomo/runner/dataset.py#L104-L106

`self._data` would be the projections, and these are transferred to CPU by the `block.to_cpu()` call.

This in turn causes the `block.darks` getter to not transfer the darks to GPU, because it uses `block.is_gpu` internally to decide whether or not to transfer them, and `block.is_gpu` is `False` after the `block.to_cpu()` call that transferred the projections to CPU.
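The interaction can be sketched with a minimal stand-in (the `Block`/`GpuArray` classes below are hypothetical simplifications of `httomo/runner/dataset.py`, not the real implementation):

```python
class GpuArray:
    """Stand-in for a CuPy array: all that matters for this sketch is that
    it has a `device` attribute, which the CPU stand-ins below lack."""
    device = 0


class Block:
    def __init__(self, projections, darks):
        self._data = projections  # `is_cpu`/`is_gpu` inspect only this
        self._darks = darks

    @property
    def is_cpu(self):
        # mirrors the check described above: does `self._data` have `device`?
        return not hasattr(self._data, "device")

    @property
    def is_gpu(self):
        return not self.is_cpu

    def to_cpu(self):
        # transfers only the projections back to CPU (simplified)
        self._data = [0.0]

    @property
    def darks(self):
        # transfers darks to GPU only when the block reads as "on GPU"
        if self.is_gpu and not hasattr(self._darks, "device"):
            self._darks = GpuArray()
        return self._darks


block = Block(projections=GpuArray(), darks=[0.0])
block.to_cpu()  # move the processed projections off the GPU
print(block.is_gpu)                    # False: whole block now reads as "on CPU"
print(hasattr(block.darks, "device"))  # False: darks are never moved to GPU
```

Because `is_gpu` is derived solely from the projections array, moving the projections flips the state for the darks/flats getters as well.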
Note that even with 9f60287, an experimental attempt to transfer + process projections, darks, and flats one-by-one in sequence, the `remove_outlier` method requires enough GPU memory to hold both its input and its output in GPU memory, so the available GPU memory needs to be able to hold double the size of the darks or flats.
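As a back-of-envelope illustration of the doubling (the stack dimensions and dtype below are made up for the example, not taken from the issue):

```python
import numpy as np

# Hypothetical darks stack: 40 frames of 2160 x 2560 float32 pixels
darks_shape = (40, 2160, 2560)
bytes_per_px = np.dtype(np.float32).itemsize           # 4 bytes
darks_bytes = int(np.prod(darks_shape)) * bytes_per_px

# remove_outlier holding both input and output => 2x the stack size
required_bytes = 2 * darks_bytes

print(f"darks stack: {darks_bytes / 2**30:.2f} GiB")            # → 0.82 GiB
print(f"peak during remove_outlier: {required_bytes / 2**30:.2f} GiB")  # → 1.65 GiB
```

So even a modest darks/flats stack needs roughly twice its own size free on the GPU while being dezinged.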
On GPUs with small memory that is nevertheless large enough to hold all darks + all flats, but not always at the same time as holding a block in GPU memory (depending on the block size), the `remove_outlier` method fails with a CUDA OOM error.

With 20GB data, `pc0074` (which has a GPU with 2GB of memory) is able to hold all darks and all flats in GPU memory. But if the block size doesn't take the size of the darks/flats into account, then the block will be made too large for all the darks and all the flats to fit in GPU memory along with it.

In
`feature/transparent-file-store`, the state of the `DezingingWrapper` is such that it keeps a block, all darks, and all flats in GPU memory before execution is returned to the task runner (which would transfer data to CPU when it needs to be written to the data store):

https://github.com/DiamondLightSource/httomo/blob/d9afc1f04e29a375cee17d33033d75f47a99da34/httomo/method_wrappers/dezinging.py#L57-L62

This seems to disagree with what the memory estimation for `remove_outlier` now is (changes were recently made to the memory estimation of `remove_outlier` in #239), where accounting for darks/flats looks to be absent:

https://github.com/DiamondLightSource/httomo/blob/d9afc1f04e29a375cee17d33033d75f47a99da34/httomo/methods_database/packages/external/httomolibgpu/1.2/httomolibgpu.yaml#L12-L20

It seems to be the case that either:

- `remove_outlier` needs to account for darks/flats
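A rough sketch of what a darks/flats-aware estimate would need to cover, under the `DezingingWrapper` behaviour described above (all dimensions are hypothetical, and the "peak" model, input + output copies of the stack currently being dezinged, is my reading of the issue, not the actual estimator):

```python
import numpy as np

GiB = 2**30
px = np.dtype(np.float32).itemsize  # 4 bytes per pixel

# Hypothetical stacks (dims made up for illustration)
block_bytes = int(np.prod((180, 2160, 2560))) * px  # one block of projections
darks_bytes = int(np.prod((40, 2160, 2560))) * px
flats_bytes = int(np.prod((40, 2160, 2560))) * px

# DezingingWrapper keeps block + darks + flats resident on the GPU at once
resident = block_bytes + darks_bytes + flats_bytes

# remove_outlier additionally needs an output copy of whichever stack it is
# processing; the worst case here is the block, the largest of the three
peak = resident + block_bytes

print(f"resident: {resident / GiB:.2f} GiB, peak: {peak / GiB:.2f} GiB")
```

An estimator that only accounts for the block would undercount `peak` by the darks + flats terms, which matches the OOM behaviour reported above.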