balbasty / nitorch

Neuroimaging in PyTorch

TorchScript implementation of Push/Pull + spline coefficients #56

Closed balbasty closed 2 years ago

balbasty commented 2 years ago

I have reimplemented all the low-level push/pull utilities in TorchScript. TorchScript is a strongly typed subset of Python + PyTorch that gets just-in-time compiled. Its main advantage is the ability to fuse sequences of voxel-wise operations into a single CUDA kernel.
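To give a feel for what TorchScript code looks like, here is a minimal sketch of a 1D first-order "pull" written as a chain of voxel-wise operations. It is not the nitorch implementation: the function names `linear_weights` and `pull1d` are made up for illustration, and the clamped boundary handling is an assumption chosen to keep the example short.

```python
from typing import Tuple

import torch


@torch.jit.script
def linear_weights(coord: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
    # Element-wise operations like these are the kind TorchScript can fuse.
    lower = coord.floor()
    w_upper = coord - lower      # weight of the upper neighbour
    w_lower = 1.0 - w_upper      # weight of the lower neighbour
    return lower.long(), w_lower, w_upper


@torch.jit.script
def pull1d(signal: torch.Tensor, coord: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper: sample `signal` at fractional positions `coord`.
    n = signal.shape[0]
    lower, w_lower, w_upper = linear_weights(coord)
    # Assumption: out-of-bound indices are clamped to the nearest valid voxel.
    i0 = lower.clamp(0, n - 1)
    i1 = (lower + 1).clamp(0, n - 1)
    return signal[i0] * w_lower + signal[i1] * w_upper


signal = torch.tensor([0.0, 10.0, 20.0])
coord = torch.tensor([0.5, 1.25])
sampled = pull1d(signal, coord)
```

The type annotations are required because TorchScript is statically typed; apart from that, the code reads like ordinary PyTorch.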

It means that we now have a version of nitorch that works without needing to compile any C++/CUDA code. It is, however, quite a bit slower than the C++/CUDA version. Here's what I get when pulling a [192, 192, 192] image with 1st order splines:

| C | TorchScript CPU | CUDA | TorchScript GPU |
| --- | --- | --- | --- |
| 0.1 s | 1 s | 0.7 ms | 5 ms |

To install nitorch without compiling the C++/CUDA code, use:

NI_COMPILED_BACKEND="TS" python setup.py install|develop

By default, NI_COMPILED_BACKEND="C" and the C++/CUDA extensions are compiled. It is also possible to use NI_COMPILED_BACKEND="MONAI" to try MONAI's version of push/pull, but last time I checked, it was not available in the pip version of MONAI.

When nitorch is called, it first tries to load the C components, then MONAI, then the TorchScript implementation. It means that if you have used the develop mode and have the compiled code lying around, it will be used, even if you ran NI_COMPILED_BACKEND="TS" python setup.py develop afterwards.
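The loading order described above can be pictured as a simple "first import that succeeds wins" loop. This is only a sketch of the general pattern, not nitorch's actual loader: `first_available` is a hypothetical helper, and the module names are stand-ins (the first two are deliberately missing, and `json` plays the role of the always-available TorchScript fallback).

```python
import importlib
from typing import Optional, Sequence, Tuple


def first_available(candidates: Sequence[Tuple[str, str]]) -> Optional[str]:
    """Return the label of the first candidate whose module can be imported."""
    for label, module_name in candidates:
        try:
            importlib.import_module(module_name)
        except ImportError:
            continue  # this backend is not installed, try the next one
        return label
    return None


# Stand-in module names, for illustration only.
order = [
    ("C", "hypothetical_compiled_ext"),   # compiled extension, absent here
    ("MONAI", "hypothetical_monai_ext"),  # MONAI backend, absent here
    ("TS", "json"),                       # pure-Python fallback, always importable
]
chosen = first_available(order)
print(chosen)
```

With this pattern, a compiled extension left behind by an earlier build is picked up first, which is the behaviour described above.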

To force nitorch to use the TorchScript code, a solution is to set the environment variable NI_COMPILED_BACKEND="TS" before importing nitorch.
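In a script or notebook, this can be done from Python itself, as long as the variable is set before the first import of nitorch. A minimal sketch (the `try`/`except` is only there so the snippet also runs on a machine where nitorch is not installed):

```python
import os

# Must be set before the first `import nitorch` in the process.
os.environ["NI_COMPILED_BACKEND"] = "TS"

try:
    import nitorch as ni  # noqa: F401
    have_nitorch = True
except ImportError:
    have_nitorch = False
```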


I have also ported bsplinc from SPM. It is a prefilter that returns interpolating spline coefficients, which are needed to perform higher-order (> 1) resampling. You can use either ni.spatial.spline_coeff or ni.spatial.spline_coeff_nd. It is only implemented for the boundary conditions dft and dct1.

Filtering a [192, 192, 192] volume takes about 170 ms on the CPU and 20 ms on the GPU.
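To illustrate what the prefilter computes, here is a small NumPy sketch for a 1D cubic spline with periodic boundaries (assuming that is what dft denotes). It finds coefficients c such that (c[i-1] + 4 c[i] + c[i+1]) / 6 equals the input at every sample. It solves the system by division in the Fourier domain rather than with the recursive filter used in bsplinc, and the function names are made up for this example.

```python
import numpy as np


def cubic_coeff_periodic(values):
    """Interpolating cubic B-spline coefficients, periodic boundary (sketch)."""
    f = np.asarray(values, dtype=float)
    n = f.shape[0]
    # Cubic B-spline sampled at integer offsets -1, 0, +1: 1/6, 4/6, 1/6.
    kernel = np.zeros(n)
    kernel[0] += 4.0 / 6.0
    kernel[1 % n] += 1.0 / 6.0
    kernel[(n - 1) % n] += 1.0 / 6.0
    # Circular deconvolution: divide the spectra, then transform back.
    return np.real(np.fft.ifft(np.fft.fft(f) / np.fft.fft(kernel)))


def evaluate_at_knots(coeff):
    """Evaluate the cubic spline at the integer sample positions."""
    return (np.roll(coeff, 1) + 4.0 * coeff + np.roll(coeff, -1)) / 6.0


signal = np.array([0.0, 1.0, 4.0, 9.0, 4.0, 1.0])
coeff = cubic_coeff_periodic(signal)
recovered = evaluate_at_knots(coeff)
```

The coefficients differ from the samples themselves, which is why resampling with order > 1 directly on the raw image smooths it instead of interpolating it.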

@brudfors