AIM-Harvard / pyradiomics

Open-source python package for the extraction of Radiomics features from 2D and 3D images and binary masks. Support: https://discourse.slicer.org/c/community/radiomics
http://pyradiomics.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
1.16k stars 500 forks source link

Profiling PyRadiomics for very large segmentations #256

Closed JoostJM closed 7 years ago

JoostJM commented 7 years ago

Profiling PyRadiomics using the brain1 testcase (256x256x25 voxels) with full-mask

Main bottlenecks:

profiling_full_mask

cc @Radiomics/developers

pieper commented 7 years ago

@blezek I mentioned to @Radiomics/developers that you'd run into performance issues trying to run radiomics on whole brains so @JoostJM did some profiling.

pieper commented 7 years ago

From a quick look at the glszm code it looks like there's a lot of room to optimize the implementation.

JoostJM commented 7 years ago

In the c code? Do you have a general idea how to tackle it?

jcfr commented 7 years ago

In the c code? Do you have a general idea how to tackle it?

May be you could look into using threads

That said, using threading libraries with vanilla c-code may be challenging to get right.

I think that implementing ITK filters could be helpful (e.g as ITK remote module), you would then be able to leverage the associated cross-platform image processing and threading API. These filters could then be be exposed through SimpleITK

@thewtex @jbvimort I wonder if you have any insights ...

pieper commented 7 years ago

It would be best to do a line-by-line profiling, but my guess is that this set of nested loops is pretty slow. Using numpy.where can be slow and will return long lists of coordinates.

I agree with Jc that this operation could be threaded since each pixel should be independent. Also a SimpleITK implementation of the same operation could be way faster. But I'd start by doing the profiling and see if there are ways to just optimize the existing python code.

fedorov commented 7 years ago

@pieper it appears from the profiling that Joost already did that _calculateCMatrix accounts for 77% of the time (if I interpret the profiling summary correctly):

image

pieper commented 7 years ago

I see, you are saying it is probably this call to this code. Yes, that looks pretty compute intensive.

If this feature is really informative, and large image masks are an important use case, then it could be that the "simple C library" approach is not enough, at least as currently written. It's possible the C code could be made more efficient or we could explore other options like threading or SimpleITK implementation.

JoostJM commented 7 years ago

@pieper @fedorov, it's indeed this call. I'll take a look to see if it can be made more efficient.

If I can, I'd like to avoid implementing multithreading here for 2 reasons

  1. This will make the C code a lot more complex
  2. When I extract features for multiple patients, I increase performance by multithreading on a patient level (each patient 1 thread). In this case I also set SimpleITK to single thread to keep the patient extraction single thread.
JoostJM commented 7 years ago

On a side note, the main reason c GLSZM takes so much time is because it is computationally intensive and is called 12 times (due to the different derived images). The call to SimpleITK's labelshapestatistics image filter is the call that takes the most time for a single call, but is only called once per extraction (as it is part of shape and therefore only calculated on original image type).

JoostJM commented 7 years ago

As to the SimpleITK call taking so much time. This is due to the compute Feret diameter (max 3D diameter). If you run labelshapestatistics image filter with and without this diameter there is a time difference of 271053 ms vs 21 ms for the full brain1 mask.

thewtex commented 7 years ago

There is now a Python package available, itk-texturefeatures, that could be helpful here.

pieper commented 7 years ago

@JoostJM did you have a look at itk-texturefeatures? It would be interesting to know how the speed compares and if the feature sets are comparable to what we have now in the pyradiomics c code.

https://github.com/InsightSoftwareConsortium/ITKTextureFeatures/tree/2b4544fe39e0ece9007a0d87c396c8586c6f4df5/example

JoostJM commented 7 years ago

@pieper, Just returned from my holiday, but this is certainly interesting. I did some profiling using PyRadiomics and similar mask as were used by the ITK texture features (implementing a voxel wise calculation). Pyradiomics performance was similar to ITK's performance. Downside of PyRadiomics is that it only accepts 3D input, whereas ITK is N-dimensional. On the other hand, PyRadiomics acchieves this performance in single thread mode and calculates more features in GLCM.

In terms of featureset, as far as I can see the formula definitions are similar, with PyRadiomics having more features in both GLCM and GLRLM. ITK's texture features has only 1 feature (in GLCM) that PyRadiomics doesn't have, but this is a similar implementation of PyRadiomics' correlation, which is present in ITK as Haralick's correlation. That being said, I still believe there will be some differences, due to the fact that ITK used a fixed bin count, as opposed to the fixed bin width in PyRadiomics.

pieper commented 7 years ago

Thanks for checking @JoostJM -- sounds like we should decide if we want to keep an independent implementation in PyRadiomics or work together with the ITK community to improve itk-texturefeatures. My knee-jerk reaction is to prefer the community effort whenever possible in the hopes that code maintenance will be more efficient.