AIM-Harvard / pyradiomics

Open-source python package for the extraction of Radiomics features from 2D and 3D images and binary masks. Support: https://discourse.slicer.org/c/community/radiomics
http://pyradiomics.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Stabilize dynamic binning for Wavelet transform output #49

Open vnarayan13 opened 8 years ago

vnarayan13 commented 8 years ago

Recompute bin edges and binwidth to maintain consistent bin count for all normalized stationary wavelet transform outputs.

JoostJM commented 7 years ago

A possible solution could be to apply imageoperations._scaleToOriginalRange, which ensures that the maximum of the filtered image is equal to that of the original image. This is already applied to the square, square root, logarithmic and exponential filters.

Thinking about it now, though, this function does not yet handle the case where the minimum changes. That can be addressed by also shifting the filtered image so that the minimums match once the scale has been applied (the scale should be defined such that, after shifting and scaling, the filtered image maximum matches the original maximum). In other words, rescale and shift the filtered image so that its range matches that of the original image.
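To make the rescale-and-shift idea concrete, here is a minimal NumPy sketch. The helper name rescale_to_original_range is hypothetical and is not the existing imageoperations._scaleToOriginalRange; it also assumes the range is taken over the whole array, whereas in practice it would probably be taken over the voxels inside the mask.

```python
import numpy as np

def rescale_to_original_range(filtered, original):
    """Linearly map `filtered` so that its min and max equal those of `original`.

    Hypothetical helper for illustration only; not part of PyRadiomics.
    """
    f_min, f_max = float(filtered.min()), float(filtered.max())
    o_min, o_max = float(original.min()), float(original.max())
    if f_max == f_min:
        # Constant filtered image: there is no range to scale,
        # so place every value at the original minimum.
        return np.full_like(filtered, o_min, dtype=float)
    # Scale so the ranges have equal width, then shift so the minimums coincide.
    scale = (o_max - o_min) / (f_max - f_min)
    return (filtered - f_min) * scale + o_min

original = np.array([0.0, 50.0, 100.0, 200.0])
filtered = np.array([-3.0, 1.0, 2.0, 5.0])  # stand-in for a filtered image
rescaled = rescale_to_original_range(filtered, original)
```

With the range matched this way, a fixed binWidth yields roughly the same number of bins on the filtered image as on the original, at the cost of discarding the filter's native intensity scale.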

fedorov commented 7 years ago

I wonder if we really need to maintain the bin count constant. Does it really make sense after an operation like wavelet?

vnarayan13 commented 7 years ago

I think it's OK not to scale the wavelet-filtered grayscale ranges to the original image (assuming first order stats of a decomposition are comparable across images of a dataset).

Rather, compute the bin count ("Ng") of the original image, similar to what is done in glcm.init (https://github.com/Radiomics/pyradiomics/blob/master/radiomics/glcm.py#L86).

Then, determine the binwidth that produces the above bincount (for each decomposition) and pass that binwidth to the glcm, glrlm, glszm (again, for each decomposition).

I think this best maps the original image bins to their corresponding "wavelet-transformed" bins, without rescaling/compressing data imputed from the transform.
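A rough sketch of this proposal, under stated assumptions: the bin count of the original image is taken as floor(max / W) - floor(min / W) + 1 for a fixed width W, and each decomposition then gets equal-width bins spanning its own min to max. All function names below are hypothetical and none of this is the PyRadiomics implementation; constant-valued decompositions (zero range) are not handled.

```python
import numpy as np

def bin_count_fixed_width(values, bin_width):
    """Number of bins ("Ng") that a fixed bin width produces on `values`."""
    lo = np.floor(values.min() / bin_width)
    hi = np.floor(values.max() / bin_width)
    return int(hi - lo) + 1

def bin_width_for_count(values, n_bins):
    """Bin width that splits the range of `values` into exactly `n_bins` bins."""
    return (values.max() - values.min()) / n_bins

def discretize(values, n_bins):
    """Assign 1-based bin labels using `n_bins` equal-width bins over the data range."""
    edges = np.linspace(values.min(), values.max(), n_bins + 1)
    # Digitizing against the interior edges gives 0..n_bins-1,
    # so the maximum value lands in the last bin rather than past it.
    return np.digitize(values, edges[1:-1]) + 1

original = np.array([0.0, 10.0, 26.0, 49.0, 99.0])
wavelet = np.array([-1.2, -0.5, 0.0, 0.7, 2.0])  # stand-in for one decomposition

n_bins = bin_count_fixed_width(original, bin_width=25)  # bin count from the original
width = bin_width_for_count(wavelet, n_bins)            # width to pass to glcm/glrlm/glszm
labels = discretize(wavelet, n_bins)
```

Here the width would be recomputed for every decomposition, so each one ends up with the same number of gray levels as the original image without its intensities being rescaled.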

fedorov commented 7 years ago

long time no see @vnarayan13 ! thanks for the comment

vnarayan13 commented 7 years ago

Also, I think the loop here to retrieve data = LLL.copy() is redundant since it occurs in the main dec loop: https://github.com/Radiomics/pyradiomics/blob/master/radiomics/imageoperations.py#L217

Also, "data" should be renamed to "approximation_data" and maybe should just be included in the "dec" dict since it is treated the same as the other decompositions? Returning "approximation, ret" was just to mimic the return structure of the 2D function in pywavelets.

JoostJM commented 7 years ago

It is already possible to set custom kwargs for the separate filters, which makes it possible to use different fixed binwidths for the separate filters (e.g. 25 for original, but 5 for log).
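As an illustration of the override semantics described here, the sketch below lays out a parameter dictionary with a global binWidth and a per-filter override, and resolves the settings each filter would see. The dictionary layout mirrors a parameter file; the helper effective_settings is hypothetical and only mimics the merging behaviour, and the sigma values are placeholders.

```python
# Global defaults plus per-filter custom kwargs, laid out like a parameter file.
params = {
    "setting": {"binWidth": 25},
    "imageType": {
        "Original": {},                               # inherits binWidth 25
        "LoG": {"sigma": [1.0, 3.0], "binWidth": 5},  # overrides binWidth for this filter
    },
}

def effective_settings(params, image_type):
    """Merge global settings with the custom kwargs of one image type.

    Hypothetical helper for illustration; not a PyRadiomics function.
    """
    merged = dict(params.get("setting", {}))
    # Per-filter kwargs take precedence over the global defaults.
    merged.update(params["imageType"].get(image_type) or {})
    return merged

original_settings = effective_settings(params, "Original")
log_settings = effective_settings(params, "LoG")
```

This keeps the bin widths fixed per filter, so it has to be tuned by hand for each filter, unlike the dynamic approach discussed above.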

JoostJM commented 7 years ago

Also, I think the loop here to retrieve data = LLL.copy() is redundant since it occurs in the main dec loop: https://github.com/Radiomics/pyradiomics/blob/master/radiomics/imageoperations.py#L217

This first loop is to get data to the correct level specified by start_level. Therefore this loop is not redundant. However, I expect it to be rarely used, as the first level wavelets are also usually considered.
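The role of that first loop can be sketched with a toy 1-D undecimated Haar transform. This is an assumption-laden illustration, not the code in imageoperations.py and not the PyWavelets API: haar_swt_step and swt_from_start_level are hypothetical names, and the filter is a simplified circular Haar pair.

```python
import numpy as np

def haar_swt_step(x, level):
    """One undecimated Haar step at the given level (circular boundary)."""
    shifted = np.roll(x, -(2 ** level))  # neighbour distance doubles per level
    approx = (x + shifted) / np.sqrt(2)
    detail = (x - shifted) / np.sqrt(2)
    return approx, detail

def swt_from_start_level(x, start_level, levels):
    """Return the final approximation and the details of `levels` levels,
    starting at `start_level`."""
    data = np.asarray(x, dtype=float)
    # First loop: advance the approximation to start_level,
    # discarding the details of the skipped levels.
    for i in range(start_level):
        data, _ = haar_swt_step(data, i)
    # Main loop: collect the details of the requested levels.
    details = []
    for i in range(start_level, start_level + levels):
        data, detail = haar_swt_step(data, i)
        details.append(detail)
    return data, details

x = np.array([4.0, 2.0, 7.0, 1.0, 3.0, 8.0, 5.0, 6.0])
approx_full, details_full = swt_from_start_level(x, start_level=0, levels=2)
approx_skip, details_skip = swt_from_start_level(x, start_level=1, levels=1)
```

With start_level=1 the first loop runs once, so the single detail band returned matches the second band of the two-level decomposition; with start_level=0 the first loop is a no-op.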

JoostJM commented 7 years ago

@Radiomics/developers Following up on @vnarayan13's idea, is it a solution to use a fixed bincount on all filtered images, which is set by applying the fixed binWidth on the original image? Do we want to make this optional (i.e. provide a setting called "dynamicBinning")?

So in general, I think we have the following options:

I suggest we decide on a single way to handle this, or at the very least only supply a very limited number of options.

vnarayan13 commented 7 years ago

This might be a case where running an analytics/ML pipeline on labelled training data with each of the above options could determine which one elicits the stronger signal and should therefore be the default option.

matteawelch commented 5 years ago

Is there a plan to include dynamic binning as a function in future versions?

sha168 commented 5 years ago

I’m trying to extract voxel-wise features from the original MR image and its wavelet transforms, but I’m struggling with enabling the “dynamicBinning”.

In my .yaml-file I have tried the following:

imageType:
  Original: {}
  Wavelet: {}
featureClass:
  glcm:

voxelSetting:
  kernelRadius: 1
  maskedKernel: true
  initValue: nan
  voxelBatch: 10000

(am I supposed to provide the "binWidth" parameter for the original image, like I have done here?) However, this gives me the following error code:

--- All found errors ---
["Key 'dynamicBinning' was not defined. Path: '/setting'"]
Traceback (most recent call last):
  File "", line 152, in
  File “/Users/…/radiomics/featureextractor.py”, line 59, in init
    self._applyParams(paramsFile=args[0])
  File “/Users/…/featureextractor.py”, line 158, in _applyParams
    params = c.validate()
  File “/Users/…/core.py”, line 167, in validate
    error_msg=u'.\n - '.join(self.validation_errors)))
pykwalify.errors.SchemaError: <SchemaError: error code 2: Schema validation failed:

Did I make a mistake in the parameter-file? I’m using version 2.1.2.post61+g3efae04.

JoostJM commented 5 years ago

@sha168, no, your parameter file looks fine. The cause of the error is the PyRadiomics version. You are using the current master, which does not have the dynamic binning functionality implemented. If you want to use it, you'll need to check out the branch implement-dynamic-binning, which is located in my fork of PyRadiomics. If you then install from that branch, it should work.

sha168 commented 5 years ago

Ok, thanks for the quick reply! Is the bug addressed in this issue https://github.com/Radiomics/pyradiomics/issues/456 fixed in this branch?

JoostJM commented 5 years ago

@sha168, I just rebased the branch on the current master, so if you've checked out the branch with commit af7f32b, then yes.