AIM-Harvard / pyradiomics

Open-source python package for the extraction of Radiomics features from 2D and 3D images and binary masks. Support: https://discourse.slicer.org/c/community/radiomics
http://pyradiomics.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

There are many NaN, 0 and 1 in the feature values #379

Closed QiChen2014 closed 6 years ago

QiChen2014 commented 6 years ago

Hi everyone, I'm trying to extract radiomics features from some 3D CT images. I followed the example script batchprocessing.py to process our renal cancer images and got more than 1000 features, but many of the feature values are NaN, 0 or 1. I checked voxelNum as the documentation suggests: the minimum value is 575 and the maximum value is 350221. In addition, I compared the range of gray values in the ROI (a First Order feature) for binWidth values of 1, 5, 10, 15, 20 and 25, and got the same result every time. Why? How should I solve this problem? Could somebody help me? Thanks!

JoostJM commented 6 years ago

As to your many NaN, 0 and 1 values: that usually happens in the case of a flat region (when the range of gray values inside the ROI is smaller than the binWidth).

binWidth: most firstorder features are not affected by this setting (only entropy and uniformity are). binWidth mainly serves the discretization required for texture matrix calculation.
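To illustrate the flat-region case, here is a minimal sketch of fixed-width binning followed by entropy and uniformity. The helper names and the sample gray values are made up for illustration, and the binning is a simplified stand-in rather than PyRadiomics' actual implementation. When the whole ROI falls into a single bin, entropy comes out as 0 and uniformity as 1.

```python
import math
from collections import Counter


def discretize(values, bin_width):
    """Assign each gray value to a fixed-width bin (bin indices start at 1)."""
    low = math.floor(min(values) / bin_width) * bin_width
    return [int((v - low) // bin_width) + 1 for v in values]


def entropy_and_uniformity(bins):
    """Histogram entropy (base 2) and uniformity of the discretized values."""
    n = len(bins)
    probs = [count / n for count in Counter(bins).values()]
    entropy = sum(p * math.log2(1 / p) for p in probs)
    uniformity = sum(p * p for p in probs)
    return entropy, uniformity


# Hypothetical ROIs: the first spans less than one bin width, the second does not.
flat_roi = [101.0, 103.5, 104.0, 102.2]
textured_roi = [340.0, 395.0, 470.0, 629.0]

print(entropy_and_uniformity(discretize(flat_roi, 10)))      # single bin
print(entropy_and_uniformity(discretize(textured_roi, 10)))  # four bins
```

With only one gray level after discretization, texture matrices collapse to a single entry as well, which is a plausible source of the constant and NaN values reported above.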

If you see a lot of strange and/or unexpected values, it may help to run PyRadiomics with logging enabled at level DEBUG.
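A minimal sketch of one way to capture such a log, using only Python's standard logging module. It assumes the package logs under the logger name "radiomics" (consistent with the "radiomics.featureextractor" prefixes in the log further down); the function name and file path are hypothetical.

```python
import logging


def enable_debug_log(log_path, logger_name="radiomics"):
    """Send all records of `logger_name` (and its children) to a file at DEBUG level."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    handler = logging.FileHandler(log_path, mode="w")
    handler.setLevel(logging.DEBUG)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s: %(message)s"))
    logger.addHandler(handler)
    return handler


if __name__ == "__main__":
    # Hypothetical file name; call this before starting the extraction.
    enable_debug_log("radiomics_debug.log")
    logging.getLogger("radiomics.demo").debug("logging is configured")
```

It is probably safest to call this after importing the package, in case the package adjusts the logger level at import time.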

QiChen2014 commented 6 years ago

Thank you @JoostJM, I'll try it.

QiChen2014 commented 6 years ago

Hi @JoostJM, I have a big problem now. I used "batchprocessing.py" to process 473 cases; 277 of them worked fine, but the others failed with "MemoryError". The error information in the log file looks like this:

INFO:radiomics.batch: (278/473) Processing Patient (Image: D:\Data\data\Renal cancer\zunyi\jingmai\img-mask\1718531\1718531_series1.nrrd, Mask: D:\Data\data\Renal cancer\zunyi\jingmai\img-mask\1718531\1718531.nii)
INFO:radiomics.featureextractor: Calculating features with label: 1
DEBUG:radiomics.featureextractor: Enabled images types: {'Original': {}, 'Wavelet': {}, 'Square': {}, 'SquareRoot': {}, 'Logarithm': {}, 'Exponential': {}, 'Gradient': {}, 'LBP3D': {}}
DEBUG:radiomics.featureextractor: Enabled features: {'shape': [], 'firstorder': [], 'glcm': [], 'glrlm': [], 'gldm': [], 'glszm': [], 'ngtdm': []}
DEBUG:radiomics.featureextractor: Current settings: {'minimumROIDimensions': 1, 'minimumROISize': None, 'normalize': True, 'normalizeScale': 500, 'removeOutliers': None, 'resampledPixelSpacing': None, 'interpolator': 'sitkBSpline', 'preCrop': False, 'padDistance': 5, 'distances': [1], 'force2D': False, 'force2Ddimension': 0, 'resegmentRange': None, 'label': 1, 'enableCExtensions': True, 'additionalInfo': True, 'binWidth': 10, 'correctMask': True}
INFO:radiomics.featureextractor: Loading image and mask
DEBUG:radiomics.imageoperations: Normalizing image with scale 500
DEBUG:radiomics.imageoperations: Checking mask with label 1
DEBUG:radiomics.imageoperations: Calculating bounding box
DEBUG:radiomics.imageoperations: Checking minimum number of dimensions requirements (1)
DEBUG:radiomics.featureextractor: Image and Mask loaded and valid, starting extraction
INFO:radiomics.featureextractor: Adding additional extraction information
DEBUG:radiomics.imageoperations: Cropping to size [115 139 12]
INFO:radiomics.featureextractor: Computing shape
DEBUG:radiomics.shape: Initializing feature class
DEBUG:radiomics.shape: Padding the mask with 0s
DEBUG:radiomics.shape: Pre-calculate Volume, Surface Area and Eigenvalues
DEBUG:radiomics.shape: Calculating Surface Area in C
DEBUG:radiomics.shape: Shape feature class initialized
DEBUG:radiomics.shape: Calculating features
DEBUG:radiomics.shape: Calculating Maximum 3D diameter in C
DEBUG:radiomics.featureextractor: Creating image type iterator
INFO:radiomics.featureextractor: Adding image type "Original" with settings: {'minimumROIDimensions': 1, 'minimumROISize': None, 'normalize': True, 'normalizeScale': 500, 'removeOutliers': None, 'resampledPixelSpacing': None, 'interpolator': 'sitkBSpline', 'preCrop': False, 'padDistance': 5, 'distances': [1], 'force2D': False, 'force2Ddimension': 0, 'resegmentRange': None, 'label': 1, 'enableCExtensions': True, 'additionalInfo': True, 'binWidth': 10, 'correctMask': True}
INFO:radiomics.featureextractor: Adding image type "Wavelet" with settings: {'minimumROIDimensions': 1, 'minimumROISize': None, 'normalize': True, 'normalizeScale': 500, 'removeOutliers': None, 'resampledPixelSpacing': None, 'interpolator': 'sitkBSpline', 'preCrop': False, 'padDistance': 5, 'distances': [1], 'force2D': False, 'force2Ddimension': 0, 'resegmentRange': None, 'label': 1, 'enableCExtensions': True, 'additionalInfo': True, 'binWidth': 10, 'correctMask': True}
INFO:radiomics.featureextractor: Adding image type "Square" with settings: {'minimumROIDimensions': 1, 'minimumROISize': None, 'normalize': True, 'normalizeScale': 500, 'removeOutliers': None, 'resampledPixelSpacing': None, 'interpolator': 'sitkBSpline', 'preCrop': False, 'padDistance': 5, 'distances': [1], 'force2D': False, 'force2Ddimension': 0, 'resegmentRange': None, 'label': 1, 'enableCExtensions': True, 'additionalInfo': True, 'binWidth': 10, 'correctMask': True}
INFO:radiomics.featureextractor: Adding image type "SquareRoot" with settings: {'minimumROIDimensions': 1, 'minimumROISize': None, 'normalize': True, 'normalizeScale': 500, 'removeOutliers': None, 'resampledPixelSpacing': None, 'interpolator': 'sitkBSpline', 'preCrop': False, 'padDistance': 5, 'distances': [1], 'force2D': False, 'force2Ddimension': 0, 'resegmentRange': None, 'label': 1, 'enableCExtensions': True, 'additionalInfo': True, 'binWidth': 10, 'correctMask': True}
INFO:radiomics.featureextractor: Adding image type "Logarithm" with settings: {'minimumROIDimensions': 1, 'minimumROISize': None, 'normalize': True, 'normalizeScale': 500, 'removeOutliers': None, 'resampledPixelSpacing': None, 'interpolator': 'sitkBSpline', 'preCrop': False, 'padDistance': 5, 'distances': [1], 'force2D': False, 'force2Ddimension': 0, 'resegmentRange': None, 'label': 1, 'enableCExtensions': True, 'additionalInfo': True, 'binWidth': 10, 'correctMask': True}
INFO:radiomics.featureextractor: Adding image type "Exponential" with settings: {'minimumROIDimensions': 1, 'minimumROISize': None, 'normalize': True, 'normalizeScale': 500, 'removeOutliers': None, 'resampledPixelSpacing': None, 'interpolator': 'sitkBSpline', 'preCrop': False, 'padDistance': 5, 'distances': [1], 'force2D': False, 'force2Ddimension': 0, 'resegmentRange': None, 'label': 1, 'enableCExtensions': True, 'additionalInfo': True, 'binWidth': 10, 'correctMask': True}
INFO:radiomics.featureextractor: Adding image type "Gradient" with settings: {'minimumROIDimensions': 1, 'minimumROISize': None, 'normalize': True, 'normalizeScale': 500, 'removeOutliers': None, 'resampledPixelSpacing': None, 'interpolator': 'sitkBSpline', 'preCrop': False, 'padDistance': 5, 'distances': [1], 'force2D': False, 'force2Ddimension': 0, 'resegmentRange': None, 'label': 1, 'enableCExtensions': True, 'additionalInfo': True, 'binWidth': 10, 'correctMask': True}
INFO:radiomics.featureextractor: Adding image type "LBP3D" with settings: {'minimumROIDimensions': 1, 'minimumROISize': None, 'normalize': True, 'normalizeScale': 500, 'removeOutliers': None, 'resampledPixelSpacing': None, 'interpolator': 'sitkBSpline', 'preCrop': False, 'padDistance': 5, 'distances': [1], 'force2D': False, 'force2Ddimension': 0, 'resegmentRange': None, 'label': 1, 'enableCExtensions': True, 'additionalInfo': True, 'binWidth': 10, 'correctMask': True}
DEBUG:radiomics.featureextractor: Extracting features
DEBUG:radiomics.imageoperations: Yielding original image
INFO:radiomics.featureextractor: Calculating features for original image
DEBUG:radiomics.imageoperations: Cropping to size [115 139 12]
INFO:radiomics.featureextractor: Computing firstorder
DEBUG:radiomics.firstorder: Initializing feature class
DEBUG:radiomics.firstorder: First order feature class initialized
DEBUG:radiomics.firstorder: Calculating features
DEBUG:radiomics.imageoperations: Calculated 29 bins for bin width 10 with edges: [340. 350. 360. 370. 380. 390. 400. 410. 420. 430. 440. 450. 460. 470. 480. 490. 500. 510. 520. 530. 540. 550. 560. 570. 580. 590. 600. 610. 620. 630.])
DEBUG:radiomics.imageoperations: Calculated 29 bins for bin width 10 with edges: [340. 350. 360. 370. 380. 390. 400. 410. 420. 430. 440. 450. 460. 470. 480. 490. 500. 510. 520. 530. 540. 550. 560. 570. 580. 590. 600. 610. 620. 630.])
INFO:radiomics.featureextractor: Computing glcm
DEBUG:radiomics.glcm: Initializing feature class
DEBUG:radiomics.imageoperations: Discretizing gray levels inside ROI
DEBUG:radiomics.imageoperations: Calculated 29 bins for bin width 10 with edges: [340. 350. 360. 370. 380. 390. 400. 410. 420. 430. 440. 450. 460. 470. 480. 490. 500. 510. 520. 530. 540. 550. 560. 570. 580. 590. 600. 610. 620. 630.])
DEBUG:radiomics.glcm: Calculating GLCM matrix in C
DEBUG:radiomics.glcm: Process calculated matrix
DEBUG:radiomics.glcm: Create symmetrical matrix
DEBUG:radiomics.glcm: No empty angles
DEBUG:radiomics.glcm: Calculating GLCM coefficients
DEBUG:radiomics.glcm: GLCM feature class initialized, calculated GLCM with shape (29, 29, 13)
DEBUG:radiomics.glcm: Calculating features
WARNING:radiomics.glcm: GLCM is symmetrical, therefore Sum Average = 2 * Joint Average, only 1 needs to be calculated
INFO:radiomics.featureextractor: Computing glrlm
DEBUG:radiomics.glrlm: Initializing feature class
DEBUG:radiomics.imageoperations: Discretizing gray levels inside ROI
DEBUG:radiomics.imageoperations: Calculated 29 bins for bin width 10 with edges: [340. 350. 360. 370. 380. 390. 400. 410. 420. 430. 440. 450. 460. 470. 480. 490. 500. 510. 520. 530. 540. 550. 560. 570. 580. 590. 600. 610. 620. 630.])
DEBUG:radiomics.glrlm: Calculating GLRLM matrix in C
DEBUG:radiomics.glrlm: Process calculated matrix
DEBUG:radiomics.glrlm: No empty angles
DEBUG:radiomics.glrlm: Calculating GLRLM coefficients
DEBUG:radiomics.glrlm: GLRLM feature class initialized, calculated GLRLM with shape (29, 16, 13)
DEBUG:radiomics.glrlm: Calculating features
INFO:radiomics.featureextractor: Computing gldm
DEBUG:radiomics.gldm: Initializing feature class
DEBUG:radiomics.imageoperations: Discretizing gray levels inside ROI
DEBUG:radiomics.imageoperations: Calculated 29 bins for bin width 10 with edges: [340. 350. 360. 370. 380. 390. 400. 410. 420. 430. 440. 450. 460. 470. 480. 490. 500. 510. 520. 530. 540. 550. 560. 570. 580. 590. 600. 610. 620. 630.])
DEBUG:radiomics.gldm: Feature class initialized, calculated GLDM with shape (29, 23)
DEBUG:radiomics.gldm: Calculating features
INFO:radiomics.featureextractor: Computing glszm
DEBUG:radiomics.glszm: Initializing feature class
DEBUG:radiomics.imageoperations: Discretizing gray levels inside ROI
DEBUG:radiomics.imageoperations: Calculated 29 bins for bin width 10 with edges: [340. 350. 360. 370. 380. 390. 400. 410. 420. 430. 440. 450. 460. 470. 480. 490. 500. 510. 520. 530. 540. 550. 560. 570. 580. 590. 600. 610. 620. 630.])
DEBUG:radiomics.glszm: Calculating GLSZM matrix in C
DEBUG:radiomics.glszm: Calculating GLSZM coefficients
DEBUG:radiomics.glszm: GLSZM feature class initialized, calculated GLSZM with shape (29, 94)
DEBUG:radiomics.glszm: Calculating features
INFO:radiomics.featureextractor: Computing ngtdm
DEBUG:radiomics.ngtdm: Initializing feature class
DEBUG:radiomics.imageoperations: Discretizing gray levels inside ROI
DEBUG:radiomics.imageoperations: Calculated 29 bins for bin width 10 with edges: [340. 350. 360. 370. 380. 390. 400. 410. 420. 430. 440. 450. 460. 470. 480. 490. 500. 510. 520. 530. 540. 550. 560. 570. 580. 590. 600. 610. 620. 630.])
DEBUG:radiomics.ngtdm: Calculating features
DEBUG:radiomics.imageoperations: Generating Wavelet images
ERROR:radiomics.batch: FEATURE EXTRACTION FAILED
Traceback (most recent call last):
  File "D:/Data/project/PycharmProjects/pyradiomics/cq/batchProcess.py", line 97, in main
    featureVector.update(extractor.execute(imageFilepath, maskFilepath, label))
  File "D:\Data\project\PycharmProjects\pyradiomics\radiomics\featureextractor.py", line 440, in execute
    for inputImage, imageTypeName, inputKwargs in imageGenerators:
  File "D:\Data\project\PycharmProjects\pyradiomics\radiomics\imageoperations.py", line 748, in getWaveletImage
    approx, ret = _swt3(inputImage, kwargs.get('wavelet', 'coif1'), kwargs.get('level', 1), kwargs.get('start_level', 0), axes=tuple(axes))
  File "D:\Data\project\PycharmProjects\pyradiomics\radiomics\imageoperations.py", line 796, in _swt3
    sitkImage = sitk.GetImageFromArray(decTemp)
  File "D:\Data\project\PycharmProjects\pyradiomics\venv\lib\site-packages\SimpleITK\SimpleITK.py", line 3436, in GetImageFromArray
    _SimpleITK._SetImageFromArray( z.tostring(), img )
MemoryError

Then I processed it separately, and no error occurred. So should I process these 473 cases separately? By the way, what value should the "normalizeScale" parameter be set to for normalization?

Thank you very much!

JoostJM commented 6 years ago

What kind of processor architecture do you have? In 32-bit, SimpleITK sometimes has these memory errors, which is why we advise using PyRadiomics in 64-bit.
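A quick way to check this from Python itself, using only the standard library. Note that a 64-bit operating system can still run a 32-bit Python interpreter, so it is the interpreter's pointer size that matters here. The function name is just for illustration.

```python
import struct
import sys


def interpreter_bits():
    """Pointer size of the running Python interpreter, in bits (32 or 64)."""
    return struct.calcsize("P") * 8


print(f"Python interpreter is {interpreter_bits()}-bit")
print(f"sys.maxsize exceeds 2**32: {sys.maxsize > 2**32}")
```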

QiChen2014 commented 6 years ago

It's 64-bit, Windows 10 Home edition, 8 GB RAM.

JoostJM commented 6 years ago

Ah wait, I think I know the problem. You don't enable resampling or preCrop, meaning that your entire image gets passed to the filter function. Especially in CT, one such image can be quite large (e.g. ~50-150 MB or so). The wavelet filter is the only filter that computes all of its derived images immediately, meaning that at one point you have the original + 8 derivations in memory, which can take up a lot of space. Add the fact that memory use increases even more when the image is converted to a numpy array, and 8 GB of RAM is woefully short of what your computer would like to have available.

This is the reason the preCrop parameter exists: it crops the image onto the bounding box, with additional padding to keep the filter output as close as possible to the output obtained without cropping. For wavelet, I'd advise a padding of about 10 or so.
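A rough back-of-the-envelope sketch of that memory argument. The full-volume size of 512 x 512 x 300 voxels and the 8 bytes per voxel (double precision) are assumptions for illustration only; the bounding box [115 139 12] is taken from the log above, and the padded size is an upper bound because real padding is clipped at the image border.

```python
def wavelet_peak_bytes(size, pad=0, bytes_per_voxel=8, n_images=9):
    """Estimate memory held by the original plus 8 wavelet derivations.

    size: voxel counts per axis; pad: padding added on both sides of each axis.
    bytes_per_voxel=8 assumes double-precision storage (an assumption).
    """
    voxels = 1
    for dim in size:
        voxels *= dim + 2 * pad
    return voxels * bytes_per_voxel * n_images


MIB = 1024 ** 2
full_volume = (512, 512, 300)  # hypothetical CT volume size
bounding_box = (115, 139, 12)  # "Cropping to size [115 139 12]" in the log

print(f"uncropped:      {wavelet_peak_bytes(full_volume) / MIB:.0f} MiB")
print(f"cropped, pad 10: {wavelet_peak_bytes(bounding_box, pad=10) / MIB:.1f} MiB")
```

Under these assumptions the uncropped case needs about 5400 MiB for the nine images alone, while the cropped and padded ROI needs roughly 47 MiB.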

That being said, I saw you are extracting in 3D but do not resample to an isotropic voxel size. This breaks the assumption that the distances between neighbors (in infinity norm) are equal; in 3D this requires isotropic voxels, which is usually not the case in medical imaging. I would advise to either enable resampling to a common spacing, extract only in 2D (force2D parameter; this still uses the entire volume, but does not consider voxels on adjoining slices to be neighbors), or enable matrix weighting (set the weightingNorm parameter).
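The neighbor-distance assumption can be checked with a small standard-library sketch. It lists the distinct physical distances (infinity norm, in mm) from a voxel to its 26 neighbors for a given spacing. The anisotropic spacing of (0.7, 0.7, 5.0) mm is a made-up example, and the function name is hypothetical.

```python
from itertools import product


def neighbor_distances(spacing):
    """Distinct infinity-norm distances (mm) from a voxel to its 26 neighbors."""
    distances = set()
    for offset in product((-1, 0, 1), repeat=3):
        if offset == (0, 0, 0):
            continue  # skip the center voxel itself
        distances.add(max(abs(o) * s for o, s in zip(offset, spacing)))
    return sorted(distances)


print(neighbor_distances((3.0, 3.0, 3.0)))  # isotropic: a single distance
print(neighbor_distances((0.7, 0.7, 5.0)))  # anisotropic: in-plane vs. out-of-plane
```

With isotropic voxels every neighbor is equally far away; with thick slices the out-of-plane neighbors are several times farther than the in-plane ones, which is what resampling, force2D, or weighting is meant to address.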

QiChen2014 commented 6 years ago

Thank you @JoostJM, I get it now. As you suggested, I set the parameters like this:

imageType:
  Original: {} # No customized settings for original filter
  Wavelet: {}
  Square: {}
  SquareRoot: {}
  Logarithm: {}
  Exponential: {}
  Gradient: {}
  LBP3D: {}

featureClass:
  shape:
  firstorder:
  glcm:
  glrlm:
  gldm:
  glszm:
  ngtdm:

setting:
  binWidth: 10
  correctMask: True
  label: 1
  preCrop: True
  padDistance: 10
  force2D: True
  interpolator: 'sitkBSpline'
  resampledPixelSpacing: [3, 3, 3] #because slice thickness between 2.5 and 7 mm
  normalize: True
  normalizeScale: 500

Is this right? Sincere thanks for your help.

JoostJM commented 6 years ago

As you are enabling resampling to [3, 3, 3], you don't have to enable preCrop (you can leave it at False). It has no effect when resampling is enabled, because resampling already performs a similar cropping step.

Also, resampling to [3, 3, 3] results in isotropic voxels, so you can extract features in 3D (set force2D to False). force2D is only needed when voxels are isotropic in-plane but have a different size out-of-plane.
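The advice in this thread can be summarized as a small settings check. This is only an illustrative sketch: review_settings is a hypothetical helper, not part of PyRadiomics, and it only looks at the settings discussed here (preCrop, force2D, resampledPixelSpacing, weightingNorm).

```python
def review_settings(settings):
    """Return notes on settings that look redundant or risky per the thread's advice."""
    notes = []
    spacing = settings.get("resampledPixelSpacing")
    resampling = spacing is not None
    isotropic = resampling and len(set(spacing)) == 1

    if resampling and settings.get("preCrop"):
        notes.append("preCrop has no effect when resampling is enabled")
    if isotropic and settings.get("force2D"):
        notes.append("voxels are isotropic after resampling; force2D can be False")
    if (not resampling and not settings.get("force2D")
            and settings.get("weightingNorm") is None):
        notes.append("3D extraction without resampling or weighting assumes isotropic voxels")
    return notes


# Settings as posted above, and the relevant values from the earlier log.
posted = {"preCrop": True, "padDistance": 10, "force2D": True,
          "resampledPixelSpacing": [3, 3, 3]}
from_log = {"preCrop": False, "padDistance": 5, "force2D": False,
            "resampledPixelSpacing": None}

for note in review_settings(posted) + review_settings(from_log):
    print("-", note)
```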

QiChen2014 commented 6 years ago

Thanks very much for your help! I really appreciate it.