AIM-Harvard / pyradiomics

Open-source python package for the extraction of Radiomics features from 2D and 3D images and binary masks. Support: https://discourse.slicer.org/c/community/radiomics
http://pyradiomics.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Voxel based extraction memory error [FEAT EXTRACTION] #615

Closed AT3432984 closed 4 years ago

AT3432984 commented 4 years ago

Hi, I am trying to use the PyRadiomics voxel-based feature extractor and I am getting a memory error. Any ideas on how to fix this?

Here are my computer specs: [image]

Here is the error I am getting: [image] [image]

When I run it using the original image and no mask, it works fine (I type 'pyradiomics original.nrrd original.nrrd --mode voxel' in the command line). However, when I use the mask, it does not work ('pyradiomics original.nrrd mask.nrrd --mode voxel').

JoostJM commented 4 years ago

using the original image and no mask, it works fine (I type 'pyradiomics original.nrrd original.nrrd --mode voxel'

This is actually not what you are doing. You are using your image as the mask, meaning you only extract features for those voxels where the value is 1 in the original image. PyRadiomics does not work without a mask. If you want values for all voxels, create a 'full mask' (an image of the same size/geometry as the input image, with all voxels set to 1).

That said, your issue is exactly what the message describes: a memory error. Voxel-wise calculation is very memory intensive, because GLCM matrices are created for every voxel that is calculated in a batch (total size Ng x Ng x Nd x Nvox, with Ng being the number of discretized gray values, Nd the number of directions of the offsets used (usually 13 for standard 3D extraction) and Nvox the number of voxels to calculate in a single batch). The easiest way to fix this is to set the voxelBatch parameter to a lower value (start with 10,000 or 1,000 and see if it works). By default, all segmented voxels are calculated in 1 batch.
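
To get a feel for the Ng x Ng x Nd x Nvox figure, here is a rough back-of-the-envelope estimate. It is a sketch under stated assumptions: 8 bytes per matrix element (double precision) and Ng = 32 gray levels are illustrative guesses, not values taken from this thread, and the function name is hypothetical.

```python
def glcm_batch_bytes(n_gray, n_voxels, n_directions=13, bytes_per_element=8):
    """Approximate size in bytes of the GLCM matrices for one voxel batch.

    Size is Ng x Ng x Nd x Nvox elements; bytes_per_element=8 is an
    assumption (double precision), as is the default of 13 directions.
    """
    return n_gray * n_gray * n_directions * n_voxels * bytes_per_element


# Hypothetical case: 32 gray levels, comparing two batch sizes.
for n_voxels in (100_000, 1_000):
    size = glcm_batch_bytes(n_gray=32, n_voxels=n_voxels)
    print(f"Nvox = {n_voxels:>7}: {size / 2**30:.2f} GiB")
```

Under these assumptions, a batch of 100,000 voxels needs roughly 9.9 GiB for the GLCM matrices alone, while a batch of 1,000 voxels needs about 0.1 GiB, which is why lowering voxelBatch helps.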

AT3432984 commented 4 years ago

Great thank you, I will try this.

AT3432984 commented 4 years ago

Hi again, do you mind telling me how I can change the voxelBatch parameter? Can I do this straight from the command line, or do I need to create a parameter file as described in 'customizing the extraction'?

I tried adding -s voxelBatch:1000, but this did not work.

JoostJM commented 4 years ago

In what way did it not work? Was the setting accepted on the command line (i.e. does the extraction at least start)? What version of PyRadiomics do you have?

Does it work if you use a parameter file?

AT3432984 commented 4 years ago

I typed into the commandline:

pyradiomics image.nrrd mask.nrrd -s voxelBatch:1000 --mode voxel

and it said:

W: radiomics.script: Did not recognize override "voxelBatch", skipping...

and then it just tries to run normally. Am I typing something wrong? (Sorry, I am very new to this.) I am using PyRadiomics version 3.0. I have not tried using a parameter file yet, as I would like to be able to change the settings from the command line.

JoostJM commented 4 years ago

It is possible that the command-line overrides do not include voxel-specific settings; I'll investigate. In the meantime, my advice would be to use a parameter file.

AT3432984 commented 4 years ago

Hi again,

I have created a parameter file and it seems to be running now, though I am getting an error saying "IndexError: Calculation of GLSZM Failed"

My parameter file contains:

imageType:
  Original: {}

voxelSetting:
  voxelBatch: 1000
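
For readers who want to reproduce this setup, the sketch below writes a parameter file with the same two sections and runs a voxel-based extraction through the Python interface. The file name params.yaml and both helper names are hypothetical; the second helper needs PyRadiomics installed and is only defined here, not called.

```python
from pathlib import Path

# Same content as the parameter file described in the thread.
PARAMS_YAML = """\
imageType:
  Original: {}

voxelSetting:
  voxelBatch: 1000
"""


def write_params(path="params.yaml"):
    """Write the parameter file and return its path (file name is hypothetical)."""
    path = Path(path)
    path.write_text(PARAMS_YAML)
    return path


def run_voxel_extraction(image_path, mask_path, params_path):
    """Run a voxel-based extraction with the given parameter file.

    Requires PyRadiomics; imported here so the rest of the sketch runs without it.
    """
    from radiomics import featureextractor

    extractor = featureextractor.RadiomicsFeatureExtractor(str(params_path))
    # voxelBased=True requests feature maps instead of one value per feature.
    return extractor.execute(image_path, mask_path, voxelBased=True)
```

From the command line, the same file can be supplied with the parameter-file option, e.g. 'pyradiomics image.nrrd mask.nrrd --param params.yaml --mode voxel'.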

AT3432984 commented 4 years ago

I can extract all of the other features but not the GLSZM ones. I am not sure why this is happening, but I will try to figure it out. At least the other ones are working now, thank you.

JoostJM commented 4 years ago

@AT3432984, there have been several reports of GLSZM failing (the index error indicates it is failing in the C extension). I am currently investigating this. Can you share a case (image, mask and parameter file) that causes this error? It may help me track down the issue more easily.

JoostJM commented 4 years ago

Related to #617

AT3432984 commented 4 years ago

Sure, here are the image, mask and parameter file. I changed the parameter file so that it only extracts GLSZM.

NewFolder.zip

Thibescobar commented 4 years ago

Hello everyone,

Thank you very much for your help @JoostJM !

I just have a question related to this issue and to #617: does the "index error" correspond to the same problem as the "memory error"? I am a bit confused about how to fix the problem, because decreasing the voxelBatch parameter all the way to 1 still gives me the "index error". The problem always occurs, but at a different "step", for example: batch no. 314/36502 for voxelBatch = 10, and batch no. 32/3651 for voxelBatch = 100. This seems to indicate that the problem occurs at a certain number of calculated voxels, independently of the voxelBatch setting. Am I right?

Is there a way to obtain the GLSZM feature map?

Thank you again.

JoostJM commented 4 years ago

The index error is indeed different from the memory error: the former occurs during matrix calculation, the latter occurs before calculation, when there is not enough memory available to allocate the matrix.
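
The batch numbers reported earlier in the thread (batch 314 of 36502 with voxelBatch = 10, batch 32 of 3651 with voxelBatch = 100) are consistent with a failure tied to one particular voxel rather than to memory. The arithmetic can be sketched as follows, assuming batches are numbered from 1 and cover consecutive voxels in a fixed order; both are assumptions, and the helper name is hypothetical.

```python
def batch_voxel_range(batch_no, batch_size):
    """Return the 1-based, inclusive voxel index range covered by a batch.

    Assumes batches are numbered from 1 and hold consecutive voxels.
    """
    start = (batch_no - 1) * batch_size + 1
    end = batch_no * batch_size
    return start, end


r10 = batch_voxel_range(314, 10)    # failing batch with voxelBatch = 10
r100 = batch_voxel_range(32, 100)   # failing batch with voxelBatch = 100

# Voxels contained in both failing batches.
overlap = (max(r10[0], r100[0]), min(r10[1], r100[1]))
print(r10, r100, overlap)
```

Under these assumptions the first failing batch lies entirely inside the second, so both runs would be failing somewhere in the same group of ten voxels, whatever the batch size.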

Thibescobar commented 4 years ago

Thank you @JoostJM.

Since you said you are investigating the index error, do you know if there is a way to avoid this error and compute the GLSZM feature maps?

Thibescobar commented 4 years ago

Hello,

I would like to follow up on this to find out whether the problem with GLSZM feature maps is a general one (affecting everyone) or whether it depends on the user and their hardware/software.

If it is indeed possible to calculate them, could you please tell me? In that case I will try on another machine. It would be great if you knew of a version, or any other information, related to this failure, or conversely a configuration where it works.

Thank you very much in advance.

JoostJM commented 4 years ago

@AT3432984, I believe I have found the bug. For the case you sent, it gave me an IndexError, which was due to a faulty check in GLSZM's C calculation. I have made a PR of the fix (#635), which I will merge after checks pass.

I will also be making a new release of PyRadiomics soon, which will contain this fix.

JoostJM commented 4 years ago

@Thibescobar, the error occurred in specific cases only (where all voxels inside a kernel belong to the same region). This filled up a temporary buffer, and a faulty check raised the error when trying to fill the last element (which should be allowed).

AT3432984 commented 4 years ago

@JoostJM Thank you so much, do you have an idea on when this will be released?

JoostJM commented 4 years ago

@AT3432984, the fix is now merged in the master branch. However, I'm currently unable to make a new release due to a (different) issue in the Mac testing CI (using TravisCI).

You can try to compile PyRadiomics from source by checking out the current master of this repository and running 'python setup.py install' from the repository root. This involves compiling the C extensions; if you have any issues, let me know. In the worst case, I can build you a wheel (for either Windows or Linux). If you need this, please let me know your OS type and specific Python version.

Thibescobar commented 4 years ago

@JoostJM Thank you very much! I'll try to compile on Monday and let you know how it goes. Thank you again for your answers. Have a good day.

dain5832 commented 3 years ago

Hi I'm having exactly the same error on my jupyter notebook. I uninstalled the package and tried to compile it from source, but the problem still exists. Can you help me with this issue??

Thibescobar commented 3 years ago

Hello @dain5832, I could not fix this issue either. In practice, though, it is not a big problem for signal analysis. In a local neighbourhood (which is the context here), the information that GLSZM encodes is extremely similar to what GLRLM encodes when averaged over all directions, which is the default. Even GLDM features can be correlated with GLSZM ones by mathematical definition (see the definitions of the GLSZM, GLRLM and GLDM matrices and features here, and maybe take a look at this old, very informative paper). So for describing local patterns with handcrafted features, local GLRLM and GLDM can describe nearly the same thing as GLSZM.

@JoostJM, do you agree?

Have a good day.