AIM-Harvard / pyradiomics

Open-source python package for the extraction of Radiomics features from 2D and 3D images and binary masks. Support: https://discourse.slicer.org/c/community/radiomics
http://pyradiomics.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

create examples for processing DICOM data #263

Open pieper opened 7 years ago

pieper commented 7 years ago

Goal Write an example analysis that uses SlicerRT to process TCIA example data with pyradiomics.

Open Questions

Background Probably each radiomics analysis will require some custom scripting to accommodate details of the data, but we should help people get started with real data.

Since a lot of people are interested in applying radiomics to DICOM data, we should have some worked-out examples of how to perform these operations. Given the wide variety of use cases, though, we need to decide which dependencies are acceptable and what sample data should be used for the demonstration.

In our developer hangout discussion we looked at several alternatives and while we would prefer to use pure python (pydicom) or SimpleITK, neither has much if any high level support for segmentation formats.

DICOM SEG could be supported by dcmqi, which may anyway be a dependency for writing out any radiomics results as SR, although pydicom may be enough for that.

Agreement seems to be that much existing segmentation data is in RTSTRUCT format, so an example processing some of that data would probably be most valuable to the community. However, it is also the most complex format to handle, so using SlicerRT is probably the most workable overall solution.

To make it as easy as possible to install and run Slicer plus the extension, it may make sense to use a docker container with Slicer, SlicerRT, and SlicerRadiomics already installed. Alternatively, we could have the script download and install all the needed packages, or just prompt the user to do it manually.

Provide an example script that operates on the data and generates a radiomics table as a csv file.
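As a rough illustration of that last step, here is a minimal sketch of writing per-case feature dictionaries to a CSV table. The helper name `write_feature_table` and the zero values are placeholders invented for this sketch; real rows would come from the feature extractor.

```python
import csv
import io


def write_feature_table(rows, handle):
    """Write one CSV row per case; columns are the union of all feature names."""
    # Collect column names in first-seen order so the table layout is stable.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    writer = csv.DictWriter(handle, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return columns


# Placeholder values for illustration only, not real extraction results.
example_rows = [
    {"case": "LUNG1-001", "original_firstorder_Mean": 0.0},
    {"case": "LUNG1-002", "original_firstorder_Mean": 0.0,
     "original_firstorder_Median": 0.0},
]

buffer = io.StringIO()
table_columns = write_feature_table(example_rows, buffer)
```

To write an actual file, pass `open("features.csv", "w", newline="")` instead of the in-memory buffer. Features missing for a case are left as empty cells.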

pieper commented 7 years ago

Summary of discussion on Radiomics hangout

Eventual goal is to provide a radiomics-data package as a complement to the radiomics feature analysis code. The data package could be used with engineered or learned features.

pieper commented 7 years ago

SlicerRT already provides a converter script which should be able to do the structure set conversion.

pieper commented 7 years ago

This command line successfully converts one NSCLC-Radiomics study downloaded from TCIA to nrrd.

/c/Program\ Files/Slicer\ 4.7.0-2017-06-11/Slicer.exe --no-main-window \
  --python-script /c/Users/pieper/AppData/Roaming/NA-MIC/Extensions-26083/SlicerRT/lib/Slicer-4.7/qt-scripted-modules/BatchStructureSetConversion.py \
  --input-folder ./DOI/LUNG1-001 \
  --export-images \
  --output-folder /d/data/nsclc-radiomics/converted

This uses a slightly modified version of the converter script.

After importing the full NSCLC-Radiomics directory tree into the Slicer dicom database, this command batch converts:

/c/Program\ Files/Slicer\ 4.7.0-2017-06-11/Slicer.exe --no-main-window \
  --python-script /c/pieper/radiomics/Converters/SlicerRT/BatchStructureSetConversion.py \
  --exist-db \
  --input-folder /d/data/nsclc-radiomics/db \
  --export-images \
  --output-folder /d/data/nsclc-radiomics/converted 2>&1 | tee convert.log

The log file shows that many studies do not convert successfully. These examples were run in Git Bash on Windows with the Slicer version indicated and the corresponding SlicerRT installed.
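For scripting the same calls from Python rather than the shell, a minimal sketch might look like the following. The command-line flags are the ones used in the commands above; the helper names (`build_conversion_command`, `patient_folders`, `run_conversion`) are made up for this sketch, and the `LUNG1-###` folder naming is assumed from the single-study example.

```python
import subprocess


def build_conversion_command(slicer_exe, script, input_folder, output_folder,
                             use_existing_db=False):
    """Assemble the argument list for one batch-conversion run."""
    command = [slicer_exe, "--no-main-window", "--python-script", script]
    if use_existing_db:
        # Input folder is an existing Slicer DICOM database, not raw DICOM files.
        command.append("--exist-db")
    command += ["--input-folder", input_folder,
                "--export-images",
                "--output-folder", output_folder]
    return command


def patient_folders(root, first, last):
    """Folder names such as <root>/LUNG1-001 (assumed naming scheme)."""
    return ["{}/LUNG1-{:03d}".format(root, i) for i in range(first, last + 1)]


def run_conversion(command, log_path):
    """Run one conversion, sending stdout and stderr to a single log file."""
    with open(log_path, "w") as log:
        return subprocess.run(command, stdout=log,
                              stderr=subprocess.STDOUT).returncode
```

Passing the arguments as a list avoids the shell quoting needed for paths with spaces, such as the Slicer install folder. Checking each return code and log makes it easier to see which studies failed.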

pieper commented 7 years ago

Will work on data conversion scripts in this repository while prototyping.

HIBhl commented 7 years ago

Why can't pyradiomics handle DICOM? And why did you choose the NRRD format?

JoostJM commented 7 years ago

@HIBhl, we show this example with the NRRD format because it is the default format used in Slicer. PyRadiomics also accepts other formats, for example NIfTI.

The reason we do not support DICOM in PyRadiomics directly is that it involves parsing the headers to determine the correct order of the files and reading them into one volume object. Programming this would duplicate a lot of work that is already implemented (and better) in other packages. We are, however, working on creating scripts or wrappers to allow pyradiomics to work with DICOM, hence this issue.
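To give a feel for the header parsing involved, here is a minimal sketch of one common way to order slices: project each slice's ImagePositionPatient onto the stack normal derived from ImageOrientationPatient. Plain dictionaries stand in for parsed headers, the function names are invented for this sketch, and real series need far more care (multiple orientations, duplicates, multi-frame objects), which is the argument for relying on an existing package.

```python
import statistics


def slice_position(header):
    """Position of a slice along the stack normal."""
    row = header["ImageOrientationPatient"][:3]
    col = header["ImageOrientationPatient"][3:]
    # The stack normal is the cross product of the row and column cosines.
    normal = (
        row[1] * col[2] - row[2] * col[1],
        row[2] * col[0] - row[0] * col[2],
        row[0] * col[1] - row[1] * col[0],
    )
    return sum(n * p for n, p in zip(normal, header["ImagePositionPatient"]))


def sort_slices(headers):
    """Return the headers ordered along the stack normal."""
    return sorted(headers, key=slice_position)


def find_gaps(sorted_headers, tolerance=1e-3):
    """Indices i where the step from slice i to i+1 differs from the median step."""
    positions = [slice_position(h) for h in sorted_headers]
    steps = [b - a for a, b in zip(positions, positions[1:])]
    if not steps:
        return []
    nominal = statistics.median(steps)
    return [i for i, step in enumerate(steps) if abs(step - nominal) > tolerance]


# Hypothetical axial slices, out of order, with the slice at z = 7.5 missing.
axial = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
headers = [
    {"ImageOrientationPatient": axial, "ImagePositionPatient": [0.0, 0.0, z]}
    for z in (5.0, 0.0, 2.5, 10.0)
]
ordered = sort_slices(headers)
gaps = find_gaps(ordered)
```

A non-empty `gaps` list hints at missing slices, which a converter has to either reject or interpolate.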

fedorov commented 7 years ago

@pieper FYI might be useful: https://github.com/icometrix/dicom2nifti

I am investigating its functionality and will update this issue with my experience. I just think it is overkill to use Slicer for loading image volumes; it would be nice to have a light, standalone, easy-to-use library just for this purpose.

fedorov commented 7 years ago

Seems to work for the basic CT series conversion I tried. Graceful handling of missing slices:

[image]

pieper commented 7 years ago

@fedorov Yes, something simple is better if it works well.

But can it convert the RT Structure Sets?

fedorov commented 7 years ago

I didn't check. I would be (very pleasantly!) surprised if it does though.

pieper commented 7 years ago

@fedorov have you had good luck with dicom2nifti on various sample data? It appears the license is compatible with Slicer, and we could consider using it as another approach for DWI, DCE, or even scalar volumes if it works well.

fedorov commented 7 years ago

I have not tested it extensively. The story is that I heard about it on LinkedIn from Dirk Smeets and had a specific problem at hand, processing the LIDC dataset, so in the absence of many alternatives (and not being a fan of Slicer --python-script), I gave it a try. I also thought about the possibility of including it in Slicer. I dare to hope that perhaps it could replace DWIConvert? The group behind the tool is focused on neuroimaging, so they might have implemented that part. I say we should spend more time testing it and definitely consider including it. Since it is available via pip, and pip install is already possible for Slicer build infrastructure, should be rather doable to include.

pieper commented 7 years ago

@ihnorton FYI

ihnorton commented 7 years ago

I can see a strong argument for deprecating both DWIConvert and DWI NRRD. NIfTI "won" for diffusion neuroimaging (which is most of DW imaging, probably).

> pip install is already possible for Slicer build infrastructure, should be rather doable to include.

Only for about 30% of Slicer users.

fedorov commented 7 years ago

> Only for about 30% of Slicer users.

I am not in the know to understand the subtlety!

ihnorton commented 7 years ago

Sorry - it doesn't really work for shared library extensions on Windows (68% of Slicer downloads even considering 4.6-nightly only)

pieper commented 7 years ago

I didn't look too closely but I think dicom2nifti is pure python, right?

HIBhl commented 7 years ago

@JoostJM Thanks, I am still somewhat confused. The RT structure files, which are in DICOM format, can be loaded and extracted together with the volume files in 3D Slicer, but they can't be saved directly in nrrd format. Slicer saves them as .vtm or .seg.vtm, which can't be used in pyradiomics. Can .vtm be converted to .nrrd? It seems that we can use Slicer directly for volume conversion but not for RT conversion, right?
The problem is how to convert the structure files, as the recommended conversion script 'BatchStructureSetConversion.py' hasn't been widely used and may introduce errors.

pieper commented 7 years ago

@HIBhl with the latest Slicer and SlicerRT (the nightly) structure sets will load as Segmentations and you can export them to Labelmaps that can be saved as nrrd.

https://www.slicer.org/wiki/Documentation/Nightly/Modules/Segmentations

Also, regarding the BatchStructureSetConversion.py I have a slightly improved version and while working on that we've found and (Kyle) fixed a couple issues in the conversion in case you want to try that again.

https://github.com/pieper/Converters/tree/master/SlicerRT

HIBhl commented 7 years ago

@pieper Thank you very much for your suggestions! I tried BatchStructureSetConversion.py, and it does convert the structure files to .nrrd format. However, the size of the converted labelmap doesn't match the size of the volume file, which causes an error when extracting features with pyradiomics. Have you run into the same problem before? The labelmap file: [image]. The image file: [image]. Feature extraction results: [image].

pieper commented 7 years ago

Sounds like you are very close. There are options in the SlicerRT/Segmentations code in slicer to set the reference geometry of the label map, but they aren't in the batch script. The physical spaces should be the same (based on the origin) which you can confirm by loading in Slicer to view.

But probably easiest is to address it in pyradiomics as described here:

http://pyradiomics.readthedocs.io/en/latest/faq.html?highlight=resample#geometry-mismatch-between-image-and-mask

Let us know how it goes, -Steve

huangmozhilv commented 7 years ago

@pieper Thank you for sharing the code. According to the log, some errors were raised when running it. In addition, passing the resulting .nrrd files to pyradiomics's featureextractor produced this error:

  File "/Users/messi/pyradiomics/radiomics/imageoperations.py", line 238, in checkMask
    if "Both images for LabelStatisticsImageFilter don't match type or dimension!" in e.message:
AttributeError: 'RuntimeError' object has no attribute 'message'

Do you have any idea how to fix this? Thanks very much.
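The AttributeError in this traceback is a Python 2/3 difference that masks the underlying geometry mismatch: Python 3 exceptions have no `.message` attribute, so the check fails before it can report the real problem. A minimal sketch of a version-independent check follows; it illustrates the idea only and is not the actual pyradiomics code or fix. The helper name `error_text` is made up.

```python
def error_text(error):
    """Return the exception message on both Python 2 and Python 3."""
    # str() works on every version; `.message` exists only on Python 2.
    return str(error)


try:
    raise RuntimeError(
        "Both images for LabelStatisticsImageFilter don't match type or dimension!"
    )
except RuntimeError as e:
    # The membership test that failed in the traceback, written portably.
    is_geometry_mismatch = "don't match type or dimension" in error_text(e)
```

Either way, the message being tested for indicates that the mask and image have different sizes, which is the real issue to resolve.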

JoostJM commented 7 years ago

@huangmozhilv, when storing a label map from slicer, it can be that it stores a cropped version of that labelmap, as all voxels outside the bounding box are 0 anyway, and alignment to the image is handled through origin and direction. However, by default PyRadiomics issues a warning, as it requires the mask and image to be of the same size and direction.
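To make the cropping concrete, here is a small sketch of how a cropped labelmap lines up with the full image through its origin. It assumes identical spacing and an identity direction matrix for both volumes, which is a simplification; the function names and all geometry values are hypothetical.

```python
def index_offset(image_origin, mask_origin, spacing):
    """Voxel index of the mask's first voxel inside the image grid."""
    return tuple(
        round((m - i) / s)
        for i, m, s in zip(image_origin, mask_origin, spacing)
    )


def mask_fits(image_size, mask_size, offset):
    """True if the cropped mask lies completely inside the image bounds."""
    return all(
        0 <= o and o + m <= n
        for o, m, n in zip(offset, mask_size, image_size)
    )


# Hypothetical geometry: a 512 x 512 x 100 CT and a cropped 60 x 80 x 20 labelmap.
image_origin, image_size = (-250.0, -250.0, -100.0), (512, 512, 100)
mask_origin, mask_size = (-200.0, -150.0, -70.0), (60, 80, 20)
spacing = (1.0, 1.0, 3.0)

offset = index_offset(image_origin, mask_origin, spacing)
fits = mask_fits(image_size, mask_size, offset)
```

Here the two arrays have different sizes even though they describe the same physical space, which is why a plain size comparison reports a mismatch.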

Because labelmaps can be stored as cropped versions, or sometimes you want to use a labelmap generated on another sequence (with e.g. different spacing), PyRadiomics provides an easy solution. By setting the settings parameter correctMask to True in your parameter file, PyRadiomics checks if the ROI defined in your label map file is completely within the image bounds, and if so, resamples the mask to the image dimension, spacing and direction.
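A minimal sketch of enabling this from Python instead of a parameter file is shown below. It assumes a recent PyRadiomics release, where the class is named `RadiomicsFeatureExtractor` (older releases spelled it `RadiomicsFeaturesExtractor`); the helper names `build_settings` and `extract_features` are made up for this sketch.

```python
def build_settings(**overrides):
    """Settings passed to the extractor; correctMask enables mask resampling."""
    settings = {"correctMask": True}
    settings.update(overrides)
    return settings


def extract_features(image_path, mask_path, **overrides):
    """Run extraction with correctMask enabled (requires pyradiomics)."""
    # Imported lazily so this module loads even without pyradiomics installed.
    from radiomics import featureextractor

    extractor = featureextractor.RadiomicsFeatureExtractor(**build_settings(**overrides))
    return extractor.execute(image_path, mask_path)
```

In a YAML parameter file, the equivalent is a `correctMask: true` entry under the `setting` key.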

JoostJM commented 7 years ago

All parameters in pyradiomics, including correctMask, are documented here.

huangmozhilv commented 7 years ago

@JoostJM It works well now. Thank you so much.

huangmozhilv commented 7 years ago

DICOM RTSTRUCT can be saved as a .nrrd file in the Slicer application. My version: the most recent nightly, 4.7.0-2017-09-10 r26368. Prerequisites: install the 'SlicerRT' extension. I also recommend installing the 'SlicerRadiomics' extension, which works the same as the 'pyradiomics' toolkit.

  1. In the 'Welcome to Slicer' module, load the DICOM data.
  2. Navigate to the 'Segmentations' module. On the left panel, scroll to 'Export/import models and labelmaps' and set the options: Operation, 'Export'; Output type, 'Labelmap'; Output node, '1:RTSTRUCT: SSLabel-label'.
  3. Click 'Export'.
  4. Go to the top left corner of the GUI and click 'Save'. You will see two .nrrd files, one for the image series and one for the labelmap. Choose the directory to save the files in, then click 'Save'.

huangmozhilv commented 7 years ago

@JoostJM @pieper I tried to do the conversion with a for-loop, as below:

for i in {1..10}; do
  /Applications/Slicer.app/Contents/MacOS/Slicer --no-main-window \
    --python-script /Users/messi/Documents/GitHub/SlicerRT-master/BatchProcessing/BatchStructureSetConversion.py \
    --input-folder /Users/messi/Documents/Lung1_dataset/DOI/LUNG1-$(printf '%03d' $i) \
    --export-images \
    --output-folder /Users/messi/Downloads/TCIA/lungID${i}
done

Then I computed features for 'lung002' from two sources of .nrrd files, using 'featureextractor' from 'pyradiomics' with 'exampleCT.yaml' as the parameter settings. Source 1: the .nrrd files produced by the command-line approach above. Source 2: the .nrrd files exported from the Slicer application as described in my last comment.

The problem is that the two feature vectors are slightly different from each other. Could you help figure out the reason? Thank you.
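One way to narrow this down is to list exactly which entries differ between the two results. The sketch below compares two feature dictionaries with a relative tolerance; the function name `compare_features`, the tolerance, and the sample values are all placeholders invented for illustration.

```python
import math


def compare_features(a, b, rel_tol=1e-6):
    """Return {name: (value_a, value_b)} for entries that differ or are missing."""
    differences = {}
    for name in sorted(set(a) | set(b)):
        if name not in a or name not in b:
            differences[name] = (a.get(name), b.get(name))
            continue
        value_a, value_b = a[name], b[name]
        try:
            same = math.isclose(float(value_a), float(value_b), rel_tol=rel_tol)
        except (TypeError, ValueError):
            # Non-numeric entries, such as version strings or size tuples.
            same = value_a == value_b
        if not same:
            differences[name] = (value_a, value_b)
    return differences


# Placeholder dictionaries, not real extraction results.
source_1 = {"feature_a": 1.0, "feature_b": 2.0, "info": "v1"}
source_2 = {"feature_a": 1.0, "feature_b": 2.5, "info": "v1", "feature_c": 0.0}
differences = compare_features(source_1, source_2)
```

If the non-feature entries in the output (for example image or mask size, spacing, or voxel count) already differ, the two conversions produced different geometry, which would point at the export step rather than the feature calculation.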

HIBhl commented 7 years ago

For batch processing, try BatchStructureSetConversion.py! The resample.py script can solve the error "Both images for LabelStatisticsImageFilter don't match type or dimension!". This solution was mentioned by pieper before. (Thanks again!) You could incorporate resampleMask.py into BatchStructureSetConversion.py to get the right nrrd file, which works on the LUNG1 set and on all my local patient data (n=147).