AIM-Harvard / pyradiomics

Open-source python package for the extraction of Radiomics features from 2D and 3D images and binary masks. Support: https://discourse.slicer.org/c/community/radiomics
http://pyradiomics.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
1.16k stars 500 forks

[FEAT EXTRACTION] Several features returning same values for different images #552

Closed GitHub-username-hyphen closed 4 years ago

GitHub-username-hyphen commented 4 years ago

Describe the bug

Hi all,

I ran the pyradiomics code on some datasets recently and have been getting some strange results: several features return the same value for every image. For example:

original_firstorder_Entropy: all -3.2E-16
original_firstorder_Uniformity: all 1
GLCM: all give me the same values (either 0, 1, or -3.20E-16)

The images contain very different values so all of this is unexpected...

Any insight ? Many thanks

Version (please complete the following information):
OS: Windows Server 2012 R2
Python version: 3.6.1
PyRadiomics version: 2.2.0

fedorov commented 4 years ago

It could be due to your choice of binning for those images. Did you try to increase the number of bins and see if this makes any difference?

GitHub-username-hyphen commented 4 years ago

Hi Fedorov, I am not sure how I can change the number of bins... Would you be able to describe and I will see if it's possible? Many thanks

fedorov commented 4 years ago

You can do this using this line in the settings file: https://github.com/Radiomics/pyradiomics/blob/master/examples/exampleSettings/Params.yaml#L21

GitHub-username-hyphen commented 4 years ago

Thanks Fedorov - I tested on a subset of data with binWidth = 50, but the results are still the same unfortunately.

JoaoSantinha commented 4 years ago

Hi, can you check the range of intensities within the masks and change the bin width so that the number of bins falls between 30 and 130, as suggested in https://pyradiomics.readthedocs.io/en/latest/faq.html#what-about-gray-value-discretization-fixed-bin-width-fixed-bin-count

How are you executing pyradiomics? Are you passing the parameter file correctly?

Hope this helps to find the issue
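
As an illustration of the check suggested above, here is a minimal sketch in plain Python. The image and mask values are hypothetical, the helper names are made up for this example, and the bin count assumes that fixed-width bins are aligned to multiples of the bin width; it is not taken from the pyradiomics code itself.

```python
import math


def masked_range(image, mask):
    """Return (min, max) of the image values at positions where mask == 1."""
    values = [v for v, m in zip(image, mask) if m == 1]
    return min(values), max(values)


def approx_bin_count(lo, hi, bin_width):
    """Approximate number of fixed-width bins needed to cover [lo, hi].

    Assumes bin edges sit at multiples of bin_width (an assumption of this sketch).
    """
    return math.floor(hi / bin_width) - math.floor(lo / bin_width) + 1


# Hypothetical flattened image and binary mask, for illustration only.
image = [0.0, 12.0, 340.0, 1250.0, 2980.0, 15.0]
mask = [0, 1, 1, 1, 1, 0]

lo, hi = masked_range(image, mask)
n_bins = approx_bin_count(lo, hi, bin_width=25)
in_suggested_range = 30 <= n_bins <= 130
print(lo, hi, n_bins, in_suggested_range)
```

If the count falls far outside 30 to 130, the bin width is probably a poor match for the intensity scale inside the mask.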

GitHub-username-hyphen commented 4 years ago

Hi Joao, The masks are binary, so the values are either 1 or 0. We're using a range of images, but have set our masks so that when overlaid on the images it should only cover positive values.

We originally used the default setting of binWidth = 25, but I changed it to 50 (arbitrary number) as suggested by Fedorov but still had the same result. We execute pyradiomics using PyCharm, which I believe calls upon the params.yaml file for the parameter file. Did not have this issue previously with other datasets.

JoaoSantinha commented 4 years ago

Sorry I was not clear: the range I was referring to is the range of intensities of the image where your mask is one.

Can you post here the command/lines of code that execute pyradiomics, so we can better help you?

Thanks

GitHub-username-hyphen commented 4 years ago

Hi Joao, Sorry I am a novice at all of this - I think this is what you were asking for?

`from radiomics import featureextractor, getTestCase
import six
import sys, os

# set constants
# raw strings keep the backslashes in the Windows paths literal
dataDir = r'E:\Texture Analysis Files\PyCharm Projects\SR_FE_Codes\IVIM_D_ThresholdedIndiv_20200110\Test'
forYAML = r'E:\Texture Analysis Files\PyCharm Projects\SR_FE_Codes'
params = os.path.join(forYAML, "examples", "exampleSettings", "Params.yaml")
extractor = featureextractor.RadiomicsFeatureExtractor(params)
a = []`

JoaoSantinha commented 4 years ago

That was exactly what I asked.

I think the issue might be the scale of your D values from the IVIM analysis. The bin width may be too high or too low for the values inside the mask. This depends on the units (mm²/s or other) and on how those values are represented (·10^-6 or other).
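
To illustrate why a bin width that is too large could produce the values reported at the top of this thread, here is a minimal sketch in plain Python. It assumes that entropy is computed as -sum(p * log2(p + eps)) with eps set to machine epsilon, and that bins are aligned to multiples of the bin width; both are assumptions of this sketch and not a copy of the library code. The sample values are hypothetical.

```python
import math
import sys


def discretize(values, bin_width):
    """Assign each value a fixed-width bin index, starting at 1."""
    lowest = math.floor(min(values) / bin_width)
    return [math.floor(v / bin_width) - lowest + 1 for v in values]


def entropy_and_uniformity(bins):
    """First-order entropy and uniformity of a list of bin indices."""
    eps = sys.float_info.epsilon  # assumed offset that avoids log2(0)
    n = len(bins)
    probs = [bins.count(b) / n for b in set(bins)]
    entropy = -sum(p * math.log2(p + eps) for p in probs)
    uniformity = sum(p * p for p in probs)
    return entropy, uniformity


# Hypothetical diffusion-like values, all far smaller than a bin width of 25.
values = [0.0001, 0.0005, 0.0012, 0.0020]

# With bin width 25, every value lands in the same bin.
coarse = entropy_and_uniformity(discretize(values, bin_width=25))
# With a bin width matched to the scale, the values spread over several bins.
fine = entropy_and_uniformity(discretize(values, bin_width=0.00002))

print("binWidth=25:", coarse)
print("binWidth=0.00002:", fine)
```

When all voxels share one bin, the single probability is 1, so uniformity is exactly 1 and entropy is a tiny negative number on the order of -3.2e-16, which is consistent with the reported results.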

I think the easiest way to understand the range of values would be to load the images and masks in 3D Slicer, open Segmentation Statistics (or something similar) under Quantification, check the minimum and maximum, and define the bin width accordingly.

Let me know if you were able to solve it

fedorov commented 4 years ago

You could also try using the binCount parameter as described here in place of the binWidth: https://pyradiomics.readthedocs.io/en/latest/customization.html#feature-class-level
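
For intuition on how a fixed bin count differs from a fixed bin width, here is a simplified sketch in plain Python. The function name and sample values are hypothetical, and the library's actual discretization may differ in details such as edge handling.

```python
def discretize_fixed_count(values, bin_count):
    """Map values to bin indices 1..bin_count using equal-width bins over min..max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant region has no range to split, so everything shares one bin.
        return [1 for _ in values]
    width = (hi - lo) / bin_count
    # The maximum sits on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width) + 1, bin_count) for v in values]


# Hypothetical diffusion-like values on a very small scale.
values = [0.0001, 0.0005, 0.0012, 0.0020]
bins = discretize_fixed_count(values, bin_count=4)
print(bins)
```

Because the bin edges are derived from each region's own minimum and maximum, the result does not depend on the absolute scale of the map, which is why the tiny values above still spread across bins.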

GitHub-username-hyphen commented 4 years ago

Hi Joao, Fedorov,

Thank you both for your suggestions. I have imported some of our cases into Slicer. We are looking at different IVIM maps (i.e., ADC, D, f) with different thresholds set, so the min/max vary. Some examples below though:

ADC: 8.33816e-05 to 0.00217475
D: 6.71817e-05 to 0.3
f: 0.000594766 to 0.299469

I see that the suggestion appears to be to set binWidth between 30 and 130, but I was wondering if either of you had suggestions as to what I could test? I am not sure in which direction we should change binWidth, given the various scales of the different maps.

Thank you very much for your help,

JoaoSantinha commented 4 years ago

So the issue is that a bin width of 25 or 50 is too high for the range (maximum - minimum) of values of ADC, D and f. For ADC, for instance, you can select a bin width of (0.00217475 - 8.33816e-05)/80, where 80 is the number of bins. The value 80 is just an example to get you a bin width. You can then do what is proposed in the pyradiomics documentation: run the feature extraction once, look at firstorder_range, and check whether dividing it by your selected bin width gives between 30 and 130 bins for the majority of patients (not a bin width between 30 and 130).
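
The arithmetic above can be sketched in plain Python as follows. The min/max values are the ones reported earlier in this thread; the helper name and the range used for the second case are hypothetical and only illustrate the check.

```python
def bin_width_for(lo, hi, target_bins=80):
    """Bin width that splits the range [lo, hi] into roughly target_bins bins."""
    return (hi - lo) / target_bins


# Min/max values reported in this thread for each map.
ranges = {
    "ADC": (8.33816e-05, 0.00217475),
    "D": (6.71817e-05, 0.3),
    "f": (0.000594766, 0.299469),
}

widths = {name: bin_width_for(lo, hi) for name, (lo, hi) in ranges.items()}
for name, width in widths.items():
    print(f"{name}: candidate binWidth of about {width:.3g}")

# Hypothetical firstorder range of another patient's ADC map.
other_case_range = 0.0015
n_bins = other_case_range / widths["ADC"]
within_suggestion = 30 <= n_bins <= 130
print(f"other case: about {n_bins:.0f} bins, within 30-130: {within_suggestion}")
```

Repeating the last check over the range of every patient shows whether one bin width per map keeps most cases between 30 and 130 bins.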

You can try the solution by Andrey and you will see that those values will change.

I hope this was clear enough to help you.

GitHub-username-hyphen commented 4 years ago

Thanks so much Joao, that was very helpful!

I'll try your suggestions and take a look to see what has been published as well.

GitHub-username-hyphen commented 4 years ago

Hi Fedorov, I am looking into adjusting binCount vs binWidth but cannot find a line in the code for binCount. Can you advise on where I can find this? Thank you

fedorov commented 4 years ago

It is easiest to experiment by using the command line tool for feature extraction and specifying parameters in the config file. Can you start with this config file, and specify binWidth in place of binCount?

GitHub-username-hyphen commented 4 years ago

Hi all,

I think this issue was resolved by changing the binWidth to reflect the scale of values in the image.

Thanks Joao, Fedorov!