moloney / dcmstack

DICOM to Nifti conversion with meta data preservation
Other
72 stars 51 forks source link

Conversion overflow #49

Closed ashgillman closed 6 years ago

ashgillman commented 7 years ago

Hi, When converting some PET images I get negative intensities in the output NIfTI for voxels with intensities > 2^15 on the original DICOM (i.e., probably a signed int overflow). I believe this is probably an upstream issue (pydicom or nibabel), and I am happy to report there and track. I am hoping first to see your thoughts on the issue.

image

Below is the DICOM header data of an offending image for reference:


    Dicom-Meta-Information-Header
    Used TransferSyntax:
    (0002,0000) UL 180 # 4,1 File Meta Information Group Length
    (0002,0001) OB 00\01 # 2,1 File Meta Information Version
    (0002,0002) UI [1.2.840.10008.5.1.4.1.1.128] # 28,1 Media Storage SOP Class UID
    (0002,0003) UI [1.3.12.2.1107.5.2.38.51040.2016110110532420312500560] # 52,1 Media Storage SOP Instance UID
    (0002,0010) UI [1.2.840.10008.1.2.1] # 20,1 Transfer Syntax UID
    (0002,0012) UI [1.3.12.2.1107.5.2] # 18,1 Implementation Class UID
    (0002,0013) SH [MR_VB20P] # 8,1 Implementation Version Name

    Dicom-Data-Set
    Used TransferSyntax: 1.2.840.10008.1.2.1
    (0008,0005) CS [ISO_IR 100] # 10,1-n Specific Character Set
    (0008,0008) CS [ORIGINAL\PRIMARY\STATIC\AC] # 26,2-n Image Type
    (0008,0012) DA [20161101] # 8,1 Instance Creation Date
    (0008,0013) TM [105330.109000 ] # 14,1 Instance Creation Time
    (0008,0016) UI [1.2.840.10008.5.1.4.1.1.128] # 28,1 SOP Class UID
    (0008,0018) UI [1.3.12.2.1107.5.2.38.51040.2016110110532420312500560] # 52,1 SOP Instance UID
    (0008,0020) DA [20161101] # 8,1 Study Date
    (0008,0021) DA [20161101] # 8,1 Series Date
    (0008,0022) DA [20161101] # 8,1 Acquisition Date
    (0008,0023) DA [20161101] # 8,1 Content Date
    (0008,0030) TM [093452.812000 ] # 14,1 Study Time
    (0008,0031) TM [100405.000000 ] # 14,1 Series Time
    (0008,0032) TM [100405.000000 ] # 14,1 Acquisition Time
    (0008,0033) TM [105330.109000 ] # 14,1 Content Time
    (0008,0050) SH (no value) # 0,1 Accession Number
    (0008,0060) CS [PT] # 2,1 Modality
    (0008,0070) LO [SIEMENS ] # 8,1 Manufacturer
    (0008,0080) LO [Anonymous ] # 10,1 Institution Name
    (0008,0081) ST [Anonymous ] # 10,1 Institution Address
    (0008,0090) PN [Anonymous ] # 10,1 Referring Physician's Name
    (0008,1010) SH [Anonymous ] # 10,1 Station Name
    (0008,1030) LO [XX^XX] # 32,1 Study Description
    (0008,103e) LO [*Abd_MRAC_PET_AC Images ] # 24,1 Series Description
    (0008,1040) LO [Anonymous ] # 10,1 Institutional Department Name
    (0008,1050) PN [Anonymous ] # 10,1-n Performing Physician's Name
    (0008,1090) LO [Biograph_mMR] # 12,1 Manufacturer's Model Name
    (0008,1140) SQ (Sequence with defined length) # 306,1 Referenced Image Sequence
    (fffe,e000) na (Item with defined length)
    (0008,1150) UI [1.2.840.10008.5.1.4.1.1.4] # 26,1 Referenced SOP Class UID
    (0008,1155) UI [1.3.12.2.1107.5.2.38.51040.2016110109493929278108444] # 52,1 Referenced SOP Instance UID
    (fffe,e000) na (Item with defined length)
    (0008,1150) UI [1.2.840.10008.5.1.4.1.1.4] # 26,1 Referenced SOP Class UID
    (0008,1155) UI [1.3.12.2.1107.5.2.38.51040.2016110109564887525712923] # 52,1 Referenced SOP Instance UID
    (fffe,e000) na (Item with defined length)
    (0008,1150) UI [1.2.840.10008.5.1.4.1.1.4] # 26,1 Referenced SOP Class UID
    (0008,1155) UI [1.3.12.2.1107.5.2.38.51040.2016110109482656483807702] # 52,1 Referenced SOP Instance UID
    (0008,1250) SQ (Sequence with defined length) # 150,1 Related Series Sequence
    (fffe,e000) na (Item with defined length)
    (0020,000d) UI [1.3.12.2.1107.5.2.38.51040.30000016103122414446800000004] # 56,1 Study Instance UID
    (0020,000e) UI [1.3.12.2.1107.5.2.38.51040.2016110110021659897623931.0.0.0] # 58,1 Series Instance UID
    (0040,a170) SQ (Sequence with defined length) # 0,1 Purpose of Reference Code Sequence
    (0010,0010) PN [Anonymous ] # 10,1 Patient's Name
    (0010,0020) LO [unknown ] # 8,1 Patient ID
    (0010,0030) DA [XXXXXXXX] # 8,1 Patient's Birth Date
    (0010,0040) CS [X ] # 2,1 Patient's Sex
    (0010,1010) AS [000Y] # 4,1 Patient's Age
    (0010,1020) DS [0.00] # 4,1 Patient's Size
    (0010,1030) DS [00] # 2,1 Patient's Weight
    (0018,0050) DS [2.03125 ] # 8,1 Slice Thickness
    (0018,1000) LO [51040 ] # 6,1 Device Serial Number
    (0018,1020) LO [syngo MR B20P ] # 14,1-n Software Version(s)
    (0018,1030) LO [Abd_MRAC_PET] # 12,1 Protocol Name
    (0018,1181) CS [NONE] # 4,1 Collimator Type
    (0018,1200) DA [20161101] # 8,1-n Date of Last Calibration
    (0018,1201) TM [083427.000000 ] # 14,1-n Time of Last Calibration
    (0018,1210) SH [XYZGAUSSIAN4.00 ] # 16,1-n Convolution Kernel
    (0018,1242) IS [2700000 ] # 8,1 Actual Frame Duration
    (0018,5100) CS [HFS ] # 4,1 Patient Position
    (0020,000d) UI [1.3.12.2.1107.5.2.38.51040.30000016110123080754600001906] # 56,1 Study Instance UID
    (0020,000e) UI [1.3.12.2.1107.5.2.38.51040.2016110110532300000555] # 50,1 Series Instance UID
    (0020,0010) SH [1 ] # 2,1 Study ID
    (0020,0011) IS [51] # 2,1 Series Number
    (0020,0013) IS [1 ] # 2,1 Instance Number
    (0020,0032) DS [-358.49165142605\-359.76148695776\133.46359705925 ] # 50,3 Image Position (Patient)
    (0020,0037) DS [1\0\0\0\1\0 ] # 12,6 Image Orientation (Patient)
    (0020,0052) UI [1.3.12.2.1107.5.2.38.51040.2.20161101094730687.0.0.0] # 52,1 Frame of Reference UID
    (0020,1002) IS [127 ] # 4,1 Images in Acquisition
    (0020,1040) LO (no value) # 0,1 Position Reference Indicator
    (0020,1041) DS [133.464 ] # 8,1 Slice Location
    (0020,4000) LT [Anonymous ] # 10,1 Image Comments
    (0028,0002) US 1 # 2,1 Samples per Pixel
    (0028,0004) CS [MONOCHROME2 ] # 12,1 Photometric Interpretation
    (0028,0010) US 344 # 2,1 Rows
    (0028,0011) US 344 # 2,1 Columns
    (0028,0030) DS [2.08626\2.08626 ] # 16,2 Pixel Spacing
    (0028,0051) CS [NORM\DTIM\MLAAFOV\ATTN\3SCAT\RELSC\DECY\FLEN\RANSM\XYSM\ZSM ] # 60,1-n Corrected Image
    (0028,0100) US 16 # 2,1 Bits Allocated
    (0028,0101) US 16 # 2,1 Bits Stored
    (0028,0102) US 15 # 2,1 High Bit
    (0028,0103) US 0 # 2,1 Pixel Representation
    (0028,0106) US 0 # 2,1 Smallest Image Pixel Value
    (0028,0107) US 5611 # 2,1 Largest Image Pixel Value
    (0028,1050) DS [15041 ] # 6,1-n Window Center
    (0028,1051) DS [30081 ] # 6,1-n Window Width
    (0028,1052) DS [0 ] # 2,1 Rescale Intercept
    (0028,1053) DS [5.3612170781579 ] # 16,1 Rescale Slope
    (0028,1054) LO [US] # 2,1 Rescale Type
    (0028,1055) LO [Algo1 ] # 6,1-n Window Center & Width Explanation
    (0029,0010) LO [SIEMENS CSA HEADER] # 18,1 Private Creator
    (0029,0011) LO [SIEMENS MEDCOM HEADER ] # 22,1 Private Creator
    (0029,0012) LO [SIEMENS MEDCOM HEADER2] # 22,1 Private Creator
    (0029,1008) CS [PET NUM 4 ] # 10,1 CSA Image Header Type
    (0029,1009) LO [20161101] # 8,1 CSA Image Header Version
    (0029,1010) OB 53\56\31\30\04\03\02\01\1c\00\00\00\4d\00\00\00\50\72\6f\74\6f\63\6f\6c\53\6c\69\63\65\4e\75\6d\62\65\72\00\11\11\11\00\12\12\12\00\13\13\13\00\14\14\14\00\15\15\15\00\16\16\16\00\17\17\17\00 # 2888,1 CSA Image Header Info
    (0029,1018) CS [PT] # 2,1 CSA Series Header Type
    (0029,1019) LO [20161101] # 8,1 CSA Series Header Version
    (0029,1020) OB 53\56\31\30\04\03\02\01\26\00\00\00\4d\00\00\00\55\73\65\64\50\61\74\69\65\6e\74\57\65\69\67\68\74\00\00\00\00\02\00\00\6c\01\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\43\73\61\53 # 124932,1 CSA Series Header Info
    (0029,1120) OB 4d\00\45\00\44\00\43\00\4f\00\4d\00\20\00\48\00\49\00\53\00\54\00\4f\00\52\00\59\00\20\00\56\00\31\00\2e\00\31\00\00\00\43\00\73\00\61\00\50\00\61\00\74\00\69\00\65\00\6e\00\74\00\00\00\30\00 # 272,1 MedCom History Information
    (0029,1260) LO [com ] # 4,1 Series Workflow Status
    (0032,1060) LO [XX] # 32,1 Requested Procedure Description
    (0040,0244) DA [20161101] # 8,1 Performed Procedure Step Start Date
    (0040,0245) TM [093452.906000 ] # 14,1 Performed Procedure Step Start Time
    (0040,0253) SH [MR20161101093452] # 16,1 Performed Procedure Step ID
    (0040,0254) LO [XX^XX] # 32,1 Performed Procedure Step Description
    (0054,0013) SQ (Sequence with defined length) # 32,1 Energy Window Range Sequence
    (fffe,e000) na (Item with defined length)
    (0054,0014) DS [430 ] # 4,1 Energy Window Lower Limit
    (0054,0015) DS [610 ] # 4,1 Energy Window Upper Limit
    (0054,0016) SQ (Sequence with defined length) # 178,1 Radiopharmaceutical Information Sequence
    (fffe,e000) na (Item with defined length)
    (0018,0031) LO [Fluorodeoxyglucose] # 18,1 Radiopharmaceutical
    (0018,1071) DS [0 ] # 2,1 Radiopharmaceutical Volume
    (0018,1072) TM [093013.000000 ] # 14,1 Radiopharmaceutical Start Time
    (0018,1074) DS [384800000 ] # 10,1 Radionuclide Total Dose
    (0018,1075) DS [6586.2] # 6,1 Radionuclide Half Life
    (0018,1076) DS [0.97] # 4,1 Radionuclide Positron Fraction
    (0054,0300) SQ (Sequence with defined length) # 56,1 Radionuclide Code Sequence
    (fffe,e000) na (Item with defined length)
    (0008,0100) SH [C-111A1 ] # 8,1 Code Value
    (0008,0102) SH [SRT ] # 4,1 Coding Scheme Designator
    (0008,0104) LO [^18^Fluorine] # 12,1 Code Meaning
    (0054,0081) US 127 # 2,1 Number of Slices
    (0054,0410) SQ (Sequence with defined length) # 0,1 Patient Orientation Code Sequence
    (0054,0414) SQ (Sequence with defined length) # 0,1 Patient Gantry Relationship Code Sequence
    (0054,1000) CS [WHOLE BODY\IMAGE] # 16,2 Series Type
    (0054,1001) CS [BQML] # 4,1 Units
    (0054,1002) CS [EMISSION] # 8,1 Counts Source
    (0054,1100) CS [DLYD] # 4,1 Randoms Correction Method
    (0054,1101) LO [measured] # 8,1 Attenuation Correction Method
    (0054,1102) CS [START ] # 6,1 Decay Correction
    (0054,1103) LO [OP-OSEM3i21s] # 12,1 Reconstruction Method
    (0054,1104) LO [DESCRIPTION ] # 12,1 Detector Lines of Response Used
    (0054,1105) LO [Model-based ] # 12,1 Scatter Correction Method
    (0054,1200) DS [60] # 2,1 Axial Acceptance
    (0054,1201) IS [5\6 ] # 4,2 Axial Mash
    (0054,1300) DS [1318054.1176629 ] # 16,1 Frame Reference Time
    (0054,1321) DS [1.1488] # 6,1 Decay Factor
    (0054,1322) DS [1 ] # 2,1 Dose Calibration Factor
    (0054,1323) DS [42.0091 ] # 8,1 Scatter Fraction Factor
    (0054,1330) US 1 # 2,1 Image Index
    (7fe0,0010) OW 00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00 # 236672,1 Pixel Data
moloney commented 7 years ago

Hello,

Thanks for reporting this and sorry for the delayed response.

I think this bug has sprung from my own hacky workaround here: https://github.com/moloney/dcmstack/blob/master/src/dcmstack/dcmstack.py#L766

I added that crappy workaround since we tend to make heavy use of the program fslview for viewing Nifti images, and it crashes on unsigned integer data...

I guess the least bad option is to check if this overflow is going to happen, and only do the work around if it won't overflow.

ashgillman commented 7 years ago

Ah, that looks like the culprit. Perhaps the type hack should be performed after populating vox_array.

if stack_dtype == np.uint16 and np.all(vox_array > 0):
    vox_array.astype(np.int16)
ashgillman commented 7 years ago

As an aside, could https://github.com/moloney/dcmstack/blob/master/src/dcmstack/dcmstack.py#L794 be replaced with:

vox_array = vox_array.squeeze()

Or are you trying to keep the implementation robust by only squeezing dims 3 and 4?

moloney commented 6 years ago

I think this should be fixed in commit eb042f19bcb6898c95713b86f205d09b5235650b. I took a slightly different approach to your PR and added some tests. Sorry for letting this sit for so long.

pocolya commented 5 years ago

I have the same issue, when converting a PET image to nifti I get negative values and I am using dcmstack version 0.6.2,can it be because of this?

moloney commented 5 years ago

The fix is in version 0.7, which is finally on PYPI (i.e. you can do 'pip install dcmstack').

pocolya commented 5 years ago

Thank you for the reply. Do you think it's possible to have this version in anaconda as well?

pocolya commented 5 years ago

I just looked into the code of the 0.7 version and I can't really see how it's changed, since I still see these lines that are responsible for the negative values :

if stack_dtype == np.uint16 and bits_stored < 16: stack_dtype = np.int16

pocolya commented 5 years ago

I just used the code available online and copied, it is working now, I don't get the negative values, however, I would like to have the slope and intercept taken in consideration, this doesn't seem to exist in Dcmstack, if I see correctly? But then, why when I make a nifti out of my stack (to_nifti plain or with embed_meta=True ), the same voxel value changes? The orientation?

moloney commented 5 years ago

I have no plans to package it for Anaconda (can't you pip install into an Anaconda environment?).

You can pass an empty string to --voxel-order to avoid any reorientation.

I believe the output should be pre-scaled floating point (i.e. slope/intercept transform is already applied).

pocolya commented 5 years ago

Thank you, setting the voxel_order='' helped, now I have the same values in nifti and the dicom stack, which are indeed floating point. And after verification, it is Nibabel that scales the data and it is used in dcmstack. So, in case it might be useful for others, in Nibabel get_pixel_array returns the raw data and get_data gives the scaled data.

pocolya commented 5 years ago

Sorry, I am being stubborn with this. But I found that in Nibabel there is a possibility to get unscaled data (img.dataobj.get_unscaled()). However, it only works for nifti data directly loaded from the disk and not with to_nifti for instance, and what is worse is that it still gives the scaled data. Would it mean that to_nifti does not preserve the unscaled data? How can one access such data with Dcmstack then?

moloney commented 5 years ago

Yeah I noticed when I fixed this bug that dcmstack will never save the unscaled data into the Nifti (copying over the slope/intercept from the DICOM data). I was thinking about adding this in, but didn't have the time. It really would just be a space saving optimization.

I opened a new issue (#62 ) to track this, but fixing it isn't a priority for me. I would accept a well written PR though...

pocolya commented 5 years ago

What is PR? Converting a two-hundred set of DICOMs I see that I still have the negative values problem. I don't know where it comes from considering there is not more bit of the code you used for FSL. I am thinking of using something else (dcm2niix?)

moloney commented 5 years ago

Negative values aren't necessarily indicative of a bug. What are the slope/intercept values in the DICOM files when you get negative values?

pocolya commented 5 years ago

They shouldn't be negative : ('Slope ', 3.0932071993164, 'Intercept ', 0.0) I compared one image pixel values by accessing the image data different ways and getting their dtype, here are some lines to avoid ambiguity and their outputs :

Some conclusions I make out of it :

If you have some explanation on this, or you want an image example and/or the code before, I can provide them. I can make it all with Dcm2niix, as I am pretty sure it works well, even though it's not a very elegant solution to my taste.

A part from this issue, I have some CSA field troubles (Siemens scanners).

Thank you for your help.

moloney commented 5 years ago

I am having a hard time following the code in your comment. Can you share a minimal script and data set that demonstrates the issue?

I don't see any negative values, but maybe there is some sort of precision loss?

I am also not sure why you are including the voxel_order in your permutations, it has no bearing on how the voxel intensities are scaled (it just affects how they are indexed).

It certainly is possible there is still some bug in how I am handling this in dcmstack, and if that is true I would like to fix it.

pocolya commented 5 years ago

I am having a hard time following the code in your comment. Can you share a minimal script and data set that demonstrates the issue?

Sorry, here is the code and one dicom folder attached.

I don't see any negative values, but maybe there is some sort of precision loss?

You can't see them in the pixel value I tested, but they do exist, and as you see, we get int and not unsigned int.

I am also not sure why you are including the voxel_order in your permutations, it has no bearing on how the voxel intensities are scaled (it just affects how they are indexed).

The voxel order was used as I had a different orientation compared to other patient's files (segmentation, masks and so on).

Anonyme.zip

comparePatients.py.zip