Project-MONAI / MONAI

AI Toolkit for Healthcare Imaging
https://monai.io/
Apache License 2.0
5.76k stars 1.06k forks source link

Need to transpose array obtained from itk instead of reversing of itk.size() in ITKReader #1357

Closed take5v closed 3 years ago

take5v commented 3 years ago

Description

In ITKReader, itk.size(img) should not be reversed since it outputs correct size of an image. Instead, itk.array_view_from_image(img, keep_axes=False) provides the transposed array and such numpy array must be transposed because of the differences of storing the data in numpy and itk.

This PR is wrong: https://github.com/Project-MONAI/MONAI/pull/1045. See also https://github.com/InsightSoftwareConsortium/ITK/issues/2006

Expected behavior

  1. Need to remove the reverse for itk.size.
  2. Transpose array obtained from itk.array_view_from_image.
aylward commented 3 years ago

Note that work has been done to now automate the conversion of Pytorch tensors to ITK images. See this commit that happened today:

https://github.com/InsightSoftwareConsortium/ITK/commit/c42436feae23b069c8a5e546c1d7cd4011a79c4b

Additionally, ITK Python now supports 4D images even when those 4D images are actually representing vectors of 3D images or 3D images of vectors.

Much testing is needed, and the MONAI code will perhaps need updating, but I think fixing ITK was the right first step.

@thewtex - Thanks!

thewtex commented 3 years ago

@take5v we need to avoid itk.array_view_from_image(img, keep_axes=False) because it results in data storage that is different from NumPy and PyTorch. We need to avoid unnecessary related performance penalties.

1045 adds the correct reversal in itk.size to correspond to the index ordering of a C-order NumPy array, described in https://github.com/InsightSoftwareConsortium/ITK/issues/2006. ITK uses IJK ordering, which C-order NumPy arrays are KJI.

Nic-Ma commented 3 years ago

Hi @thewtex and @aylward ,

Thanks for your detailed explanation here. Could you please help submit a MR to MONAI to update something once ITK fixed the order? Let's keep this ticket open to track it.

Thanks in advance.

take5v commented 3 years ago

Hi @aylward, @thewtex, @Nic-Ma!

I have made a simple test in order to check ITKReader currently implemented in MONAI. Beforehand, I converted nifty anatomical.nii (availble in MONAI test data) to DICOM.

TEST_CASE = [DICOM_PATH, NIFTY_PATH]

class TestITKLoader(unittest.TestCase):
    @parameterized.expand([TEST_CASE])
    def test_array_shape(self, dicom_path, nifty_path):
        itk_reader = ITKReader()
        itk_img = itk_reader.read(dicom_path)
        itk_array, itk_meta_data = itk_reader.get_data(itk_img)

        nifty_reader = NibabelReader()
        nifty_img = nifty_reader.read(nifty_path)
        nifty_array, nifty_metadata = nifty_reader.get_data(nifty_img)

        self.assertTupleEqual(itk_array.shape, nifty_array.shape)
        self.assertTupleEqual(tuple(
            itk_meta_data['spatial_shape']), tuple(nifty_metadata['spatial_shape']))

Currently this test is failed: itk_array shape is (25, 41, 33) whereas nifty_array shape is (33, 41, 25) itk_meta_data['spatial_shape'] is (25, 41, 33) whereas nifty_metadata['spatial_shape'] is (33, 41, 25)

That was a reason why I suggested to remove shape reversing and to add transposing of itk.array_view_from_image which is, at least, fixes a bug in ITKReader. As @thewtex and @aylward pointed out, it may be a better solution for this.

Please find attached DICOM files of the test image (anatomical.nii): anatomical_DICOM.zip

Additionally, this issue should be marked as a bug.

thewtex commented 3 years ago

@Nic-Ma @take5v just a note that I will take some time to follow up on this over the next few days.

thewtex commented 3 years ago

To follow-up, I made some progress on this, but I will come back to it after I return in the new year. Happy pre-2021!

Nic-Ma commented 3 years ago

Hi @thewtex ,

Have you got a chance to make some progress on this ticket?

Thanks.

thewtex commented 3 years ago

@Nic-Ma I know there is a lot of confusion in the community around indexing, why the order is the way it is, how indexing order relates to convension, memory layout, and performance. And why the order is not the way we sometimes want it to be.

In summary, the order provided by ITKReader is correct. It is consistent with NumPy index conventions, the memory layout, and optimal performance. The current NibabelReader returns a reversed shape that some folks in the neuroimaging community are used to, and this can work when working within the scope of nipy tools. However, it conflicts with most other image processing tools in the ecosystem, e.g. scikit-image, itk, imagej. And, while it may not make a big difference for neuroimaging datasets, which are often nearly anisotropically sampled, it will have unnecessary performance penalties for other imaging datasets, where in-plane resolution is often much higher than out-of-plane resolution, and 2D models are applied.

I am working on documentation and examples that explain this, but it will take some time.

Nic-Ma commented 3 years ago

Hi @thewtex ,

Really appreciate your help here!

Thanks in advance.

brad-t-moore commented 3 years ago

I'm not sure if this is related to a bug I'm currently running into (I haven't submitted an issue yet).

print(img.GetSpacing())
print(img.GetLargestPossibleRegion().GetSize())
type(img)
(55, 512, 512)
itkVectorD3 ([0.976562, 0.976562, 5])
itkSize3 ([512, 512, 55])
itk.itkImagePython.itkImageD3

And then when I'm trying to apply a Spacingd to that image (as loaded through MONAI) it's misapplying. The output is from print statements in the Compose apply transform loop (type(input_), type(transform), input['image'].shape):

val_transforms = Compose(
    [
        LoadImaged(keys=["image", "label"], reader='ITKReader'),
        AddChanneld(keys=["image", "label"]),
        Orientationd(keys=["image", "label"], axcodes="RAS"),
        Spacingd(keys=["image", "label"], pixdim=(1.0, 1.0, 1.0), mode=("bilinear", "nearest")),
        ScaleIntensityRanged(
            keys=["image"], a_min=-57, a_max=164, b_min=0.0, b_max=1.0, clip=True,
        ),
        CropForegroundd(keys=["image", "label"], source_key="image"),
        ToTensord(keys=["image", "label"]),
    ]
)
<class 'dict'>
<class 'monai.transforms.io.dictionary.LoadImaged'>
<class 'dict'>
<class 'monai.transforms.utility.dictionary.AddChanneld'>
image.shape:  (55, 512, 512)
<class 'dict'>
<class 'monai.transforms.spatial.dictionary.Orientationd'>
image.shape:  (1, 55, 512, 512)
<class 'dict'>
<class 'monai.transforms.spatial.dictionary.Spacingd'>
image.shape:  (1, 55, 512, 512)
<class 'dict'>
<class 'monai.transforms.intensity.dictionary.ScaleIntensityRanged'>
image.shape:  (1, 54, 500, 2556)

It's taking that last index and multiplying it by 5 due to the "z-axis" being a originally around 5mm in spacing. I'm calling this a bug because somewhere it's correctly calculating the spacing ratio but then applying it to the wrong index.

This is the state of the input_ in the Compose transform before applying Spacingd:

ipdb>  input_['image'].shape
(1, 55, 512, 512)
ipdb>  input_['image_meta_dict']
{'0008|0016': '1.2.840.10008.5.1.4.1.1.7.3', '0008|0018': '1.2.826.0.1.3680043.2.1125.1.90455018032395752885544164701193819', '0008|0020': '20210128', '0008|0030': '164956.536161 ', '0008|0050': '', '0008|0060': 'OT', '0008|0090': '', '0010|0010': '', '0010|0020': '', '0010|0030': '', '0010|0040': '', '0020|000d': '1.2.826.0.1.3680043.2.1125.1.28882890755346393261644168155624487', '0020|000e': '1.2.826.0.1.3680043.2.1125.1.68989042725981558895947946072747819', '0020|0010': '', '0020|0011': '', '0020|0013': '', '0020|0052': '1.2.826.0.1.3680043.2.1125.1.19359276313777369875433246900418117', '0028|0002': '1', '0028|0004': 'MONOCHROME2 ', '0028|0008': '55', '0028|0009': '(5200,9230)', '0028|0010': '512', '0028|0011': '512', '0028|0100': '16', '0028|0101': '16', '0028|0102': '15', '0028|0103': '0', '0028|1052': '-16383.75 ', '0028|1053': '15.999755859375 ', '0028|1054': 'US', 'origin': array([499.02319336, 499.02319336,   0.        ]), 'spacing': array([0.97656202, 0.97656202, 5.        ]), 'direction': array([[-1.,  0.,  0.],
       [ 0., -1.,  0.],
       [ 0.,  0.,  1.]]), 'original_affine': array([[ -0.97656202,   0.        ,   0.        , 499.02319336],
       [  0.        ,  -0.97656202,   0.        , 499.02319336],
       [  0.        ,   0.        ,   5.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ,   1.        ]]), 'affine': array([[ 9.76562023e-01,  0.00000000e+00,  0.00000000e+00,
         4.46288844e+02],
       [ 0.00000000e+00,  9.76562023e-01,  0.00000000e+00,
        -4.76836760e-07],
       [ 0.00000000e+00,  0.00000000e+00,  5.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         1.00000000e+00]]), 'spatial_shape': array([ 55, 512, 512]), 'filename_or_obj': './PVM-Girder-TestData/spleen_10.dcm'}

And here it is afterward:

ipdb>  input_['image'].shape
(1, 54, 500, 2556)
ipdb>  input_['image_meta_dict']
{'0008|0016': '1.2.840.10008.5.1.4.1.1.7.3', '0008|0018': '1.2.826.0.1.3680043.2.1125.1.90455018032395752885544164701193819', '0008|0020': '20210128', '0008|0030': '164956.536161 ', '0008|0050': '', '0008|0060': 'OT', '0008|0090': '', '0010|0010': '', '0010|0020': '', '0010|0030': '', '0010|0040': '', '0020|000d': '1.2.826.0.1.3680043.2.1125.1.28882890755346393261644168155624487', '0020|000e': '1.2.826.0.1.3680043.2.1125.1.68989042725981558895947946072747819', '0020|0010': '', '0020|0011': '', '0020|0013': '', '0020|0052': '1.2.826.0.1.3680043.2.1125.1.19359276313777369875433246900418117', '0028|0002': '1', '0028|0004': 'MONOCHROME2 ', '0028|0008': '55', '0028|0009': '(5200,9230)', '0028|0010': '512', '0028|0011': '512', '0028|0100': '16', '0028|0101': '16', '0028|0102': '15', '0028|0103': '0', '0028|1052': '-16383.75 ', '0028|1053': '15.999755859375 ', '0028|1054': 'US', 'origin': array([499.02319336, 499.02319336,   0.        ]), 'spacing': array([0.97656202, 0.97656202, 5.        ]), 'direction': array([[-1.,  0.,  0.],
       [ 0., -1.,  0.],
       [ 0.,  0.,  1.]]), 'original_affine': array([[ -0.97656202,   0.        ,   0.        , 499.02319336],
       [  0.        ,  -0.97656202,   0.        , 499.02319336],
       [  0.        ,   0.        ,   5.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ,   1.        ]]), 'affine': array([[ 1.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         4.46288844e+02],
       [ 0.00000000e+00,  1.00000000e+00,  0.00000000e+00,
        -4.76836760e-07],
       [ 0.00000000e+00,  0.00000000e+00,  1.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         1.00000000e+00]]), 'spatial_shape': array([ 55, 512, 512]), 'filename_or_obj': './PVM-Girder-TestData/spleen_10.dcm'}

@thewtex

EDIT: this is using the release packages via pip (itk 5.1.2, monai 0.4.0)

thewtex commented 3 years ago

@brad-t-moore original_affine is not correct in this case.

print(img.GetSpacing()) itkVectorD3 ([0.976562, 0.976562, 5])

This is ITK, Fortran order.

image.shape: (55, 512, 512)

This is NumPy, C order.

But the affine matrix is not in NumPy C order.

'original_affine': array([[ -0.97656202,   0.        ,   0.        , 499.02319336],
       [  0.        ,  -0.97656202,   0.        , 499.02319336],
       [  0.        ,   0.        ,   5.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ,   1.        ]]
thewtex commented 3 years ago

this is using the release packages via pip (itk 5.1.2, monai 0.4.0)

We will soon release ITK 5.2, RC 2, which will make it possible to simplify the MONAI code. I will up update MONAI and look into this original_affine issue.

aylward commented 3 years ago

Should try updating to ITK 5.2 rc2 or greater.

brad-t-moore commented 3 years ago

@thewtex @aylward still seeing this issue in ITK 5.2-RC3. I need to finish what I'm working on but I'll dig into troubleshooting it.

brad-t-moore commented 3 years ago

Also using monai-weekly

dzenanz commented 3 years ago

I plan to work on this issue soon - in case somebody is already working on it or planned to.

brad-t-moore commented 3 years ago

Well, I started getting into it but it's a matter of timing right now. The things I wanted to focus on were:

  1. Making sure the affine matrix is RAS and not LPS. The Orientation transform in MONAI seems to assume all data is loaded as RAS before using.
  2. The best way to make sure everything gets into CHWD[T] on the final tensor. My current reading of the MONAI documentation (someone correct me) is that you ought to have a SpacingTransform with diagonal=True if you have your data to eventually be co-aligned.
  3. Along the lines of #2. The Channel dimension doesn't seem to be considered in the affine, i.e., it better be the first dimension from the get-go? I was scraping together some test cases trying to understand how 4D VectorImages or 5D images would be stored. It seems that we do a good job of making the IO create VectorImage's of mult-channel 4D data (as opposed to 5D images)?
  4. ITK's own OrientImageFilter seems, in the documentation, to mis-define what R and L mean: "ITK_COORDINATE_ORIENTATION_RIP: Right to Left varies fastest (0th pixel on Subject's right)", but at least in the DICOM format the R/L denote patient orientation going towards positive-infinity (I read 0th pixel on the subjects right as an L orientation, not R as described). It doesn't practically matter as that filter is relative to the user-specified input orientation (which OrientImageFilter unfortunately has a default value for input orientation that isn't the ITK LPS).
  5. Speed! If we get this fixed, I'd like to see a speed comparison with the NibabelReader on Nifty images.

On Tue, May 25, 2021 at 6:12 PM Dženan Zukić @.***> wrote:

I plan to work on this issue soon - in case somebody is already working on it or planned to.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Project-MONAI/MONAI/issues/1357#issuecomment-848304221, or unsubscribe https://github.com/notifications/unsubscribe-auth/AOGPH7EPOLSALM7XQGM5TH3TPQOGVANCNFSM4UWX6JTA .

-- Brad T. Moore, Ph.D. Staff R&D Engineer, Medical Computing Kitware, Inc.

http://www.kitware.com

brad-t-moore commented 3 years ago

Oh and 6. Someone confirming the edge case of a negative Index on LargestPossibleRegion(). My understanding is that ITK's Origin is defined as the physical point when Index = 0 (notoriously caused by the PadImageFilter). I think Nifty and DICOM usually define Origin (or whatever they are calling it) as the first element in the memory buffer.

On Wed, May 26, 2021 at 10:38 AM Brad Moore @.***> wrote:

Well, I started getting into it but it's a matter of timing right now. The things I wanted to focus on were:

  1. Making sure the affine matrix is RAS and not LPS. The Orientation transform in MONAI seems to assume all data is loaded as RAS before using.
  2. The best way to make sure everything gets into CHWD[T] on the final tensor. My current reading of the MONAI documentation (someone correct me) is that you ought to have a SpacingTransform with diagonal=True if you have your data to eventually be co-aligned.
  3. Along the lines of #2. The Channel dimension doesn't seem to be considered in the affine, i.e., it better be the first dimension from the get-go? I was scraping together some test cases trying to understand how 4D VectorImages or 5D images would be stored. It seems that we do a good job of making the IO create VectorImage's of mult-channel 4D data (as opposed to 5D images)?
  4. ITK's own OrientImageFilter seems, in the documentation, to mis-define what R and L mean: "ITK_COORDINATE_ORIENTATION_RIP: Right to Left varies fastest (0th pixel on Subject's right)", but at least in the DICOM format the R/L denote patient orientation going towards positive-infinity (I read 0th pixel on the subjects right as an L orientation, not R as described). It doesn't practically matter as that filter is relative to the user-specified input orientation (which OrientImageFilter unfortunately has a default value for input orientation that isn't the ITK LPS).
  5. Speed! If we get this fixed, I'd like to see a speed comparison with the NibabelReader on Nifty images.

On Tue, May 25, 2021 at 6:12 PM Dženan Zukić @.***> wrote:

I plan to work on this issue soon - in case somebody is already working on it or planned to.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Project-MONAI/MONAI/issues/1357#issuecomment-848304221, or unsubscribe https://github.com/notifications/unsubscribe-auth/AOGPH7EPOLSALM7XQGM5TH3TPQOGVANCNFSM4UWX6JTA .

-- Brad T. Moore, Ph.D. Staff R&D Engineer, Medical Computing Kitware, Inc.

http://www.kitware.com

-- Brad T. Moore, Ph.D. Staff R&D Engineer, Medical Computing Kitware, Inc.

http://www.kitware.com

dzenanz commented 3 years ago

ITK_COORDINATE_ORIENTATION_RIP is equivalent to DICOM LPS: Right to Left varies fastest (0th pixel on Subject's right).

Image's origin corresponds to index 0, in case the LargestPossibleRegion's index is different from 0.

If you already started, please keep going. I wrote that in order to avoid duplication of work. Thanks for replying in a timely fashion.

wyli commented 3 years ago

original_affine is not correct in this case.

print(img.GetSpacing()) itkVectorD3 ([0.976562, 0.976562, 5])

This is ITK, Fortran order.

image.shape: (55, 512, 512)

This is NumPy, C order.

But the affine matrix is not in NumPy C order.

'original_affine': array([[ -0.97656202,   0.        ,   0.        , 499.02319336],
       [  0.        ,  -0.97656202,   0.        , 499.02319336],
       [  0.        ,   0.        ,   5.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ,   1.        ]]

Hi @thewtex, I'm using itk 5.2.0.post3, could you suggest the correct way of setting the affine? the current impl. is: https://github.com/Project-MONAI/MONAI/blob/af0d51be9e484e0fc9e2c4deb270b79bfb1f5b4e/monai/data/image_reader.py#L265-L272