Project-MONAI / MONAI

AI Toolkit for Healthcare Imaging
https://monai.io/
Apache License 2.0

CropForegroundd only works for images with the same size and orientation #3182

Closed Spenhouet closed 8 months ago

Spenhouet commented 2 years ago

**Describe the bug**

The bounding box for the crop is not projected into the space of the target image. This leads to wrong crops if the target image has a different FoV or orientation than the image from which the bounding box was taken.

**To Reproduce**

  1. Download the following two images: seg.nii.gz, thalamic.nii.gz
  2. Run the code below. This assumes that the fix proposed in https://github.com/Project-MONAI/MONAI/issues/3167 is used.
    
```python
from pathlib import Path

import numpy as np

from monai.transforms.compose import Compose

# Use the fixed transform as proposed in #3167
from path.to.the.fixed.transform import CropForegroundd
from monai.transforms.io.dictionary import LoadImaged, SaveImaged
from monai.transforms.utility.dictionary import EnsureChannelFirstd, ToTensord

path = Path('scratch/monai_crop_bug')

SEG_KEY = 'seg'
THALA_KEY = 'thalamic'
FILE_KEYS = [SEG_KEY, THALA_KEY]

data = {
    SEG_KEY: path / 'seg.nii.gz',
    THALA_KEY: path / 'thalamic.nii.gz',
}

margin = 20

process = Compose([
    LoadImaged(FILE_KEYS),
    ToTensord(FILE_KEYS),
    EnsureChannelFirstd(FILE_KEYS),
    CropForegroundd(FILE_KEYS, source_key=SEG_KEY, margin=margin),
    SaveImaged(
        FILE_KEYS,
        output_dir='scratch/monai_crop_bug',
        output_postfix=f"crop_{margin}",
        resample=False,
        output_dtype=np.int8,
        separate_folder=False,
    ),
])

results = process([data])
```

  3. You should get: [thalamic_crop_20.nii.gz](https://github.com/Project-MONAI/MONAI/files/7408872/thalamic_crop_20.nii.gz)
     The white area is the original and the green area is the result after the crop. It should not have been cropped (it should still show the whole area).

![wrong crop](https://user-images.githubusercontent.com/7819068/138670547-1a1e9ea7-b853-4ad3-9caf-b4add4e126f7.png)

**Expected behavior**

The crop works for combinations of images of all sizes and orientations.

The best solution is probably to project the bounding box into the space of the target image.
I'm not yet 100% sure how this could be implemented, but I guess it would be something like this (see the sketch after the list):

1. Project the bounding box into mm (world) space using the source image's affine.
2. Somehow project the bounding box back into the voxel space of the target image. <- I'm unsure how to do this step.
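
A minimal sketch of what this projection could look like, assuming 4x4 homogeneous NIfTI affines (`project_voxel` is a hypothetical helper, not an existing MONAI function):

```python
import numpy as np

def project_voxel(v_source: np.ndarray, affine_source: np.ndarray, affine_target: np.ndarray) -> np.ndarray:
    """Map a voxel coordinate from the source image grid to the target image grid."""
    v_h = np.append(v_source, 1.0)                 # homogeneous coordinate
    mm = affine_source @ v_h                       # source voxel space -> mm (world) space
    v_target = np.linalg.inv(affine_target) @ mm   # mm space -> target voxel space
    return v_target[:-1]
```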

**Environment**

<details>
  <summary>Click to view the environment</summary>

Printing MONAI config...

MONAI version: 0.7.0
Numpy version: 1.19.2
Pytorch version: 1.8.1+cu111
MONAI flags: HAS_EXT = False, USE_COMPILED = False
MONAI rev id: bfa054b9c3064628a21f4c35bbe3132964e91f43

Optional dependencies:
Pytorch Ignite version: NOT INSTALLED or UNKNOWN VERSION.
Nibabel version: 3.1.1
scikit-image version: 0.18.0
Pillow version: 8.3.2
Tensorboard version: 2.7.0
gdown version: NOT INSTALLED or UNKNOWN VERSION.
TorchVision version: NOT INSTALLED or UNKNOWN VERSION.
tqdm version: 4.62.3
lmdb version: NOT INSTALLED or UNKNOWN VERSION.
psutil version: NOT INSTALLED or UNKNOWN VERSION.
pandas version: 1.2.4
einops version: NOT INSTALLED or UNKNOWN VERSION.
transformers version: NOT INSTALLED or UNKNOWN VERSION.

For details about installing the optional dependencies, please visit:
    https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies

Printing system config...

System: Windows
Win32 version: ('10', '10.0.19041', 'SP0', '')
Platform: Windows-10-10.0.19041-SP0
Processor: Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
Machine: AMD64
Python version: 3.7.7
Process name: python.exe
Command: ['C:\\Users\\SebastianPenhouet\\AppData\\Local\\Programs\\Python\\Python37\\python.exe', '-c', 'import monai; monai.config.print_debug_info()']
Open files: [popenfile(path='C:\\Windows\\System32\\de-DE\\KernelBase.dll.mui', fd=-1), popenfile(path='C:\\Windows\\System32\\de-DE\\kernel32.dll.mui', fd=-1)]
Num physical CPUs: 6
Num logical CPUs: 12
Num usable CPUs: 12
CPU usage (%): [16.3, 11.8, 26.1, 20.3, 12.4, 7.2, 17.6, 16.3, 15.0, 9.2, 12.4, 58.2]
CPU freq. (MHz): 3501
Load avg. in last 1, 5, 15 mins (%): [0.0, 0.0, 0.0]
Disk usage (%): 97.8
Avg. sensor temp. (Celsius): UNKNOWN for given OS
Total physical memory (GB): 31.9
Available memory (GB): 18.6
Used memory (GB): 13.3

Printing GPU config...

Num GPUs: 1
Has CUDA: True
CUDA version: 11.1
cuDNN enabled: True
cuDNN version: 8005
Current device: 0
Library compiled for CUDA architectures: ['sm_37', 'sm_50', 'sm_60', 'sm_61', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'compute_37']
GPU 0 Name: Quadro K2200
GPU 0 Is integrated: False
GPU 0 Is multi GPU board: False
GPU 0 Multi processor count: 5
GPU 0 Total memory (GB): 4.0
GPU 0 CUDA capability (maj.min): 5.0

</details>
Spenhouet commented 2 years ago

It took me some time, but this can be done by:

  1. Getting all corners of the 3D bounding box.
  2. Projecting all corners into the voxel space of the target image: first project into mm space with the source affine, then into the target voxel space with the inverse of the target affine.
  3. Taking the min and max over all corner coordinates to find the new bounding box.
  4. Rounding and converting the bounding box values to make them discrete.
  5. Clipping the bounding box to the start and end of the target image.

In addition to the changes proposed in https://github.com/Project-MONAI/MONAI/issues/3167, I made the following changes to make it work:

```python
import itertools
from typing import Callable, Dict, Hashable, Mapping, Optional, Sequence, Union

import numpy as np

from monai.config import IndexSelection, KeysCollection, NdarrayOrTensor
from monai.transforms.croppad.dictionary import NumpyPadModeSequence
from monai.transforms.inverse import InvertibleTransform
from monai.transforms.transform import MapTransform
from monai.transforms.utils import is_positive
from monai.utils import NumpyPadMode, ensure_tuple_rep
from monai.utils.type_conversion import convert_data_type

# `CropForeground` must be the patched version from #3167; the stock
# transform's __call__ does not accept box_start/box_end/meta_data.
from path.to.the.fixed.transform import CropForeground


def get_box(arr: NdarrayOrTensor) -> np.ndarray:
    """For an image, get an array of n-dimensional start and end points."""
    bottom_corner = np.array(arr.shape)[-3:]  # only use the spatial (coord) dims
    return np.array([np.zeros_like(bottom_corner), bottom_corner - 1])

def get_corners(border_box_points: np.ndarray) -> np.ndarray:
    """Enumerate all 2**n corner points of an n-dim box given as [start, end]."""
    return np.array(list(itertools.product(*zip(*border_box_points))))

def project_points(points: np.ndarray, affine_origin: np.ndarray, affine_target: np.ndarray):
    """Map points from the origin voxel space to the target voxel space via mm space."""
    points_ = np.insert(points, points.shape[-1], 1, axis=-1)  # homogeneous coords
    inv_affine_target = np.linalg.inv(affine_target)
    projection = inv_affine_target @ affine_origin
    return (points_ @ projection.T)[:, :-1]

def minmax(data, axis=None) -> np.ndarray:
    return np.array([np.min(data, axis=axis), np.max(data, axis=axis)])

def project_box(box: np.ndarray,
                affine_origin: np.ndarray,
                affine_target: np.ndarray,
                clip_box: Optional[np.ndarray] = None,
                discrete: bool = False) -> np.ndarray:
    # project all corners, then take the axis-aligned min/max as the new box
    corners = get_corners(box)
    projected_points = project_points(corners, affine_origin, affine_target)
    projected_box = minmax(projected_points, axis=0)

    if discrete:
        projected_box = np.round(projected_box).astype(np.int32)

    if clip_box is not None:
        projected_box = np.clip(projected_box, a_min=clip_box[0], a_max=clip_box[1])

    return projected_box

class CropForegroundd(MapTransform, InvertibleTransform):
    """
    Dictionary-based version of :py:class:`monai.transforms.CropForeground`.
    Crop only the foreground object of the expected images.
    The typical usage is to help training and evaluation if the valid part is small in the whole medical image.
    The valid part can be determined by any field in the data with `source_key`, for example:
    - Select values > 0 in image field as the foreground and crop on all fields specified by `keys`.
    - Select label = 3 in label field as the foreground to crop on all fields specified by `keys`.
    - Select label > 0 in the third channel of a One-Hot label field as the foreground to crop all `keys` fields.
    Users can define an arbitrary function to select the expected foreground from the whole source image or specified
    channels. It can also add a margin to every dim of the bounding box of the foreground object.
    """

    def __init__(
        self,
        keys: KeysCollection,
        source_key: str,
        select_fn: Callable = is_positive,
        channel_indices: Optional[IndexSelection] = None,
        margin: Union[Sequence[int], int] = 0,
        k_divisible: Union[Sequence[int], int] = 1,
        mode: NumpyPadModeSequence = NumpyPadMode.CONSTANT,
        start_coord_key: str = "foreground_start_coord",
        end_coord_key: str = "foreground_end_coord",
        allow_missing_keys: bool = False,
        **np_kwargs,
    ) -> None:
        """
        Args:
            keys: keys of the corresponding items to be transformed.
                See also: :py:class:`monai.transforms.compose.MapTransform`
            source_key: data source to generate the bounding box of foreground, can be image or label, etc.
            select_fn: function to select expected foreground, default is to select values > 0.
            channel_indices: if defined, select foreground only on the specified channels
                of image. if None, select foreground on the whole image.
            margin: add margin value to spatial dims of the bounding box, if only 1 value provided, use it for all dims.
            k_divisible: make each spatial dimension divisible by k, default to 1.
                if `k_divisible` is an int, the same `k` is applied to all the input spatial dimensions.
            mode: padding mode {``"constant"``, ``"edge"``, ``"linear_ramp"``, ``"maximum"``, ``"mean"``,
                ``"median"``, ``"minimum"``, ``"reflect"``, ``"symmetric"``, ``"wrap"``, ``"empty"``}
                one of the listed string values or a user supplied function. Defaults to ``"constant"``.
                see also: https://numpy.org/doc/1.18/reference/generated/numpy.pad.html
                it also can be a sequence of string, each element corresponds to a key in ``keys``.
            start_coord_key: key to record the start coordinate of spatial bounding box for foreground.
            end_coord_key: key to record the end coordinate of spatial bounding box for foreground.
            allow_missing_keys: don't raise exception if key is missing.
            np_kwargs: other args for `np.pad` API, note that `np.pad` treats channel dimension as the first dimension.
                more details: https://numpy.org/doc/1.18/reference/generated/numpy.pad.html

        """
        super().__init__(keys, allow_missing_keys)
        self.source_key = source_key
        self.start_coord_key = start_coord_key
        self.end_coord_key = end_coord_key
        self.cropper = CropForeground(
            select_fn=select_fn,
            channel_indices=channel_indices,
            margin=margin,
            k_divisible=k_divisible,
            **np_kwargs,
        )
        self.mode = ensure_tuple_rep(mode, len(self.keys))

    def __call__(self, data: Mapping[Hashable, np.ndarray]) -> Dict[Hashable, np.ndarray]:
        d = dict(data)
        img: np.ndarray
        img, *_ = convert_data_type(d[self.source_key], np.ndarray)  # type: ignore
        source_meta_key = f"{self.source_key}_meta_dict"
        bounding_box = self.cropper.compute_bounding_box(img)
        source_affine = d[source_meta_key]['affine']
        d[self.start_coord_key] = bounding_box[0]
        d[self.end_coord_key] = bounding_box[1]
        for key, m in self.key_iterator(d, self.mode):
            meta_key = f"{key}_meta_dict"
            box = project_box(
                np.array(bounding_box),
                affine_origin=source_affine,
                affine_target=d[meta_key]['affine'],
                clip_box=get_box(d[key]),
                discrete=True,
            )
            self.push_transform(d, key, extra_info={"box_start": box[0], "box_end": box[1]})
            d[key], d[meta_key] = self.cropper(img=d[key],
                                               box_start=box[0],
                                               box_end=box[1],
                                               mode=m,
                                               meta_data=d[meta_key])

        return d
```
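
To illustrate, a hypothetical minimal usage of `project_box` with made-up affines (a 1 mm isotropic source grid and a 2 mm isotropic target grid sharing the same origin):

```python
import numpy as np

affine_source = np.eye(4)                      # 1 mm isotropic voxels
affine_target = np.diag([2.0, 2.0, 2.0, 1.0])  # 2 mm isotropic voxels, same origin

box = np.array([[10, 10, 10], [50, 60, 70]])   # [start, end] in source voxel space
print(project_box(box, affine_source, affine_target, discrete=True))
# [[ 5  5  5]
#  [25 30 35]]
```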

Now the results are correct.


Nic-Ma commented 2 years ago

Hi @Spenhouet ,

Thanks for your feedback! I think that's the expected behavior of CropForegroundd, which requires the input images to have the same orientation and size. As for how to align image and label images before the transform, I think that's a topic we are discussing in another ticket.

Thanks.

Spenhouet commented 2 years ago

Hi @Nic-Ma,

in our case we cannot align the images before the CropForegroundd transform due to their size: aligning them prior to the crop would not fit into RAM.

Why do you call this "expected behavior" when I clearly show that the current function does not work and outputs wrong results, and I even already provide a fix? Also, why is this labeled "question" when it is clearly a bug? I noticed this on my other bug reports as well.

What speaks against integrating my fix?

Nic-Ma commented 2 years ago

Hi @Spenhouet ,

Sorry, maybe I didn't make it clear: by "expected behavior" I mean that CropForegroundd expects input images with the same size and orientation, etc. We try to make every transform simple and straightforward, performing only one independent thing if possible, so I would suggest adding another transform to align the images before CropForegroundd instead of adding more logic to this existing transform. CC @ericspod @wyli for further discussion.
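
For reference, a minimal sketch of such a pre-alignment step using existing MONAI transforms (this makes orientation and spacing consistent; it does not by itself match image sizes or fields of view, and it assumes the resampled images fit in memory):

```python
from monai.transforms import Compose, EnsureChannelFirstd, LoadImaged, Orientationd, Spacingd

keys = ["seg", "thalamic"]

# Sketch: bring all inputs to a common orientation and voxel spacing
# before CropForegroundd, so one bounding box is valid for every key.
align = Compose([
    LoadImaged(keys),
    EnsureChannelFirstd(keys),
    Orientationd(keys, axcodes="RAS"),
    Spacingd(keys, pixdim=(1.0, 1.0, 1.0)),
])
```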

Thanks in advance.

Spenhouet commented 2 years ago

@Nic-Ma thanks for the clarification.

I get the desire for doing only one thing, and my fix does not change that: with my fix the transform still performs only one thing. Maybe there is some misunderstanding. My fix does not align the images, it aligns the bounding boxes; it still only performs the crop with those bounding boxes. The only difference is that the bounding boxes actually fit the target image space (which they currently do not).

ericspod commented 2 years ago

I think I understand the issue here and the fix though @rijobro would be best to consult with on transforms when he's back.

rijobro commented 2 years ago

Possibly related -- https://github.com/Project-MONAI/MONAI/discussions/3319 and https://github.com/Project-MONAI/MONAI/discussions/3320? If so, it seems there are multiple places in MONAI transforms in which inputting different sized images/labels will cause problems. As much as I like the suggested code change in this issue, I'm not sure we're planning on committing to the large undertaking of updating all MONAI transforms to support different sized images/labels. Perhaps better to make the hard requirement that image/label dimensions match (and perhaps make this clearer in our documentation).

As a potential solution to your problem @Spenhouet, what about a transform that crops images so that they match the smallest image? If the input images were the same size, I think that would solve all your problems.

Spenhouet commented 2 years ago

Hi @rijobro,

I don't think our use case is clear: for our data we need this, and there is no way around it. Prior resampling is not possible since it would not fit into RAM. The individual images are from different areas of the scan and therefore mostly do not overlap.

I would not see this as such a large undertaking. Why not just fix the ones that are reported? This could also be a community-driven improvement (as with this issue and the implementation provided by me). Also, we are using MONAI extensively, and for most transforms this is not an issue. So as it currently stands, there are 3 transforms that would need an adjustment. This does not sound so bad to me (or like a huge commitment).

My suggested change just makes the code more generic and widely applicable, so I do not see a downside to it. By the way, before anything is implemented, please ping me: I believe we made further internal code changes which I could sync here.

vikashg commented 8 months ago

out of scope according to Richard Brown

Spenhouet commented 8 months ago

Sad to hear.