Wrong annotations in AnnotatedSliceDataset

facebookresearch / fastMRI

A large-scale dataset of both raw MRI measurements and clinical MRI images.

https://fastmri.org

MIT License


Wrong annotations in AnnotatedSliceDataset #309

Closed Duplums closed 1 year ago

Duplums commented 1 year ago

AnnotatedSliceDataset uses the metadata attribute from SliceDataset to attach annotations to each slice. However, metadata is currently a single Python dict shared across all slices from the same subject, so every slice of that subject ends up with the same annotation in AnnotatedSliceDataset. This is a bug, since different slices have distinct annotations.

A simple example to reproduce:

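from fastmri.data.mri_data import AnnotatedSliceDataset

dataset = AnnotatedSliceDataset("/data/fastmri/knee/singlecoil_train", "singlecoil", "knee", "first", use_dataset_cache=True)
# Without a transform, each item is a 6-tuple; the fifth element is the file name
kspace, mask, target, attrs, fname, dataslice = dataset[0]

# Expected to pass while the bug is present: the annotation's slice index differs from the loaded slice
assert attrs["annotation"]["slice"] != dataslice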

mmuckley commented 1 year ago

Hello @Duplums, thanks for raising this issue. I think it was caused by the current code modifying metadata in place rather than working on a copy. I've opened PR #310. Could you review it and see if it solves your issue?
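For readers following along, here is a minimal sketch of the aliasing problem described above. It does not use the fastMRI code itself: the dict contents and variable names (`shared_metadata`, `base_metadata`, `padding_left`) are made up for illustration, and copying the dict is just one way to avoid the shared state, not necessarily what the PR does.

```python
import copy

slice_indices = [0, 1, 2]

# In-place: every entry points at the same dict object, so the last
# write to "annotation" is what all slices end up seeing.
shared_metadata = {"padding_left": 18}  # hypothetical metadata content
in_place = []
for s in slice_indices:
    shared_metadata["annotation"] = {"slice": s}
    in_place.append(shared_metadata)

# Out-of-place: each slice gets its own copy before the annotation is added.
base_metadata = {"padding_left": 18}  # hypothetical metadata content
out_of_place = []
for s in slice_indices:
    per_slice = copy.deepcopy(base_metadata)
    per_slice["annotation"] = {"slice": s}
    out_of_place.append(per_slice)

in_place_slices = [m["annotation"]["slice"] for m in in_place]
out_of_place_slices = [m["annotation"]["slice"] for m in out_of_place]
print(in_place_slices)      # [2, 2, 2]
print(out_of_place_slices)  # [0, 1, 2]
```

In the in-place version all three list entries are the same object, which is why every slice reports the annotation written last.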

Duplums commented 1 year ago

Somewhat related to the previous issue, I also noticed that some "y" annotations have negative values (-1) because the way they are computed is wrong (line 579 in mri_data.py):

...
"y": 320 - int(row.y) - int(row.height) - 1,
...

When row.y == 320 - row.height, this is equal to -1 (e.g. file1000307.h5 in knee dataset, slice 37). Here the correct computation is:

"y": 320 - int(row.y) - int(row.height),
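The arithmetic can be checked with a small standalone sketch. The helper names `flip_y_old` and `flip_y_fixed` and the box height of 40 are made up for illustration (they are not taken from mri_data.py or from the file mentioned above); only the two formulas come from this thread.

```python
def flip_y_old(y, height, image_height=320):
    # Formula currently in the code: has an extra "- 1"
    return image_height - int(y) - int(height) - 1


def flip_y_fixed(y, height, image_height=320):
    # Proposed formula: a box touching the image edge maps to 0
    return image_height - int(y) - int(height)


height = 40            # hypothetical box height
y = 320 - height       # box that ends exactly at the image edge

old_value = flip_y_old(y, height)
new_value = flip_y_fixed(y, height)
print(old_value)  # -1
print(new_value)  # 0
```

Since a box inside a 320-pixel image satisfies 0 <= y <= 320 - height, the fixed formula always returns a value between 0 and 320 - height, while the old one is shifted down by one and can go negative.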

Best

mmuckley commented 1 year ago

Hello @Duplums, could you let me know which sample this is? When I run the following:

print(annotations_csv["y"].max())

the output is 280.0.

Edit: sorry, I didn't read your comment carefully. Looking into this.

mmuckley commented 1 year ago

Hello @Duplums, could you look at PR #311? I modified the calculation to your format. I also simplified the overall annotation code and removed the hardcoded 320, changing it to use the image size from the metadata.

mmuckley commented 1 year ago

Closing as #311 is merged.