facebookresearch / fastMRI

A large-scale dataset of both raw MRI measurements and clinical MRI images.
https://fastmri.org
MIT License

Problems with AnnotatedSliceDataset class #275

Closed: NikolasMorshuis closed this issue 1 year ago

NikolasMorshuis commented 1 year ago

Hi, there are two problems with the AnnotatedSliceDataset class:

1.: annotation_version is an optional argument of the class. Its description says:

Default value is None, then the latest version will be used.

However, if we do not specify the version and leave the default value None, no annotation data will be fetched. Either the description should be adjusted or the program should fetch the most recent data.

Minimal code to reproduce the issue:

from fastmri.data.mri_data import AnnotatedSliceDataset
import os

data_root = 'path/to/fastMRI/dir'
annotated_slice_dataset = AnnotatedSliceDataset(
    root=os.path.join(data_root, 'multicoil_val'),
    challenge="multicoil",
    subsplit="knee",
    multiple_annotation_policy='all'
)
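
If the code is changed rather than the description, one option is to resolve None to a pinned version identifier before any annotation data is fetched. The following is only a minimal sketch of that idea: `LATEST_ANNOTATION_VERSION` and `resolve_annotation_version` are hypothetical names, not part of fastMRI, and treating "640500fb" as the latest version is an assumption (it is simply the only version named in this issue).

```python
from typing import Optional

# Hypothetical constant, not part of fastMRI. "640500fb" is the only version
# named in this issue; whether it is really the latest one is an assumption.
LATEST_ANNOTATION_VERSION = "640500fb"


def resolve_annotation_version(annotation_version: Optional[str] = None) -> str:
    """Return the version to fetch, falling back to the pinned latest one."""
    if annotation_version is None:
        return LATEST_ANNOTATION_VERSION
    return annotation_version
```

With such a helper, leaving the argument at its default would behave as the description promises, while an explicitly passed version is returned unchanged.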

2.: If we provide an annotation_version, another issue occurs:

  File "/home/nikolas/Documents/reconstruction/fastmri_bug/fastMRI/annotated_dataset_tests.py", line 6, in <module>
    annotated_slice_dataset = AnnotatedSliceDataset(
  File "/home/nikolas/Documents/reconstruction/fastmri_bug/fastMRI/fastmri/data/mri_data.py", line 510, in __init__
    annotation = self.get_annotation(False, rows)
  File "/home/nikolas/Documents/reconstruction/fastmri_bug/fastMRI/fastmri/data/mri_data.py", line 562, in get_annotation
    "x": int(row.x),
ValueError: cannot convert float NaN to integer

The problem is that some lines in the annotation dataset have NaN values for x, so int(NaN) raises this error. The last of the two lines below is an example:

file1001014,20,No,56,93,14,67,Ligament - MCL Low-Mod Grade Sprain
file1001022,0,Yes,,,,,artifact
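
One possible way to guard against such lines is to convert the numeric fields through a helper that maps empty or NaN values to a sentinel instead of calling int() directly. The sketch below uses only the standard library and the two sample lines above. The column names (other than x, which appears in the traceback), the helper names, and the choice of -1 as the sentinel are all assumptions made for illustration, not the library's actual behavior.

```python
import csv
import io
import math

# Assumed column names; only `x` is confirmed by the traceback above.
COLUMNS = ["file", "slice", "study_level", "x", "y", "width", "height", "label"]
NUMERIC_FIELDS = ("x", "y", "width", "height")


def to_int_or_default(value, default=-1):
    """Convert a field to int, mapping None, empty strings and NaN to `default`."""
    if value is None or value == "":
        return default
    number = float(value)
    if math.isnan(number):
        return default
    return int(number)


def parse_annotation(row):
    """Build an annotation dict; missing numeric fields become -1."""
    annotation = {
        "fname": row["file"],
        "slice": int(row["slice"]),
        "study_level": row["study_level"],
        "label": row["label"],
    }
    for field in NUMERIC_FIELDS:
        annotation[field] = to_int_or_default(row[field])
    return annotation


# The two sample lines from the issue.
sample = (
    "file1001014,20,No,56,93,14,67,Ligament - MCL Low-Mod Grade Sprain\n"
    "file1001022,0,Yes,,,,,artifact\n"
)
rows = list(csv.DictReader(io.StringIO(sample), fieldnames=COLUMNS))
annotations = [parse_annotation(row) for row in rows]
```

The helper also accepts a float NaN directly, which is what a pandas-parsed row would yield for an empty cell, so the same guard could be applied where int(row.x) currently fails.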

Minimal working example to reproduce the issue:

from fastmri.data.mri_data import AnnotatedSliceDataset
import os

data_root = 'path/to/fastMRI/dir'
annotated_slice_dataset = AnnotatedSliceDataset(
    root=os.path.join(data_root, 'multicoil_val'),
    challenge="multicoil",
    subsplit="knee",
    annotation_version="640500fb",
    multiple_annotation_policy='all'
)

mmuckley commented 1 year ago

Hello @NikolasMorshuis, thanks very much for raising these issues. I'm unable to work on it at the moment, but if you have some ideas for a quick fix, the code is here, and you'd be more than welcome to submit a PR.

CC also @Gaskell-1206.

NikolasMorshuis commented 1 year ago

Hi @mmuckley, thank you for your quick response. I have created a pull request that should fix the issues.

mmuckley commented 1 year ago

Fixed by PR #276.