Project-MONAI / MONAI

AI Toolkit for Healthcare Imaging
https://monai.io/
Apache License 2.0

Load data repeatedly in class CSVIterableDataset from monai/data/iterable_dataset.py #7869

Open Colwzq opened 5 days ago

Describe the bug

In monai/data/iterable_dataset.py, CSVIterableDataset.__iter__ is currently:

def __iter__(self):
    if self.shuffle:
        self.seed += 1
        buffer = ShuffleBuffer(
            data=self._flattened(), transform=self.transform, buffer_size=self.buffer_size, seed=self.seed
        )
        yield from buffer
    yield from IterableDataset(data=self._flattened(), transform=self.transform)

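When shuffle is True, the generator does not stop after the shuffled pass. Execution falls through to the final yield from, so the whole dataset is loaded and yielded a second time, unshuffled. The second yield from should probably sit in an else branch: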

def __iter__(self):
    if self.shuffle:
        self.seed += 1
        buffer = ShuffleBuffer(
            data=self._flattened(), transform=self.transform, buffer_size=self.buffer_size, seed=self.seed
        )
        yield from buffer
    else:
        yield from IterableDataset(data=self._flattened(), transform=self.transform)
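
The fall-through can be shown without MONAI. The sketch below is only an illustration of the control flow: iterate_with_fallthrough and iterate_with_else are hypothetical stand-ins for the two versions of __iter__, and a plain list shuffled with random.Random replaces ShuffleBuffer and IterableDataset.

```python
import random


def iterate_with_fallthrough(items, shuffle, seed=0):
    """Hypothetical stand-in for the current __iter__: no else branch."""
    if shuffle:
        shuffled = list(items)
        random.Random(seed).shuffle(shuffled)
        yield from shuffled
    # Reached whether or not shuffle is set, so a shuffled pass
    # is followed by a second, unshuffled pass over the same items.
    yield from items


def iterate_with_else(items, shuffle, seed=0):
    """Hypothetical stand-in for the proposed __iter__: exactly one pass."""
    if shuffle:
        shuffled = list(items)
        random.Random(seed).shuffle(shuffled)
        yield from shuffled
    else:
        yield from items


rows = [0, 1, 2, 3]
buggy = list(iterate_with_fallthrough(rows, shuffle=True))
fixed = list(iterate_with_else(rows, shuffle=True))
print(len(buggy), len(fixed))  # 8 4
```

With shuffle=False both versions yield each item once, which is why the duplication only shows up when shuffling is enabled.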

If I have misunderstood the logic, please point it out. Thanks a lot.

Environment

Ensuring you use the relevant python executable, please paste the output of:

python -c "import monai; monai.config.print_debug_info()"
