NKI-AI / ahcore

Ahcore is the AI for Oncology core computational pathology toolkit
Apache License 2.0

improvement: reduce setup time of AbstractWriterCallback #88

Open YoniSchirris opened 5 months ago

YoniSchirris commented 5 months ago

Describe the bug
When running inference, AbstractWriterCallback loops over all datasets to construct the _dataset_sizes dict. This opens each slide from cache several times, and each open can take 1-3 seconds. For a dataset of 1500 WSIs, this often takes 20 minutes.

To Reproduce
Run inference on-the-fly (#87) with your data_dir and glob_pattern set up to find many whole-slide images.

Expected behavior
You'll find that after printing the dataset statistics, it takes a long time to start setting up callback workers.

In my case, roughly five minutes pass between the dataset statistics and the first worker start:

[2024-06-07 12:24:32,332][ahcore.data.dataset.DlupDataModule][INFO] - Dataset for stage predict has 773079 samples and the following statistics:
 - Mean: 485.30
 - Std: 145.56
 - Min: 48.00
 - Max: 1056.00
[2024-06-07 12:29:30,294][ahcore.callbacks.converters.common][INFO] - Starting worker for TiffConverterCallback
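The delay can be read directly off the two timestamps in the log above. A minimal sketch using only the Python standard library; the helper name `seconds_between` is hypothetical and not part of ahcore:

```python
from datetime import datetime

# Timestamp layout used in the log lines above, e.g. "2024-06-07 12:24:32,332".
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def seconds_between(start: str, end: str) -> float:
    """Return the elapsed time in seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


if __name__ == "__main__":
    # Statistics printed vs. first "Starting worker" line from the log above.
    gap = seconds_between("2024-06-07 12:24:32,332", "2024-06-07 12:29:30,294")
    print(f"Callback setup took about {gap / 60:.1f} minutes")
```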

Environment
dlup version: 0.3.38
How installed: unsure
Python version: 3.11.9
Operating System: linux

Quick solution: in https://github.com/NKI-AI/ahcore/blob/93274e5ed0859813011b81979367189a0b80a932/ahcore/callbacks/abstract_writer_callback.py#L181 change

assert current_dataset.slide_image.identifier
self._dataset_sizes[current_dataset.slide_image.identifier] = len(current_dataset)

to

current_dataset_slide_id = current_dataset.slide_image.identifier
assert current_dataset_slide_id
self._dataset_sizes[current_dataset_slide_id] = len(current_dataset)

which will likely reduce the time by half, since slide_image is then accessed once per dataset instead of twice.
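The effect of the change can be illustrated with a toy example. This is only a sketch: `FakeSlideDataset` is a hypothetical stand-in, not an ahcore or dlup class, and it assumes that every access to `slide_image` re-opens the slide, which is what the slowness described above suggests. The counter shows that the original code triggers two opens per dataset and the proposed code one.

```python
from types import SimpleNamespace


class FakeSlideDataset:
    """Hypothetical stand-in for a per-slide dataset (not ahcore's class)."""

    def __init__(self, identifier: str, n_tiles: int) -> None:
        self._identifier = identifier
        self._n_tiles = n_tiles
        self.open_count = 0  # how many times the "slide" was opened

    @property
    def slide_image(self) -> SimpleNamespace:
        # Assumption: each property access re-opens the slide (the slow step).
        self.open_count += 1
        return SimpleNamespace(identifier=self._identifier)

    def __len__(self) -> int:
        return self._n_tiles


def sizes_double_access(datasets) -> dict:
    """Original pattern: slide_image is accessed twice per dataset."""
    sizes = {}
    for current_dataset in datasets:
        assert current_dataset.slide_image.identifier
        sizes[current_dataset.slide_image.identifier] = len(current_dataset)
    return sizes


def sizes_single_access(datasets) -> dict:
    """Proposed pattern: the identifier is read once and reused."""
    sizes = {}
    for current_dataset in datasets:
        current_dataset_slide_id = current_dataset.slide_image.identifier
        assert current_dataset_slide_id
        sizes[current_dataset_slide_id] = len(current_dataset)
    return sizes


if __name__ == "__main__":
    before = [FakeSlideDataset(f"slide_{i}", 10 * (i + 1)) for i in range(3)]
    after = [FakeSlideDataset(f"slide_{i}", 10 * (i + 1)) for i in range(3)]
    sizes_double_access(before)
    sizes_single_access(after)
    print("opens per dataset (original):", [d.open_count for d in before])
    print("opens per dataset (proposed):", [d.open_count for d in after])
```

Both functions build the same dict; only the number of slide accesses differs.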

YoniSchirris commented 5 months ago

I thought about this a bit more:

If, in the future, we want to support identifier WITHIN this class, this can be considered a feature request that requires some more refactoring.

YoniSchirris commented 5 months ago

[2024-06-12 12:00:19,387][ahcore.data.dataset.DlupDataModule][INFO] - Dataset for stage predict has 773079 samples and the following statistics:
 - Mean: 485.30
 - Std: 145.56
 - Min: 48.00
 - Max: 1056.00
[2024-06-12 12:00:19,393][ahcore.callbacks.abstract_writer_callback][DEBUG] - Prediction epoch start
[2024-06-12 12:00:19,416][ahcore.callbacks.converters.common][INFO] - Starting worker for TiffConverterCallback
[2024-06-12 12:00:19,432][ahcore.callbacks.converters.common][INFO] - Starting worker for TiffConverterCallback
[2024-06-12 12:00:19,442][ahcore.callbacks.converters.common][DEBUG] - Workers started.
[2024-06-12 12:00:19,447][ahcore.callbacks.converters.common][INFO] - Starting worker for TiffConverterCallback

This fixes the slowness seen above. As soon as the dataset is loaded, the TIFF writer is ready to go and inference starts.

YoniSchirris commented 5 months ago

Fixed in #87, in this commit: https://github.com/NKI-AI/ahcore/pull/87/commits/b1f747e0c8e44b52eb88cb819e9bc4e53db97684