TIO-IKIM / CellViT

CellViT: Vision Transformers for Precise Cell Segmentation and Classification
https://doi.org/10.1016/j.media.2024.103143

Tiled prediction does not return proper results for all tiles #30

Closed JLrumberger closed 8 months ago

JLrumberger commented 8 months ago

Describe the bug
I ran tiled prediction on a TMA slide with the CellViT-256-x40.pth model, and the predictions for some tiles are missing completely. Here is the output, with all cell detections highlighted in yellow: cellvit_results

The tissue masks look reasonable (see below): tissue_grid

I looked into the generated patches and it seems that the patches are indeed missing, even though they show up on the tissue mask.

To Reproduce
Steps to reproduce the behavior:

  1. Command

    python preprocessing\patch_extraction\main_extraction.py --wsi_paths "E:\data\he_data" --output_path "E:\CellViT\output" --wsi_extension ndpi --patch_size 1024 --patch_overlap 6.25

    python cell_segmentation\inference\cell_detection.py --model "E:\CellViT\models\pretrained\CellViT-256-x40.pth" --magnification 40 --gpu 0 --batch_size 4 --geojson process_wsi --wsi_path "E:\data\he_data\slide.ndpi" --patched_slide_path "E:\CellViT\output\slide"

  2. File
     I cannot share the original file, but this error might be reproducible with other TMA slides.

  3. Error Traceback
     No error

FabianHoerst commented 8 months ago

Could you try extracting all patches and check whether they are present, using this command:

    python preprocessing\patch_extraction\main_extraction.py --wsi_paths "E:\data\he_data" --output_path "E:\CellViT\output" --wsi_extension ndpi --patch_size 1024 --patch_overlap 6.25 --min_intersection_ratio 0

Then rerun the inference. I am not sure where the error originates, as we have not observed similar behavior before. We have also rarely tested .ndpi slides, so that could be a cause.

Are you able to anonymize and share the log file with me?

JLrumberger commented 8 months ago

It works with --min_intersection_ratio 0, but then it extracts way too many patches (4200 instead of ~1500), and the runtime would be too high for my use case. When I set --min_intersection_ratio 0.0001 or just use the default, it extracts patches (without tissue) that are not highlighted in tissue_masks/tissue_grid.png, so it looks like there is some confusion in the patch coordinates. I can share the logfile with you. Do you want the logfile from --min_intersection_ratio 0 or --min_intersection_ratio 0.0001?
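For readers unfamiliar with the flag, the behavior described above can be pictured with a small sketch. It assumes that --min_intersection_ratio is a threshold on the fraction of a tile covered by the tissue mask; that reading, the function names, and the toy mask are illustrative assumptions and not CellViT's actual code.

```python
def tissue_fraction(mask, x, y, size):
    """Fraction of the size x size tile at (x, y) covered by tissue.

    `mask` is a binary tissue mask given as a list of rows (1 = tissue).
    """
    covered = sum(mask[r][c] for r in range(y, y + size) for c in range(x, x + size))
    return covered / (size * size)


def select_tiles(mask, size, min_ratio):
    """Return the (x, y) origins of all tiles whose tissue fraction is >= min_ratio."""
    height, width = len(mask), len(mask[0])
    kept = []
    for y in range(0, height - size + 1, size):
        for x in range(0, width - size + 1, size):
            if tissue_fraction(mask, x, y, size) >= min_ratio:
                kept.append((x, y))
    return kept


# Toy 4x4 mask split into four 2x2 tiles; two of them contain tissue.
toy_mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
all_tiles = select_tiles(toy_mask, 2, 0)          # a ratio of 0 keeps every tile
tissue_tiles = select_tiles(toy_mask, 2, 0.0001)  # a tiny ratio keeps any tile with tissue
```

Under this assumption, a ratio of 0 keeps background tiles as well, which would be consistent with the jump from about 1500 to 4200 patches.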

FabianHoerst commented 8 months ago

Ok that seems strange, I am going to test this with an example .ndpi file. Please send me the file with 0.0001

JLrumberger commented 8 months ago

logfile with --log_level debug

2023-11-21 16:40:13,167 [DEBUG] - Parsed CLI without errors. Logger instantiated.
2023-11-21 16:40:13,173 [DEBUG] - Stored config under: E:\CellViT\output\config.yaml
2023-11-21 16:40:13,176 [INFO] - Using OpenSlide
2023-11-21 16:40:13,177 [INFO] - Data store directory: E:\CellViT\output
2023-11-21 16:40:13,177 [INFO] - Images found: 1
2023-11-21 16:40:13,178 [INFO] - Annotations found: 0
2023-11-21 16:40:13,178 [INFO] - Empty output folder. Processing all files
2023-11-21 16:40:13,182 [INFO] - *************************************************************************************************************************************************************************************************************************************************************
2023-11-21 16:40:13,182 [INFO] - 1/1: slide.ndpi
2023-11-21 16:40:13,183 [INFO] - Computing patches for slide.ndpi
2023-11-21 16:40:13,244 [INFO] - Generate thumbnails
2023-11-21 16:40:13,245 [DEBUG] - Save thumbnails of image at different scales: [32, 64, 128]
2023-11-21 16:40:18,502 [WARNING] - The given patch size 960 is not a power of 2. For a better background selection please consider using a power of 2.
2023-11-21 16:40:21,203 [INFO] - slide.ndpi: Processing 1566 patches.
2023-11-21 16:40:27,995 [INFO] - Start extracting patches...
2023-11-21 16:40:27,996 [DEBUG] - Started process MainProcess
2023-11-21 16:40:45,193 [DEBUG] - Padding Tile
2023-11-21 16:40:50,013 [DEBUG] - Padding Tile
2023-11-21 16:40:53,822 [DEBUG] - Padding Tile
2023-11-21 16:40:56,828 [DEBUG] - Padding Tile
2023-11-21 16:40:59,125 [DEBUG] - Padding Tile
2023-11-21 16:41:00,203 [DEBUG] - Padding Tile
2023-11-21 16:41:00,950 [DEBUG] - Padding Tile
2023-11-21 16:41:16,909 [DEBUG] - Padding Tile
2023-11-21 16:41:19,616 [DEBUG] - Padding Tile
2023-11-21 16:41:22,131 [DEBUG] - Padding Tile
2023-11-21 16:41:24,361 [DEBUG] - Padding Tile
2023-11-21 16:41:26,314 [DEBUG] - Padding Tile
2023-11-21 16:42:38,377 [DEBUG] - Padding Tile
2023-11-21 16:42:38,438 [DEBUG] - Padding Tile
2023-11-21 16:42:38,495 [DEBUG] - Padding Tile
2023-11-21 16:42:38,541 [DEBUG] - Padding Tile
2023-11-21 16:42:38,595 [DEBUG] - Padding Tile
2023-11-21 16:42:38,707 [DEBUG] - Padding Tile
2023-11-21 16:42:38,759 [DEBUG] - Padding Tile
2023-11-21 16:42:38,813 [DEBUG] - Padding Tile
2023-11-21 16:42:40,346 [INFO] - Finished Processing and Storing. Took:
2023-11-21 16:42:40,347 [INFO] - Time usage: 0:02:12.109010
2023-11-21 16:42:40,357 [INFO] - Total patches sampled: 1566
2023-11-21 16:42:40,459 [INFO] - Patches saved to: E:\CellViT\output
2023-11-21 16:42:40,460 [INFO] - Total patches sampled for all WSI: 1566
2023-11-21 16:42:40,460 [INFO] - Time usage: 0:02:27.279251
2023-11-21 16:42:40,482 [INFO] - Finished Preprocessing.
FabianHoerst commented 8 months ago

To verify whether the coordinate grid is shifted, you could take one of your extracted tiles and check whether it matches the grid.
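One way to do this check by hand is to compute where a tile should sit from its row and column index and compare that with where its content appears on the thumbnail. The sketch below assumes a regular grid whose stride is the patch size minus the overlap in pixels, and that the tile's row and column are known (for example from its file name). Both are assumptions for illustration, not CellViT's documented behavior, and the function names are hypothetical.

```python
def expected_origin(row, col, patch_size, overlap_px):
    """Top-left pixel (x, y) of a tile on a regular grid at full resolution."""
    stride = patch_size - overlap_px  # assumed distance between neighboring tiles
    return col * stride, row * stride


def to_thumbnail(xy, downsample):
    """Map full-resolution coordinates to a thumbnail downsampled by `downsample`."""
    x, y = xy
    return x / downsample, y / downsample


# Tile in row 2, column 3 with 1024 px patches and a 64 px overlap.
origin = expected_origin(2, 3, 1024, 64)
thumb_xy = to_thumbnail(origin, 32)
```

If the tile content shows up somewhere other than `thumb_xy` on the thumbnail, the grid and the extraction disagree for that tile.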

FabianHoerst commented 8 months ago

I can confirm that there is a coordinate shift between image and grid. I am going to fix this in the next few days. Could you please check whether the coordinate shift disappears for your files with the following setting: python preprocessing\patch_extraction\main_extraction.py --wsi_paths "E:\data\he_data" --output_path "E:\CellViT\output" --wsi_extension ndpi --patch_size 1024 --patch_overlap 0 --min_intersection_ratio 0.001 --target_mag 2.5

JLrumberger commented 8 months ago

The extracted tiles only match the grid at the beginning (top-left). I added the respective patch suffixes for two TMA cores: patch_numbering

FabianHoerst commented 8 months ago

Yes, definitely a mismatch between the tissue detection grid and patch extraction on the requested level.

JLrumberger commented 8 months ago

> I can confirm that there is a coordinate shift between image and grid. I am going to fix this in the next few days. Could you please check whether the coordinate shift disappears for your files with the following setting: python preprocessing\patch_extraction\main_extraction.py --wsi_paths "E:\data\he_data" --output_path "E:\CellViT\output" --wsi_extension ndpi --patch_size 1024 --patch_overlap 0 --min_intersection_ratio 0.001 --target_mag 2.5

This looks good. So it's the overlap that messes it up?

FabianHoerst commented 8 months ago

I am not sure yet. Possible causes are the overlap, the combination of overlap and a patch size that is not a power of 2 (1024-2*32=960), or just that the requested size is 960 (see the warning). If you use an overlap that results in a patch size that is a power of 2, the warning should disappear and the grid should match.

JLrumberger commented 8 months ago

Actually, this warning is weird. My command is python preprocessing\patch_extraction\main_extraction.py --wsi_paths "E:\data\he_data" --output_path "E:\CellViT\output" --wsi_extension ndpi --patch_size 1024 --patch_overlap 6.25 --min_intersection_ratio 0.0001, the extracted patches are 1024 x 1024, and the overlap is 1024*6.25%=64, but I still get the warning:

2023-11-21 17:30:19,448 [WARNING] - The given patch size 960 is not a power of 2. For a better background selection please consider using a power of 2.
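The 960 in the warning can be reproduced with a little arithmetic. This is a minimal sketch, assuming the overlap is given as a percentage of the patch size and is subtracted from the patch size before the power-of-two check. The function names are hypothetical and do not come from the CellViT code.

```python
def overlap_pixels(patch_size: int, overlap_percent: float) -> int:
    """Convert an overlap given in percent of the patch size to pixels."""
    return int(patch_size * overlap_percent / 100)


def is_power_of_two(n: int) -> bool:
    """True if n is a positive power of two (exactly one bit set)."""
    return n > 0 and (n & (n - 1)) == 0


patch_size = 1024
overlap_px = overlap_pixels(patch_size, 6.25)  # 64 px in total, i.e. 32 px per side
reduced_size = patch_size - overlap_px         # 960, the value named in the warning

print(overlap_px, reduced_size, is_power_of_two(reduced_size), is_power_of_two(patch_size))
# prints: 64 960 False True
```

So the warning refers to the size after the overlap has been subtracted, not to the 1024 passed on the command line.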

JLrumberger commented 8 months ago

https://github.com/TIO-IKIM/CellViT/blob/efa408e5f9af3e7242fdcf95ca73c6cd0dbe7384/preprocessing/patch_extraction/src/utils/patch_util.py#L108

Changing this line to

return patch_size, overlap

looks like it does the trick. I'll run predictions to confirm, but the extracted patches match the grid shown in the tissue_mask file.

FabianHoerst commented 8 months ago

Thanks for your input and contribution! In the meantime, I am going to check what happens if no overlap is used.

JLrumberger commented 8 months ago

By the way, there was another bug that I fixed earlier:

https://github.com/TIO-IKIM/CellViT/blob/efa408e5f9af3e7242fdcf95ca73c6cd0dbe7384/cell_segmentation/inference/cell_detection.py#L402

This line threw an error: idx is the index from an enumerate statement, so it is an integer and does not have a shape.
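The failure mode is easy to reproduce in isolation. The snippet below is a generic illustration and not the line from cell_detection.py; the names are hypothetical.

```python
def index_has_shape(items):
    """Check whether the counter yielded by enumerate() has a .shape attribute."""
    results = []
    for idx, item in enumerate(items):
        # idx is a plain int (0, 1, 2, ...); only the item could carry a shape.
        results.append(hasattr(idx, "shape"))
    return results


def access_index_shape(items):
    """Return the error message raised when idx.shape is accessed."""
    for idx, _ in enumerate(items):
        try:
            return idx.shape
        except AttributeError as err:
            return str(err)
    return None


flags = index_has_shape([[1, 2], [3]])
message = access_index_shape([[1, 2], [3]])
```

Here `flags` contains only `False` values, and `message` holds the AttributeError text for `int`.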

Well, thank you for making the model and library easily accessible :)

FabianHoerst commented 8 months ago

Thanks for fixing. I guess the print statement was a leftover from debugging that I forgot to remove.

FabianHoerst commented 8 months ago

> https://github.com/TIO-IKIM/CellViT/blob/efa408e5f9af3e7242fdcf95ca73c6cd0dbe7384/preprocessing/patch_extraction/src/utils/patch_util.py#L108
>
> Changing this line to
>
> return patch_size, overlap
>
> looks like it does the trick. I'll run predictions to confirm, but the extracted patches match the grid shown in the tissue_mask file.

This is not the fix, as the resulting tiles now have a patch-size of 1088 pixels (1024+2xoverlap = 1024+2x32=1088)
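The two sizes that appear in this thread, 960 and 1088, follow from two different ways of accounting for the same 32 px margin per side. A small sketch with hypothetical helper names, meant only to make the arithmetic explicit:

```python
def core_size_within_tile(tile_size: int, margin_px: int) -> int:
    """The tile stays at tile_size; the margin is taken out of it on both sides."""
    return tile_size - 2 * margin_px


def tile_size_around_core(core_size: int, margin_px: int) -> int:
    """The core stays at core_size; the margin is added on both sides."""
    return core_size + 2 * margin_px


shrunk = core_size_within_tile(1024, 32)  # 960, as in the warning
grown = tile_size_around_core(1024, 32)   # 1088, as reported for the changed return value
```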

JLrumberger commented 8 months ago

> > https://github.com/TIO-IKIM/CellViT/blob/efa408e5f9af3e7242fdcf95ca73c6cd0dbe7384/preprocessing/patch_extraction/src/utils/patch_util.py#L108
> >
> > Changing this line to
> >
> > return patch_size, overlap
> >
> > looks like it does the trick. I'll run predictions to confirm, but the extracted patches match the grid shown in the tissue_mask file.
>
> This is not the fix, as the resulting tiles now have a patch-size of 1088 pixels (1024+2xoverlap = 1024+2x32=1088)

On my side they have a patch-size of 1024 x 1024. Could it be the openslide version or some other dependency that causes this difference in behavior?

FabianHoerst commented 8 months ago

Could you try this fix? Change

https://github.com/TIO-IKIM/CellViT/blob/efa408e5f9af3e7242fdcf95ca73c6cd0dbe7384/preprocessing/patch_extraction/src/utils/patch_util.py#L416

to

    downsample_tile_size = downsample_patch_size*1

Please remember to revert your earlier change to this line: https://github.com/TIO-IKIM/CellViT/blob/efa408e5f9af3e7242fdcf95ca73c6cd0dbe7384/preprocessing/patch_extraction/src/utils/patch_util.py#L108

FabianHoerst commented 8 months ago

> > > https://github.com/TIO-IKIM/CellViT/blob/efa408e5f9af3e7242fdcf95ca73c6cd0dbe7384/preprocessing/patch_extraction/src/utils/patch_util.py#L108
> > >
> > > Changing this line to
> > >
> > > return patch_size, overlap
> > >
> > > looks like it does the trick. I'll run predictions to confirm, but the extracted patches match the grid shown in the tissue_mask file.
> >
> > This is not the fix, as the resulting tiles now have a patch-size of 1088 pixels (1024+2xoverlap = 1024+2x32=1088)
>
> On my side they have a patch-size of 1024 x 1024. Could it be the openslide version or some other dependency that causes this difference in behavior?

Are you using openslide or cucim?

We are using openslide 3.4.1

JLrumberger commented 8 months ago

> > https://github.com/TIO-IKIM/CellViT/blob/efa408e5f9af3e7242fdcf95ca73c6cd0dbe7384/preprocessing/patch_extraction/src/utils/patch_util.py#L108
> >
> > Changing this line to
> >
> > return patch_size, overlap
> >
> > looks like it does the trick. I'll run predictions to confirm, but the extracted patches match the grid shown in the tissue_mask file.
>
> This is not the fix, as the resulting tiles now have a patch-size of 1088 pixels (1024+2xoverlap = 1024+2x32=1088)

On my side they have a patch-size of 1024 x 1024. Could it be the openslide version or some other dependency that causes this difference in behavior?

Anyway, looking at the predictions, it predicted the right patches but stored them at the wrong positions:

image

I also use Openslide 3.4.1. Now, I'll try your fix.

FabianHoerst commented 8 months ago

I think this needs a deeper investigation; I hope I can allocate some more time at the end of this week. Another possible cause is that the patches at the border are padded, which could result in a coordinate offset for the grid.

JLrumberger commented 8 months ago

> Could you maybe try this fix? Changing
>
> https://github.com/TIO-IKIM/CellViT/blob/efa408e5f9af3e7242fdcf95ca73c6cd0dbe7384/preprocessing/patch_extraction/src/utils/patch_util.py#L416
>
> to downsample_tile_size =downsample_patch_size*1 Please keep in mind to change back:
>
> https://github.com/TIO-IKIM/CellViT/blob/efa408e5f9af3e7242fdcf95ca73c6cd0dbe7384/preprocessing/patch_extraction/src/utils/patch_util.py#L108

This fixed the issue for me: image

Thanks a ton!

FabianHoerst commented 8 months ago

I narrowed down the problem. On the one hand, there is the fix I provided previously. On the other hand, the following ceiling operations lead to an accumulation of pixel errors, which might result in another shift if many patches are extracted: https://github.com/TIO-IKIM/CellViT/blob/efa408e5f9af3e7242fdcf95ca73c6cd0dbe7384/preprocessing/patch_extraction/src/utils/patch_util.py#L430-L432

In one of my examples, the offset was 0.875 px # 0.4 px overlap for each slide, resulting in an additional shift of 50 pixels on the bottom right corner.
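The accumulation effect can be illustrated with a toy calculation. This is only a sketch under assumptions: it supposes that a fractional stride is rounded up with a ceiling before being applied to every tile, so each tile adds the same small error. The stride of 60.125 px and the tile index are made-up values, chosen so that the per-tile error equals the 0.875 px mentioned above. Rounding the exact position once per tile is shown for contrast and is not necessarily the fix used in the repository.

```python
import math


def positions_ceil_stride(true_stride: float, n_tiles: int) -> list:
    """Tile origins when the stride is ceiled once and then multiplied up."""
    stride = math.ceil(true_stride)
    return [i * stride for i in range(n_tiles)]


def positions_round_late(true_stride: float, n_tiles: int) -> list:
    """Tile origins when the exact position is computed first and rounded last."""
    return [round(i * true_stride) for i in range(n_tiles)]


true_stride = 60.125  # hypothetical fractional stride at the downsampled level
index = 57            # hypothetical tile index near the bottom-right corner

ceiled = positions_ceil_stride(true_stride, index + 1)
rounded = positions_round_late(true_stride, index + 1)

drift_ceiled = ceiled[index] - index * true_stride         # 57 * 0.875 = 49.875 px
drift_rounded = abs(rounded[index] - index * true_stride)  # stays within half a pixel
```

With these numbers the ceiled grid is off by about 50 px at tile 57, while the late-rounded grid never drifts by more than half a pixel.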

I am fixing the issue and will close the thread once everything is pushed.