BiaPyX / BiaPy

Open source Python library for building bioimage analysis pipelines
https://BiaPyX.github.io
MIT License

Possible Patch related errors for detection #50

Closed — ClementCaporal closed this issue 5 months ago

ClementCaporal commented 5 months ago

Hello !

I encountered some problems while working with patches.

I am working on a fix in my local branch. I propose to open a PR once #48 and #49 are merged. Is that OK?

danifranco commented 5 months ago

Hello!

ClementCaporal commented 5 months ago

Hello, thank you for your answers

I focused on two points proposed in this PR #53.

If you think it can be implemented differently or I didn't understand something please tell me!

Thank you,

Clément

danifranco commented 5 months ago

Hi,

Thank you for the implementation in the PR #53 .

If a border effect still exists even after setting a padding, due to the use of the blob_log or peak_local_max functions, we need to apply a solution like the one you said. By the way, did you try with just a larger padding? The "out-of-patch" solution you described seems to solve the issue as you exposed it, but in the PR I only see a fix to extend local coordinates to global ones. Did you forget to add something else?

The only problem I can think of is in removing the whole-image border effect. exclude_border is set to False in the lines you linked, but you mean that we need it set to True instead, right? That process is done patch by patch, not on the whole image, since the function called after whole-image reconstruction by chunks is after_merge_patches_by_chunks_proccess_patch, as you may know. There we work only with CSV files and not with the whole image, so we cannot apply the blob_log or peak_local_max functions. In fact, that is a huge advantage: we do not need to load the image into memory, so this approach scales to large images without problems.

Having said that, I would first try the already implemented post-processing, e.g. TEST.POST_PROCESSING.REMOVE_CLOSE_POINTS, and see how it works here. If the result is not what we want, I think we should implement, based on the point coordinates, what the exclude_border variable does inside those functions. That way we would fix the border effect in the whole image while also, very importantly, avoiding loading it into memory. What do you think?
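A coordinate-level version of exclude_border, applied to the point coordinates stored in the CSV files rather than to the image itself, could look roughly like this. This is a minimal sketch: the function name, the point representation, and the border width are assumptions for illustration, not BiaPy's actual API.

```python
# Hypothetical sketch of an exclude_border-style filter that works on point
# coordinates only, so the reconstructed image never has to be in memory.
# All names here are illustrative, not BiaPy's actual API.

def filter_border_points(points, image_shape, border):
    """Keep only points at least `border` voxels away from every image edge.

    points      : list of (z, y, x) tuples in global coordinates
    image_shape : (Z, Y, X) shape of the reconstructed image
    border      : width of the excluded margin, mimicking exclude_border
    """
    return [
        p for p in points
        if all(border <= c < s - border for c, s in zip(p, image_shape))
    ]

# A point on an edge is dropped, an interior one is kept:
points = [(0, 100, 100), (13, 256, 256), (25, 500, 511)]
print(filter_border_points(points, (26, 512, 512), border=2))
# [(13, 256, 256)]
```

Because the filter only touches coordinates, it scales to arbitrarily large images, in line with the CSV-based post-processing described above.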

Thank you for your work, Clement!

Dani

ClementCaporal commented 5 months ago

Hello,

Thank you for the review!

The above concerns the already proposed PR #53.


I think I will need to create a second PR to close this issue, regarding the multiple-GPU merge.

I tried to describe it here:

I will try to isolate the index parameter that is needed on this line

danifranco commented 5 months ago

Hi Clement,

I think we are not understanding each other (points 3 and 4). The border problem you described applies to the whole image, after the reconstruction. The exclude_border=True that you have set is applied patch by patch during the reconstruction, so you are not going to achieve what you want: you will end up with the scenario you described here:

""" ... (using exclude_border) but this doesn't solve the problem as for our case some cells in the middle of two patches might be undetected. ... """ If you select exclude border to True that will happen. That's why I described that you should do it in the whole image, and not when you are working in the whole image reconstruction process. But doing so led to the memory problematic as I described.

Another thing: maybe it is better to have a variable in conf.py to change the exclude_border value. What do you think?


Regarding the multi-GPU problem, try printing the slices each GPU is in charge of, so you can check whether there is some strange behaviour.

ClementCaporal commented 5 months ago

I agree that we don't understand each other; maybe we are even agreeing without knowing it.

I propose here my written understanding of the situation. Or maybe we could call each other?

The main issue for me is that blob_log creates false centroids at the edges of the array. For example, here the green cell on the left will be detected at the edge even though its center of mass is not on the edge. This leads to a high number of detections on the edges, as shown in the graph, and will bias our spatial analysis (i.e. if we find some cells suspiciously aligned, is it because they were on the edges of a chunk?). I will represent this wrongly detected cell in blue. Note that in my example only one wrong edge-cell artefact is shown, but from my experiments there are a LOT of false positive detections created by blob_log.

[image]

In the current pipeline this will lead to wrongly detected cells on all chunk edges. Even with the remove-close-points post-processing, I think we will still have those edge-cell artefacts.

[image]

So one solution to remove those blue points is to enable exclude_border in blob_log, but that creates an under-detected zone near the edges.

[image]

If we add padding to each chunk, it also helps with the edge-cell problem, because we can then remove the edge cells without creating under-detected zones (at least wherever there is another chunk next to it to take the padding from). This still leaves some edge-cell artefacts image-wise, but not chunk-wise.

[image]

Combining the two approaches (exclude border + padding), under-detected areas remain, but only at the border of the whole image. In our case this should not be a problem, as there are no real cells at the border of our image, only false positives.

[image]

That is why, in my understanding, there is no image-level operation to do: only chunk-level operations are needed to solve the problem without memory issues.
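The combined chunk-level strategy (pad each chunk, exclude detections in the pad margin, then shift the survivors to global coordinates) can be sketched at the coordinate level like this. Everything here is illustrative: `local_points` stands in for blob_log output on a padded chunk, and none of the names are BiaPy's actual API.

```python
# Illustrative sketch of the chunk-level fix discussed above: run detection
# on a padded chunk, drop detections that fall inside the pad margin (the
# role exclude_border would play), then translate local to global coords.

def keep_and_globalize(local_points, padded_shape, pad, chunk_origin):
    """local_points : (z, y, x) detections inside the padded chunk
    padded_shape    : shape of the chunk after padding
    pad             : (pz, py, px) padding added on each side
    chunk_origin    : global coordinate of the unpadded chunk's corner
    """
    out = []
    for p in local_points:
        # keep only detections in the chunk core, not in the pad margin
        if all(m <= c < s - m for c, s, m in zip(p, padded_shape, pad)):
            out.append(tuple(c - m + o for c, m, o in zip(p, pad, chunk_origin)))
    return out

# Chunk of core shape (18, 128, 128) padded by (4, 30, 30) on each side,
# whose unpadded corner sits at global z=8:
pts = [(3, 50, 50), (10, 60, 70)]  # the first point lies in the pad margin
print(keep_and_globalize(pts, (26, 188, 188), (4, 30, 30), (8, 0, 0)))
# [(14, 30, 40)]
```

Each chunk discards exactly the margin where its neighbour's core begins, so no cell falls through the gap between two chunks.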

Sorry for the long answer @danifranco, I needed this to be sure I understand the proposed solution. If you think a call would be a good way to discuss it, we can find a time!

Thank you again

danifranco commented 5 months ago

Wow, good problem representation, Clement! Now we are on the same page. Thank you for making the images.

Just as I commented, should we create a variable to control the exclude_border argument of the blob_log/peak_local_max functions? That way we can control its value through the YAML configuration file rather than having it always fixed to True.
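For illustration, such a switch might be exposed in the YAML configuration roughly like this (the key name and its location are assumptions for the sketch, not BiaPy's actual configuration schema):

```yaml
# Hypothetical key: controls the exclude_border argument passed to
# blob_log/peak_local_max when detecting points in each patch.
TEST:
  DET_EXCLUDE_BORDER: True
```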

Thank you so much again!!!

ClementCaporal commented 5 months ago

Yes, putting it in the configuration is a good idea! I will add it today, then. I should also check whether the documentation needs to be updated.

Do you think the detection padding should be the same as the inference padding? I think that is a good solution.

ClementCaporal commented 5 months ago

I added the parameter in PR #53, so the PR now implements the solution presented in the last drawings. I think it is ready to merge.


I will now try to understand the multi GPU behavior!

danifranco commented 5 months ago

Thank you so much Clement!

ClementCaporal commented 5 months ago

I think I found something concerning the multiple GPU behavior!

[image]

Before PR logs

[13:48:42.358947] ### 3D-OV-CROP ###
[13:43:06.294228] Cropping (1, 3, 26, 512, 512) images into (26, 128, 128, 3) with overlapping (axis order: TCZYX). . .
[13:43:06.294235] Minimum overlap selected: (0, 0, 0)
[13:43:06.294243] Padding: (4, 30, 30)
[13:43:06.294306] Real overlapping (%): (0.5555555555555556, 0.058823529411764705, 0.058823529411764705)
[13:43:06.294320] Real overlapping (pixels): (10.0, 4.0, 4.0)
[13:43:06.294327] (2, 8, 8) patches per (z,y,x) axis
[13:43:06.294344] List of volume IDs to be processed by each GPU: [[0], [1]]
[13:43:06.294353] Positions of each volume in Z axis: {0: [0, 8], 1: [8, 16]}
[13:43:06.294366] Rank 0: Total number of patches: 64 - (1, 8, 8) patches per (z,y,x) axis (per GPU)

The Positions of each volume in Z axis: {0: [0, 8], 1: [8, 16]} seems wrong to me: it should be {0: [0, 18], 1: [8, 26]}. If the image shape is (1, 3, 26, 512, 512), then there is an overlap of 10 pixels in the z dimension. This means z_vol_info passes wrong information to the rest of the pipeline, in particular here, when loading the inference results to compute the detections: https://github.com/ClementCaporal/BiaPy/blob/6aa291baa9bc5d7fb410454bfcea3a3da0c23604/biapy/engine/base_workflow.py#L981

So I propose in PR #54 to add the ovz_per_block term to fix this.
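The off-by-overlap bookkeeping can be sketched as follows. This is a minimal illustrative reconstruction, not BiaPy's actual code; the function names are made up.

```python
# Illustrative reconstruction of the bookkeeping bug: each GPU's volume
# spans patch_z slices in Z, but consecutive volumes start only
# step = patch_z - overlap_z slices apart. Dropping the overlap term
# (ovz_per_block in the fix) makes every volume end overlap_z slices
# too early.

def z_positions_buggy(num_vols, patch_z, overlap_z):
    step = patch_z - overlap_z
    return {i: [i * step, (i + 1) * step] for i in range(num_vols)}

def z_positions_fixed(num_vols, patch_z, overlap_z):
    step = patch_z - overlap_z
    return {i: [i * step, i * step + patch_z] for i in range(num_vols)}

# With the numbers from the logs (2 volumes of 18 z-slices, overlap 10,
# covering an image of 26 z-slices):
print(z_positions_buggy(2, 18, 10))   # {0: [0, 8], 1: [8, 16]}  (wrong)
print(z_positions_fixed(2, 18, 10))   # {0: [0, 18], 1: [8, 26]} (correct)
```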

After PR Logs:

[13:48:42.358947] ### 3D-OV-CROP ###
[13:48:42.358972] Cropping (1, 3, 26, 512, 512) images into (26, 128, 128, 3) with overlapping (axis order: TCZYX). . .
[13:48:42.358984] Minimum overlap selected: (0, 0, 0)
[13:48:42.358993] Padding: (4, 30, 30)
[13:48:42.359078] Real overlapping (%): (0.5555555555555556, 0.058823529411764705, 0.058823529411764705)
[13:48:42.359097] Real overlapping (pixels): (10.0, 4.0, 4.0)
[13:48:42.359108] (2, 8, 8) patches per (z,y,x) axis
[13:48:42.359134] List of volume IDs to be processed by each GPU: [[0], [1]]
[13:48:42.359152] Positions of each volume in Z axis: {0: [0, 18], 1: [8, 26]}
[13:48:42.359170] Rank 0: Total number of patches: 64 - (1, 8, 8) patches per (z,y,x) axis (per GPU)
danifranco commented 5 months ago

Perfect Clement! Thank you for finding the bug and fixing it!