MannLabs / py-lmd

https://mannlabs.github.io/py-lmd/
MIT License

Optimisation approaches for cutting many cells in close proximity while preserving membrane integrity #5

Open sophiamaedler opened 2 years ago

sophiamaedler commented 2 years ago

As discussed offline, it would be ideal to develop an optimisation approach that allows cells in close proximity to be cut while preserving membrane integrity. I have created this GitHub issue to discuss this in more detail with all interested parties (@fabsen-87, @josenimo, @lisaschweizer) to ensure that we are implementing a tool that meets everyone's needs.

Based on the feedback I have received so far, a first approach could be to selectively eliminate individual cells from densely segmented slide areas so that the membrane stays intact and we don't collect any "wrong" membrane areas. This would probably be implemented in an iterative fashion using proximity and area filters. We would of course lose some cells, but would in return be able to collect all other cells without the risk of contaminating or destroying the sample.
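To make the idea of an iterative proximity and area filter concrete, here is a minimal sketch. It is not part of py-lmd: the function name `filter_dense_cells`, its parameters, and the use of centroid distances as a stand-in for contour-to-contour distances are all assumptions made for illustration.

```python
import numpy as np


def filter_dense_cells(centroids, areas, min_distance, min_area=0.0):
    """Return indices of cells kept after area and proximity filtering.

    Hypothetical helper, not a py-lmd function. Centroid distance is used
    as a simple proxy for how close two contours are.
    """
    areas = np.asarray(areas, dtype=float)
    if areas.size == 0:
        return np.array([], dtype=int)
    centroids = np.asarray(centroids, dtype=float)

    # Area filter: discard shapes that are too small to cut reliably.
    keep = areas >= min_area

    # Pairwise centroid distances; self-distances are ignored.
    diff = centroids[:, None, :] - centroids[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)
    too_close = dist < min_distance

    while True:
        # For every kept cell, count the kept neighbours that are too close.
        conflicts = (too_close & keep[None, :]).sum(axis=1) * keep
        if conflicts.max() == 0:
            break
        # Drop the cell with the most conflicts, then re-evaluate.
        keep[np.argmax(conflicts)] = False

    return np.flatnonzero(keep)


# Three cells in a row plus one isolated cell: the middle one is dropped.
kept = filter_dense_cells(
    centroids=[(0, 0), (1, 0), (2, 0), (10, 10)],
    areas=[5, 5, 5, 5],
    min_distance=1.5,
)
print(kept.tolist())  # [0, 2, 3]
```

Dropping the most-conflicted cell first tends to keep more cells overall than dropping cells in arbitrary order, but it is a greedy heuristic and does not guarantee the largest possible set.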

One question that @GeorgWa and I had was at what point in the processing pipeline it would make the most sense, from your perspective, to implement such a filter: (1) when loading the segmentation mask or (2) when actually generating the cutting XML.

In addition, if you have any specific requirements that such a tool would need to fulfil, it would be great if you could briefly outline them here.

fabsen-87 commented 2 years ago

Hi Sophia,

Thanks for including us in the discussion.

Just one thought: it would be good to know the exact number of contours per sample after filtering, before going to the LMD. We need to make sure that when we upload the contours into the LMD software we are not surprised to find fewer contours left than we need or want per sample.

Best

Fabian


Dr. rer. nat. Fabian Coscia Spatial Proteomics Group Max Delbrück Center for Molecular Medicine Robert-Rössle-Straße 10 Building 31.2, Room 0221 13125 Berlin Germany


GeorgWa commented 2 years ago

Hi Fabian,

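There is already a function for shape collections called Collection.stats(). You could call it after loading a segmentation, similar to:

from lmd.lib import SegmentationLoader

# loader_config, segmentation, cell_sets and calibration_points
# are assumed to have been defined earlier.
sl = SegmentationLoader(config=loader_config, verbose=False)
shape_collection = sl(segmentation,
                      cell_sets,
                      calibration_points)

# Print the number of shapes and vertex statistics.
shape_collection.stats()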

It will give you information on the number of shapes and the number of vertices:

===== Collection Stats =====
Number of shapes: 7
Number of vertices: 4,913
============================
Mean vertices: 702
Min vertices: 599
5% percentile vertices: 617
Median vertices: 687
95% percentile vertices: 811
Max vertices: 839

I've used it so far to optimize the compression of shapes. Let me know if this is what you are looking for.

Best, Georg

fabsen-87 commented 2 years ago

Great, thanks Georg! We will try it out and report.

Best

Fabian


josenimo commented 2 years ago

Regarding the cutting strategy @sophiamaedler @GeorgWa, to me it makes more sense to implement the cutting path at the XML export step. I am not sure exactly what loading the segmentation mask means.

Is this implementation only changing the order the contours are cut, or are some contours just skipped for the sake of the integrity of the membrane?

Best, Jose

sophiamaedler commented 2 years ago

Hi @josenimo,

This implementation would skip some contours, so you would indeed lose some shapes. In our opinion this more radical approach is necessary, since optimising the cutting order alone will not ensure membrane integrity in all cases. For example, if you have a fully connected circle of cells, the enclosed middle area falls down and is incorrectly collected as soon as you cut the last cell. Optimising the order only delays this for as long as possible, and it is something we would like to avoid at all costs. Having fewer cells available is something users should be able to address by getting more input material, segmenting more cells, etc., whereas an incorrect collection could ruin an entire experiment. What are your thoughts on this?

By "loading the segmentation mask" I was referring to the very first step in the pipeline, where you import an array defining the areas of individual cells/contours so that they can be converted to an XML in later steps. The advantage of adjusting the shapes at this step so that membrane integrity is ensured is that you get feedback as early as possible on the maximum number of shapes available for cutting in your segmentation.

One concern we had with implementing this algorithm in the final export step is that the user suddenly has far fewer cells available than expected based on what was loaded and selected in previous steps, and might end up with unequally distributed classes. For example, you load your segmentation, look at the classes, and see that you have at least 500 cells of each type available, so you choose to export 500 of each class to your XML. Unluckily, class 1 is clustered much closer together, so we need to filter out more cells than in class 2 to preserve membrane integrity. You then actually end up with 350 cells in class 1 and 450 cells in class 2. Going back to optimise the cell selection so that you actually end up with 500 cells each could then be quite cumbersome, and at least in our application it is usually quite relevant to have balanced classes.

I would love to hear more of your thoughts on this issue though! Since the pipeline is quite flexible, I am sure it would also be easy to implement a more flexible solution if this brings a benefit to users.

Cheers, Sophia
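The class-imbalance concern can be illustrated with a small sketch that rebalances classes after filtering by subsampling every class down to the size of the smallest one. This is only an illustration: `balance_classes`, the `cell_classes` mapping, and the cell IDs are hypothetical and not part of py-lmd.

```python
import random


def balance_classes(cell_classes, kept_ids, seed=0):
    """Subsample kept cells so that every class has the same size.

    Hypothetical helper: cell_classes maps cell ID -> class label,
    kept_ids lists the cells that survived the proximity filter.
    """
    by_class = {}
    for cell_id in kept_ids:
        by_class.setdefault(cell_classes[cell_id], []).append(cell_id)

    # The smallest class determines how many cells each class can keep.
    n_per_class = min(len(ids) for ids in by_class.values())

    rng = random.Random(seed)  # fixed seed for a reproducible selection
    return {
        label: sorted(rng.sample(ids, n_per_class))
        for label, ids in by_class.items()
    }


# Class 1 lost more cells to filtering than class 2.
cell_classes = {0: "class 1", 1: "class 1", 2: "class 2", 3: "class 2", 4: "class 2"}
balanced = balance_classes(cell_classes, kept_ids=[0, 1, 2, 3, 4])
print({label: len(ids) for label, ids in balanced.items()})
# {'class 1': 2, 'class 2': 2}
```

Running the filter at load time means this kind of rebalancing can happen before any cells are chosen for export, rather than as a correction afterwards.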

josenimo commented 2 years ago

Hey @sophiamaedler,

This all makes sense now; I was a bit confused at first. I think that running the algorithm right after applying the segmentation mask makes the most sense. As you say, it is important to keep the sample number consistent and comparable, and the earlier the better. Would it be possible to ask the algorithm for a certain number of cells, so that it takes the cell positions into account when selecting them? I could imagine it would take some trial and error to arrive at an exact number of cells, although I guess I would just request more than needed and leave the rest behind. Just throwing ideas out there :D.
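As a rough sketch of what requesting a target number of cells could look like, the function below greedily picks cells that are at least a minimum distance away from every cell already picked and stops once the target is reached. The name `select_spaced_cells` and its parameters are hypothetical, and centroid distance is again only a proxy for contour proximity.

```python
import numpy as np


def select_spaced_cells(centroids, n_cells, min_distance):
    """Greedily pick up to n_cells indices with pairwise spacing >= min_distance.

    Hypothetical helper, not a py-lmd function. May return fewer than
    n_cells indices if not enough well-separated cells exist.
    """
    centroids = np.asarray(centroids, dtype=float)
    selected = []
    for idx, point in enumerate(centroids):
        if len(selected) == n_cells:
            break
        # Accept the cell only if it is far enough from all selected cells.
        if all(np.linalg.norm(point - centroids[j]) >= min_distance
               for j in selected):
            selected.append(idx)
    return selected


centroids = [(0, 0), (1, 0), (3, 0), (6, 0)]
print(select_spaced_cells(centroids, n_cells=2, min_distance=2))  # [0, 2]
```

If the returned list is shorter than the requested number, the caller knows immediately that the segmentation cannot supply that many safely cuttable cells, which avoids the trial and error.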

In my mind, having an exact number of cells for each group is not essential; our goal is to get enough cells in each group to observe their proteins. Thoughts, @fabsen-87? Best, Jose