fiji / Colocalisation_Analysis

Fiji's plugin for colocalization analysis
http://imagej.net/Coloc_2
GNU General Public License v3.0

Adding a Manual Threshold Mode #65

Open lacan opened 7 years ago

lacan commented 7 years ago

Hi, I am forking the repo and aim to add the option of applying a manual threshold to the images. The auto-threshold method is not useful in a lot of coloc experiments, and it fails for negative controls or conditions where colocalization is not expected. That makes sense, as it keeps looking for correlation until there is none left below the thresholds, but when there is either constant correlation or no correlation at all, the resulting thresholds can be either too high or too low.

Also, Manders coefficients and other metrics not related to correlation still yield very useful information when looking at co-occurrence rather than correlation, but they can be hard to interpret because the thresholds found for different images may differ widely.
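To make the dependence on thresholds concrete, here is a minimal plain-Java sketch of thresholded Manders coefficients computed with user-supplied thresholds. It uses one common definition (the fraction of one channel's intensity found in pixels where the other channel exceeds its threshold), which is an assumption and not necessarily the exact definition Coloc 2 implements. The class name, method name, and pixel values are hypothetical.

```java
// Hypothetical sketch: thresholded Manders coefficients with manual thresholds.
// Assumes both channels are flattened to arrays of equal length.
class ManualThresholdManders {

    /**
     * Fraction of the total intensity of channel a that lies in pixels
     * where channel b is strictly above thresholdB.
     */
    static double coefficient(double[] a, double[] b, double thresholdB) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Channels must have the same size");
        }
        double total = 0;
        double cooccurring = 0;
        for (int i = 0; i < a.length; i++) {
            total += a[i];
            if (b[i] > thresholdB) {
                cooccurring += a[i];
            }
        }
        // An empty (all-zero) channel has no signal to co-occur.
        return total == 0 ? 0 : cooccurring / total;
    }

    public static void main(String[] args) {
        double[] ch1 = {10, 50, 200, 0};
        double[] ch2 = {5, 120, 90, 300};
        double threshold1 = 40;   // manual threshold for channel 1 (made-up value)
        double threshold2 = 100;  // manual threshold for channel 2 (made-up value)

        double tM1 = coefficient(ch1, ch2, threshold2);
        double tM2 = coefficient(ch2, ch1, threshold1);
        System.out.printf("tM1 = %.4f, tM2 = %.4f%n", tM1, tM2);
    }
}
```

Changing either threshold changes which pixels count as co-occurring, which is why coefficients from images with very different thresholds are hard to compare.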

Another thing we have tested is the use of independent auto-thresholds for each image (from ImageJ's auto-threshold list).

For example, suppose we want to check the association between some vesicle marker and a protein of interest. Only some vesicles might contain the protein, so correlation is neither very interesting nor warranted here, but Manders can be helpful (provided other conditions beyond the scope of this example are met). Due to expression changes, microscope component stability, sample age or timepoint, a fixed threshold would not yield acceptable results, but we found that a threshold that adjusts itself helps narrow the error margins.
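As an illustration of a threshold that adjusts itself per channel, the sketch below applies Otsu's method independently to the intensity histogram of each channel. Otsu is one of the methods in ImageJ's auto-threshold list, but this is a self-contained plain-Java version written for illustration, not ImageJ's implementation; the class name and the toy 8-bin histograms are hypothetical.

```java
// Hypothetical sketch: an independent Otsu threshold for each channel.
class PerChannelOtsu {

    /**
     * Returns the histogram bin t that maximizes the between-class variance
     * when bins 0..t are treated as background and bins t+1.. as foreground.
     */
    static int otsu(int[] hist) {
        long total = 0;
        double sumAll = 0;
        for (int i = 0; i < hist.length; i++) {
            total += hist[i];
            sumAll += (double) i * hist[i];
        }
        long weightBg = 0;
        double sumBg = 0;
        double bestVariance = -1;
        int bestThreshold = 0;
        for (int t = 0; t < hist.length; t++) {
            weightBg += hist[t];
            if (weightBg == 0) continue;      // no background pixels yet
            long weightFg = total - weightBg;
            if (weightFg == 0) break;         // no foreground pixels left
            sumBg += (double) t * hist[t];
            double meanBg = sumBg / weightBg;
            double meanFg = (sumAll - sumBg) / weightFg;
            double diff = meanBg - meanFg;
            double variance = (double) weightBg * weightFg * diff * diff;
            if (variance > bestVariance) {    // strict: keep the lowest bin on ties
                bestVariance = variance;
                bestThreshold = t;
            }
        }
        return bestThreshold;
    }

    public static void main(String[] args) {
        // Toy histograms: each channel gets its own threshold.
        int[] histCh1 = {10, 10, 0, 0, 0, 0, 10, 10};
        int[] histCh2 = {10, 10, 10, 0, 0, 0, 0, 10};
        System.out.println("Channel 1 threshold bin: " + otsu(histCh1));
        System.out.println("Channel 2 threshold bin: " + otsu(histCh2));
    }
}
```

Because each threshold is derived from that channel's own histogram, it follows changes in expression level or illumination instead of staying fixed.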

That being said, I'm looking at how to get this to work.

Looking at the current implementation, all algorithms that use a threshold obtain it with AutoThresholdRegression<T> autoThreshold = container.getAutoThreshold(); https://github.com/fiji/Colocalisation_Analysis/blob/master/src/main/java/sc/fiji/coloc/algorithms/MandersColocalization.java#L157

So I'd need a way to store the threshold values somewhere, which would mean making a static function, but that is not the Ops way...

So I am thinking of setting a threshold on the images (whatever it may be) externally from Coloc2 and have Coloc2 just apply that threshold if it is present. Does that make sense?

So in order to do that, I should threshold the image, create a 'mask', which when fed into coloc 2, should give me the desired result?

chalkie666 commented 7 years ago

@lacan I will leave implementation details to others more tuned into that, but discuss the general concept here.

I agree that allowing the user to choose any one of the many autothreshold methods already implemented in IJ is a good thing to have. I thought about that in the past but we never got around to implementing it.

I worry about the danger of allowing arbitrarily chosen manual thresholds. This could very easily lead a naive user, who has not carefully chosen the thresholds by some objective, reproducible means, into believing nonsense output is meaningful. I prefer the idea of leading people in a more robust direction by making it easiest to do things reproducibly and objectively, and hardest to cherry-pick and game the maths to produce a desired result.

chalkie666 commented 7 years ago

@lacan In short, why not just add the IJ autothreshold methods to the list of thresholding methods to select from? We would need to relabel Costes and bisection to indicate that they are both 2-channel, correlation-search-based Costes autothresholds and that bisection is just an accelerated version of the same, then have the built-in single-channel IJ autothresholds in a subsection of intensity histogram segmentation methods. We might need a switch to choose between 2-channel correlation-based autothresholds and 1-channel intensity histogram split methods. We might also need a different 1-channel autothreshold method for each of the two channels?

lacan commented 7 years ago

@chalkie666 Thank you for the discussion and the feedback. While I agree that manually setting thresholds can be very dangerous, I think that the option should still be made available, though perhaps not as easily as selecting the auto-threshold method. I feel there is as much risk with the other methods; if the selected auto-method fails for whatever reason (the conditions are too different and the base hypotheses for a certain algorithm no longer hold), the data can also be meaningless. Assessing the resulting masks for each channel and the fluorogram is important no matter what. With coloc2, on most of the datasets that come through, we almost systematically get a warning about a low intercept or something wrong with the auto-threshold method, which pushes us back to JACoP. In our facility, we tend to follow the users from sample prep to acquisition to data analysis and iterate several times. So I can more or less conclude that there is a non-negligible number of people who are getting these warnings and simply ignoring them, so how is this different from having a manual threshold option?

We cannot hope to fool-proof the entire program, as coloc data is rather tricky to interpret and there are plenty of ways to perform it. I sincerely believe that with proper controls on the sample and the microscope, hard thresholds can make sense, as much as other methods.

We can warn people and train them, but I'd still prefer having the most flexible tool, even if it risks providing skewed data to the untrained user, over a big red "Analyse" button any day (though that's because it's my job 😝 )

Thanks again for taking the time!

chalkie666 commented 7 years ago

@lacan Yes, I basically agree with what you say. So long as we make an effort to lead people away from bad practice towards better practice... make the more robust methods the sensible defaults, but also allow the freedom to go close to the cliff, then I think we are doing it right.

Regarding the warnings... they are just warnings, not errors... perhaps we should tone them down a bit so they are not so scary... Would that help? E.g. the intercept thing just shows you that the images have different zero offsets and/or background, which is good to know, as it affects the linearity of the info... but it doesn't affect e.g. Pearson's correlation.
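The point about the intercept can be checked numerically. The sketch below, with hypothetical names and made-up pixel values, adds a constant background offset to one channel and compares Pearson's r and the least-squares intercept before and after. Under these assumptions r stays the same, because both the covariance and the variances are computed from deviations about the mean, while the intercept shifts by the offset.

```java
// Hypothetical sketch: a constant offset changes the regression intercept
// but not Pearson's correlation coefficient.
class PearsonOffsetDemo {

    static double mean(double[] v) {
        double sum = 0;
        for (double x : v) sum += x;
        return sum / v.length;
    }

    /** Pearson's r for two equally sized arrays. */
    static double pearson(double[] x, double[] y) {
        double mx = mean(x), my = mean(y);
        double cov = 0, varX = 0, varY = 0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - mx, dy = y[i] - my;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    /** Intercept of the least-squares line y = slope * x + intercept. */
    static double intercept(double[] x, double[] y) {
        double mx = mean(x), my = mean(y);
        double cov = 0, varX = 0;
        for (int i = 0; i < x.length; i++) {
            cov += (x[i] - mx) * (y[i] - my);
            varX += (x[i] - mx) * (x[i] - mx);
        }
        return my - (cov / varX) * mx;
    }

    public static void main(String[] args) {
        double[] ch1 = {1, 2, 3, 4, 5};
        double[] ch2 = {2, 4, 5, 4, 5};
        double[] ch2WithBackground = new double[ch2.length];
        for (int i = 0; i < ch2.length; i++) {
            ch2WithBackground[i] = ch2[i] + 100; // constant background offset
        }
        System.out.printf("r: %.6f vs %.6f%n",
                pearson(ch1, ch2), pearson(ch1, ch2WithBackground));
        System.out.printf("intercept: %.3f vs %.3f%n",
                intercept(ch1, ch2), intercept(ch1, ch2WithBackground));
    }
}
```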

We aim to educate the user a little so they are aware of the limitations of the algorithms (e.g. looking for correlation where there is none). I expect it's very possible to improve the user feedback in these cases, to help them understand what is going on, e.g. when Costes' auto threshold fails... it can actually be a sign that indeed there is no correlation. Currently it looks like an error, but really it's just a negative result ;-)

It's great you are interested in this, and I will support you in your dev efforts from the theoretical end, even if I can't do much coding currently!

Thanks!

lacan commented 7 years ago

So regarding the implementation of the other autothresholds, I'll hopefully offer a solution this week.

We should probably change the logic of the code so that we FIRST run the threshold on the dataset, and then use this result for the rest of the pipeline. This avoids having to call the method for each algorithm that needs a threshold, no?
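A rough sketch of that "threshold first" flow in plain Java: the thresholds are computed once and the same immutable result is handed to every measure. None of these types exist in Coloc 2; the class, interface, and method names are hypothetical, and the mean intensity is only a stand-in for whichever thresholding method (automatic or manual) is selected.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: compute thresholds once, then share them with all measures.
class ThresholdFirstPipeline {

    /** Immutable pair of per-channel thresholds. */
    static final class Thresholds {
        final double ch1;
        final double ch2;
        Thresholds(double ch1, double ch2) { this.ch1 = ch1; this.ch2 = ch2; }
    }

    /** Any measure that needs thresholds receives them as an argument. */
    interface Measure {
        double compute(double[] ch1, double[] ch2, Thresholds t);
    }

    static int thresholdCalls = 0; // counts how often thresholding actually runs

    static double mean(double[] v) {
        double sum = 0;
        for (double x : v) sum += x;
        return v.length == 0 ? 0 : sum / v.length;
    }

    /** Stand-in thresholding step: uses the mean intensity of each channel. */
    static Thresholds computeThresholds(double[] ch1, double[] ch2) {
        thresholdCalls++;
        return new Thresholds(mean(ch1), mean(ch2));
    }

    static Map<String, Double> runAll(double[] ch1, double[] ch2) {
        Thresholds t = computeThresholds(ch1, ch2); // step 1: threshold once

        Map<String, Measure> measures = new LinkedHashMap<>();
        measures.put("fractionAboveBoth", (a, b, th) -> {
            int count = 0;
            for (int i = 0; i < a.length; i++) {
                if (a[i] > th.ch1 && b[i] > th.ch2) count++;
            }
            return a.length == 0 ? 0 : (double) count / a.length;
        });
        measures.put("tM1", (a, b, th) -> {
            double total = 0, cooccurring = 0;
            for (int i = 0; i < a.length; i++) {
                total += a[i];
                if (b[i] > th.ch2) cooccurring += a[i];
            }
            return total == 0 ? 0 : cooccurring / total;
        });

        // Step 2: every measure reuses the same thresholds.
        Map<String, Double> results = new LinkedHashMap<>();
        for (Map.Entry<String, Measure> e : measures.entrySet()) {
            results.put(e.getKey(), e.getValue().compute(ch1, ch2, t));
        }
        return results;
    }

    public static void main(String[] args) {
        double[] ch1 = {0, 10, 20, 30};
        double[] ch2 = {0, 0, 40, 40};
        System.out.println(runAll(ch1, ch2));
        System.out.println("Thresholding ran " + thresholdCalls + " time(s)");
    }
}
```

Passing the thresholds in as a value also means a manually chosen pair could be substituted for the computed one without touching the measures.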

chalkie666 commented 7 years ago

Indeed. Unless there is a good reason to have different thresholds for different measures?

etadobson commented 7 years ago

Not sure what to say here. I think it's great, @lacan, that you want to work on Coloc 2 and add new functionality. I agree with @chalkie666 that in general this is a slippery slope, especially for colocalization analyses, and that we need to take steps to be sure users lean on more robust, objective methods for these measurements. I agree with the terminology changes for the warnings... I can handle that.

We should perhaps discuss how much you want to dive into Coloc 2 @lacan -- especially since the idea is to move everything into Ops and eventually restructure this plugin in the ImageJ2 framework. Perhaps it's worth another discussion (perhaps along with @ctrueden as well) before you get too far??

lacan commented 7 years ago

@etarena no problem. Would you like to have this discussion via some sort of webconference or via Gitter?

etadobson commented 7 years ago

@lacan - why don't we arrange a Skype meeting? With Kevin and @ctrueden ... @chalkie666 if you want to join...

chalkie666 commented 7 years ago

@lacan @etarena I'm in Germany, time zone wise... Skype: chalkie666

etadobson commented 7 years ago

Hi @lacan (and @chalkie666 ) ... so I just had a chat with Kevin. We discussed that perhaps this is the best course of action: What it comes down to - we don’t want to get in your way… If you really need to get this functionality added to Coloc 2 asap to get the data you need… we obviously understand the pragmatic side of just ‘getting things done’ and adding it to Coloc 2 as-is - not worrying about the re-write issue.

If you have the bandwidth to contribute to the new framework … we should 100% have a live chat. Coloc 2 will eventually be built in the new IJ2(Ops) framework - but also based on a novel statistical framework with one of our collaborators here.

It’s up to you really - what you feel you can/need/want to contribute. But for sure - we here at LOCI are always here to help where/when we can and at the end of the day - as long as users get the functionality they need - that is the goal.

sound good???

lacan commented 7 years ago

Hi @etarena, thank you for the message. I have been a bit off the subject in the last couple of days and I apologize for the lack of interactivity. After browsing the code and seeing where the new Ops now live, I am afraid that Coloc2 in its current state would require a fairly substantial overhaul to implement the functionality that we need and that we can comfortably provide to our users. Moreover, because of the very active development, it would be in constant flux, making it hard to ensure consistency for the users. Some versions would need to be frozen in case someone wants to publish the results along with the scripts, which would lead to a messy ecosystem. For now, I have discussed this with my superior and we will quickly add the functionality that we are missing to JACoP (the whole thing is only 1200 lines). My hope is that this will serve as a good example of the kind of analyses we need to perform and that, based on this experience, I can later contribute further to Coloc2.

I am very grateful for all the work that you are doing, and I am very open to collaborating further with you and helping this project mature even further!