cytomining / profiling-handbook

Image-based Profiling Handbook
https://cytomining.github.io/profiling-handbook/
Creative Commons Zero v1.0 Universal

Manually remove Costes features #52

Closed shntnu closed 4 years ago

shntnu commented 4 years ago

We have found the Costes features to be problematic across several projects now and have decided to remove them going forward. There are potentially ways of addressing the issues (see point 2 below), but given that the information is captured in other, better-behaved channel correlation features, it's easier to just remove them.
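
A minimal sketch of what the manual removal could look like, assuming the profiles are loaded into a pandas DataFrame and that Costes features can be identified by the substring `Costes` in the column name. The helper `drop_costes_features` and the column names below are hypothetical, not taken from an actual pipeline:

```python
import pandas as pd


def drop_costes_features(profiles: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `profiles` without columns whose name contains 'Costes'."""
    costes_cols = [col for col in profiles.columns if "Costes" in col]
    return profiles.drop(columns=costes_cols)


# Hypothetical profile table; the column names are assumptions for illustration.
profiles = pd.DataFrame(
    {
        "Metadata_Well": ["A01", "A02"],
        "Cells_Correlation_Costes_DNA_RNA": [0.99, 0.96],
        "Cells_Correlation_Correlation_DNA_RNA": [0.41, 0.38],
    }
)

cleaned = drop_costes_features(profiles)
print(list(cleaned.columns))
```

Matching on the substring keeps the metadata and the other channel correlation features untouched; a stricter pattern may be needed if unrelated columns happen to contain `Costes`.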

Notes for ourselves:

  1. https://github.com/broadinstitute/cmQTL/issues/30#issuecomment-620121632
  2. David Stirling said: I've been looking into why we're seeing excessively large negative values for some 'Costes' statistics from the MeasureColocalisation module. It looks like the raw values haven't really changed from CellProfiler 2 to CP3, and these outlying values appear during the normalisation step. The vast majority of samples on a plate will have raw Costes values in the range of 0.99-1. I think this means that during normalisation a select few wells reading 0.96 are ending up with values normalised to -50 or lower. These excessive values appear to be the cause of the unusually intense clusters some of us have seen in similarity matrices; removing these measures tends to give a better matrix. I'm not sure if this is actually a problem with our normalisation approach, or if we should just be excluding these measurements. Thoughts welcome.
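
The effect described in point 2 can be reproduced with a toy example. This is only a sketch: it assumes the normalisation is a robust z-score (median and scaled MAD), which the comment above does not confirm, and the plate values are synthetic, chosen so that most wells sit in a tight band just below 1 while three wells read 0.96:

```python
import numpy as np


def robust_zscore(values: np.ndarray) -> np.ndarray:
    """Centre on the median and scale by the MAD (times 1.4826)."""
    median = np.median(values)
    mad = np.median(np.abs(values - median)) * 1.4826
    return (values - median) / mad


# Synthetic 96-well plate (an assumption, not real data):
# 93 wells tightly packed between 0.998 and 1.0, plus 3 wells at 0.96.
typical_wells = np.linspace(0.998, 1.0, 93)
low_wells = np.full(3, 0.96)
raw_costes = np.concatenate([typical_wells, low_wells])

normalised = robust_zscore(raw_costes)
print("most negative normalised value:", normalised.min())
print("largest |value| among typical wells:", np.abs(normalised[:93]).max())
```

Because nearly all wells fall in such a narrow band, the scale estimate is tiny, so a raw difference of about 0.04 becomes a normalised value on the order of -50 while the typical wells stay close to zero.
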
shntnu commented 4 years ago

cc @bethac07