Update debris gate for Lyoplate 3.0

ramhiser commented 11 years ago

We have applied openCyto to the Lyoplate 3.0 data for the first 4 centers. Unlike in the FlowCAP 3 competition data, the FSC channel here was thresholded at approximately 25K. This change yields a poor debris gate with either 3 or 4 mixture components along the FSC channel; in both cases, we are using only 1 positive component.

TODO: Update the debris gate

Screen Shot 2013-04-02 at 2 53 24 PM

ramhiser commented 11 years ago

Below are the 1D density plots for FSC. Notice that some of the samples from Stanford and UCLA have a noticeable cluster at about 25K. These cells are the debris that we wish to filter out.

The issue is that, in samples with little debris, the first (negative) mixture component ends up centered at about 75K-100K. As a result, we filter out the lymphocytes. For instance, in the 2D scatterplots for NHLBI above, we can see that the lymphocytes in the dense red regions are filtered out.

Screen Shot 2013-04-04 at 11 20 58 AM
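
The failure mode described above can be reproduced on synthetic data. The following is only a minimal sketch, not the openCyto implementation: it uses a plain EM fit without priors, and every simulated population (means, spreads, counts) as well as the "midpoint between the two lowest components" gate are assumptions chosen for illustration. Only the 25K threshold and the 75K-100K lymphocyte range come from this thread.

```python
import math
import random


def normal_pdf(x, mu, sd):
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_mixture_1d(xs, k, n_iter=100):
    """Fit a k-component 1D Gaussian mixture by plain EM (no priors).

    Returns a list of (weight, mean, sd) tuples sorted by mean.
    """
    n = len(xs)
    ordered = sorted(xs)
    # Start the means at evenly spaced quantiles, with a shared sd.
    mus = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    sd0 = math.sqrt(sum((x - grand_mean) ** 2 for x in xs) / n) / k
    sds = [sd0] * k
    ws = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each event.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(ws, mus, sds)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weights, means and sds from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sds[j] = max(math.sqrt(var), 1.0)
            ws[j] = nj / n
    return sorted(zip(ws, mus, sds), key=lambda c: c[1])


def debris_gate(fit):
    """Hypothetical gate: midpoint between the two lowest component means."""
    return 0.5 * (fit[0][1] + fit[1][1])


rng = random.Random(0)
# Synthetic FSC values (all parameters are illustrative assumptions).
lymphocytes = [rng.gauss(90_000, 8_000) for _ in range(400)]
larger_cells = [rng.gauss(160_000, 20_000) for _ in range(200)]
# Debris piled up just above the 25K threshold.
debris = [25_000 + abs(rng.gauss(0, 5_000)) for _ in range(150)]

# Sample with almost no debris: the lowest component lands on lymphocytes.
clean_fit = fit_mixture_1d(lymphocytes + larger_cells, k=2)
clean_gate = debris_gate(clean_fit)
clean_lost = sum(x < clean_gate for x in lymphocytes) / len(lymphocytes)

# Sample with debris: the lowest component lands on the debris cluster.
debris_fit = fit_mixture_1d(debris + lymphocytes + larger_cells, k=3)
dirty_gate = debris_gate(debris_fit)
dirty_lost = sum(x < dirty_gate for x in lymphocytes) / len(lymphocytes)

print("no debris:   lowest mean", round(clean_fit[0][1]), "lymphocytes lost", clean_lost)
print("with debris: lowest mean", round(debris_fit[0][1]), "lymphocytes lost", dirty_lost)
```

In this toy setting, treating the lowest component as debris only works when a debris cluster is actually present; otherwise the same rule removes the lymphocytes.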

ramhiser commented 11 years ago

To provide a clearer picture, I generated a similar density plot for interesting samples from each center. In order, we have 1 Miami, 2 NHLBI, 2 Stanford, and 2 UCLA samples.

Screen Shot 2013-04-04 at 11 31 56 AM

ramhiser commented 11 years ago

After spending much of the week working on the automated prior-elicitation and automated selection of K, the debris gates are much improved. (Image below) Specifically, we are doing a much better job of removing debris if present while not removing lymphocytes from samples that have been preprocessed with debris removal.

Greg and I have discussed ways of further improving the gates. They include:

- Given that we are conservative in selecting K, we should have a logical argument to collapse posterior components if they overlap. Essentially, this becomes a Bayesian t-test.
- Normalize and scale samples. This should further improve the gates for Stanford and UCLA.

Screen Shot 2013-04-11 at 8 55 04 AM
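
As one possible reading of "normalize and scale samples", each sample's FSC values could be centered and scaled before gating so that centers with different instrument scales become comparable. The thread does not say which method is intended; the sketch below assumes a simple z-score, and the function name `standardize` and the example values are hypothetical.

```python
import statistics


def standardize(values):
    """Center a sample at 0 and scale it to unit standard deviation.

    Assumption: a plain z-score; the thread does not specify the method.
    """
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("cannot scale a sample with zero variance")
    return [(v - mu) / sd for v in values]


# Two hypothetical samples that differ only by an instrument scale factor.
sample_a = [50_000.0, 100_000.0, 150_000.0]
sample_b = [v * 1.2 for v in sample_a]

scaled_a = standardize(sample_a)
scaled_b = standardize(sample_b)
print(scaled_a)
print(scaled_b)
```

After scaling, the two samples coincide up to floating-point error, so a prior elicited on one center would be on the same footing for the other. A robust variant (median and MAD) might be preferable when debris skews the mean, but that choice is left open here.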

gfinak commented 11 years ago

To clarify what you mean by 'collapsing' components: do you mean labelling each fitted component according to the prior component it is closest to, while allowing multiple fitted components to be labelled as derived from the same prior component?

This is the alternative to setting very strong priors on the variances of the means, wherein each fitted model component was labelled as deriving from one of the prior components.

ramhiser commented 11 years ago

Yes. In terms of implementation, I've been thinking of it as an argument called collapse that defaults to FALSE. I'm open to another argument name.
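
To make the proposal concrete, here is a minimal sketch of the labelling discussed above. It is not openCyto code: the function name `label_components`, its return shape, and the example means are assumptions, and "closest" is taken to mean smallest absolute distance between means. Only the argument name `collapse` and its default of FALSE come from this thread.

```python
def label_components(fitted_means, prior_means, collapse=False):
    """Label each fitted component by its nearest prior component.

    Returns a list of (prior_index, [fitted_indices]) pairs.
    With collapse=False every fitted component stays separate;
    with collapse=True fitted components that share a prior label
    are merged into one group.
    """
    labels = [
        min(range(len(prior_means)), key=lambda j: abs(m - prior_means[j]))
        for m in fitted_means
    ]
    if not collapse:
        return [(label, [i]) for i, label in enumerate(labels)]
    groups = {}
    for i, label in enumerate(labels):
        groups.setdefault(label, []).append(i)
    return sorted(groups.items())


# Hypothetical FSC means: debris, lymphocytes, larger cells.
prior_means = [25_000, 90_000, 160_000]
# A conservative (large) K may split the lymphocytes into two components.
fitted_means = [28_000, 85_000, 97_000, 160_000]

separate = label_components(fitted_means, prior_means)
merged = label_components(fitted_means, prior_means, collapse=True)
print(separate)
print(merged)
```

With `collapse=True`, the two fitted components near 90K are treated as a single population, so an overly large K no longer splits the lymphocytes across gates.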

RGLab / flowcap3
