Resistant Cell Analysis

eric-czech commented 5 years ago

From jeff:

We are interested in classifying the number of cells with a growth rate > lambda_x (which is the 24-hr growth rate), for each drug concentration and/or each drug. We are interested in comparing the equation of the best-fit line, and its slope, across multiple drugs.

Example figure:

Pseudo-code:

Choose grid of lambda values
For each lambda L
- (ctR= ) Find t0 count for apartments with lambda > L (usually this is one cell with initial conditions = 'single_cell')
- (ctS = ) Find total number of cells with valid growth rate
- Compute percent resistant cells as 100 * ctR / ctS
Visualize over lambda grid
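
The steps above can be sketched in plain Python. This is only an illustration, not code from celldom: the function name `percent_resistant` and its inputs (one growth rate and one t0 cell count per apartment) are hypothetical, and it assumes ctS means the summed t0 counts over apartments that have a valid (non-missing) growth rate.

```python
import math


def percent_resistant(growth_rates, t0_counts, lambda_grid):
    """Percent of cells in apartments whose growth rate exceeds each threshold.

    growth_rates: per-apartment growth rate (None or NaN when no valid fit)
    t0_counts:    per-apartment cell count at time zero (usually 1)
    lambda_grid:  thresholds L to evaluate
    """
    # Keep only apartments with a valid growth rate
    valid = [
        (g, c)
        for g, c in zip(growth_rates, t0_counts)
        if g is not None and not math.isnan(g)
    ]
    # ctS: total t0 cells across apartments with a valid growth rate (assumption)
    ct_s = sum(c for _, c in valid)
    percents = []
    for lam in lambda_grid:
        # ctR: t0 cells in apartments with growth rate strictly above the threshold
        ct_r = sum(c for g, c in valid if g > lam)
        percents.append(100.0 * ct_r / ct_s if ct_s else float("nan"))
    return percents


if __name__ == "__main__":
    rates = [0.1, 0.5, 0.9, float("nan")]
    counts = [1, 1, 1, 1]
    grid = [0.0, 0.4, 1.0]
    for lam, pct in zip(grid, percent_resistant(rates, counts, grid)):
        print(lam, pct)
```

Plotting the returned percentages against the lambda grid gives the curve described in the last step, one curve per drug or concentration.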

yellenlab commented 5 years ago

@eric-czech, @jmotschman ,

eric-czech commented 5 years ago

Hey @benjaminyellen (and @jmotschman),

Ah interesting, that's a cool idea! I tried it as part of the code merged in https://github.com/hammerlab/celldom/pull/61 by adding those predicted average cell counts to actual average counts across whole arrays. My definition of S(t) is a little different because there's an intercept in the model (otherwise growth rates are always positive making S(t) monotonic) and because of the log(y + 1) target, but at least in spirit I think this is what you're suggesting.

Here's an example of what TTR looks like for the old G3 dataset:

ttrexample

Note: This only includes apartments that started with 1 cell

That all might look better with some more time points for some of those arrays, but either way I thought one of the most striking things there was the increase in count after time zero before the death of many of the treated cells. Does that line up with what you would have expected? I was imagining there would only be one inflection point instead of two with opposite directions (e.g. in gravity:Pink:1:0.5uM).

All-in-all it looks to me like this is more evidence that a simple exponential growth model may just be too inflexible outside of the control groups if there's no other way to deal with that initial proliferation period.
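
For reference, here is a minimal sketch of what a simple exponential growth fit with an intercept and a log(y + 1) target can look like. It is not the celldom implementation: the function names are hypothetical and ordinary least squares is an assumption about how such a model might be fit.

```python
import math


def fit_log_growth(times, counts):
    """Least-squares fit of log(count + 1) = intercept + rate * t."""
    ys = [math.log1p(c) for c in counts]  # log(y + 1) target
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
    rate = sxy / sxx
    intercept = y_mean - rate * t_mean
    return intercept, rate


def predict_count(t, intercept, rate):
    """Invert the log(y + 1) transform to get a predicted cell count."""
    return math.expm1(intercept + rate * t)
```

With a single rate parameter the predicted curve can only rise or fall over the whole time course, so it cannot follow an initial increase followed by a decline.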

Either way, this will show up when you run the run-array-analysis command, but in this case you would have to add a flag to force recalculation of growth rates (I wasn't bubbling up the intercept before). Here's an example:

celldom run-array-analysis \
--experiment-config-path=/lab/repos/celldom/config/experiment/exp-20180614-G3-K562-imatinib-poc-01.yaml \
--output-dir=/lab/data/celldom/output/20181005-G3-full \
--force-view-calculation=True # This will force recalculation of growth rates

The whole analysis runs ~10x slower with that flag, so make sure to remove it after the first time.

yellenlab commented 5 years ago

eric-czech commented 5 years ago

Hey @benjaminyellen ,

I've never had good experiences with maximum likelihood estimators for cutpoint models. I know cutpoint models work well in a Bayesian setting, but they'd be super slow IMO. Not that it can't be done -- I just don't have any clue how to fit them well given that the gradients aren't smooth.

Instead I was puttering around a bit looking for something along those lines and found some things on birth-death models. I couldn't find a reference for exactly what I think we'd need, but it doesn't seem too bad to put together a model specifically for the dynamics we want to capture, solve the related ODE, and then wrap it in some estimator. As an example that seems to work pretty well, I tried modeling the dynamics like this:

Which apparently has the solution (from Wolfram):

Wrapping that in an estimator (here's a notebook w/ the code) and fitting to some of our data shows that those 3 parameters seem to do a solid job (and getting the extra "death rate" parameter could be useful). This sample is 10 apartments from each of 5 clusters in a G1v2 experiment showing significantly different behaviors over time:

examples

Anyways, if you bump into some other models that you'd like to try let me know. I'd be happy to take a crack at seeing what they look like now that I have a decent sense of how to experiment with them.

eric-czech commented 5 years ago

Here's a second take on this that is closer to the model you suggested and has a nicer interpretation:

Notebook
Definition:
Solution:

This then gives the time at which the growth rate changed as well as the before and after rates (shown below in the title bars):

examples2

This would also give the potential to learn the speed with which the change occurs rather than assuming it happens almost instantaneously. Right now I'm just hardcoding that parameter to make the interpretation simpler, but it's a possibility.
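
To illustrate the general idea of a growth rate that changes at some time with a tunable speed, here is one possible form. It is an assumption made for illustration and not necessarily the definition used in the notebook. The rate switches from `r1` to `r2` around time `tc` via a sigmoid with steepness `k`, so dN/dt = r(t) N with r(t) = r1 + (r2 - r1) / (1 + exp(-k (t - tc))), which integrates in closed form. All names here are hypothetical.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_count(t, n0, r1, r2, tc, k=10.0):
    """log N(t) for dN/dt = r(t) * N with a sigmoid switch from r1 to r2 at tc.

    k controls how fast the switch happens; a large k approximates an
    instantaneous change, and it is fixed by default here.
    """
    # Integral of the sigmoid from 0 to t is a difference of softplus terms over k
    switched = (softplus(k * (t - tc)) - softplus(-k * tc)) / k
    return math.log(n0) + r1 * t + (r2 - r1) * switched
```

Fitting `r1`, `r2`, and `tc` with `k` held fixed gives a before rate, an after rate, and a change time. Freeing `k` would also estimate how quickly the change happens.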

hammerlab / SmartCount

Resistant Cell Analysis #59