Analysis - Githubissues

tlnagy / Crispulator.jl

✂️ Pooled CRISPR screen optimization tool

Analysis #12

Closed tlnagy closed 8 years ago

tlnagy commented 8 years ago

This issue will track some of the ideas we have for figures/analysis

[x] Generate heatmaps of AUROC as a function of representation and # cells per bin at different noise levels
[ ] Generate heatmaps of AUROC with different analysis methods (p-value vs. log fold change vs. their product, etc.)
[ ] Trade-off between binning methods: more cells collected vs. less stringency
[ ] Trade-off between the number of bottlenecks and the strength of each bottleneck
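
The heatmap items above boil down to a parameter sweep: for each (representation, cells per bin) pair, run the simulation several times and average the metric. A minimal Python sketch of that loop follows. The `simulate` callable and its `(representation, cells_per_bin, seed)` signature are hypothetical stand-ins, not Crispulator.jl's actual API, and the default of 10 runs is only an illustrative choice.

```python
from statistics import mean

def sweep_mean_metric(simulate, representations, cells_per_bin, n_runs=10):
    """Return a grid (list of rows) of the mean metric over n_runs.

    `simulate` is a hypothetical callable taking
    (representation, cells_per_bin, seed) and returning one number,
    e.g. the AUROC of a single simulated screen.
    Rows follow `representations`, columns follow `cells_per_bin`.
    """
    grid = []
    for rep in representations:
        row = []
        for cells in cells_per_bin:
            # Average over independent runs, using the run index as the seed.
            row.append(mean(simulate(rep, cells, seed) for seed in range(n_runs)))
        grid.append(row)
    return grid
```

Repeating the sweep once per noise level would give one grid per heatmap.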

tlnagy commented 8 years ago

@martinkampmann What is a good range to test for the minimum number of cells per bin? I currently have 2x10^6 cells as the minimum with 500 genes and 5 guides per gene.

martinkampmann commented 8 years ago

I would definitely simulate down to a pretty low coverage. That way you're simulating not only the bottleneck imposed by the bin size, but also later bottlenecks (e.g. DNA lost during sample prep). How about 25,000 cells as the minimum (corresponding to 10-fold representation) and 2.5 million cells as the maximum (corresponding to 1,000-fold representation)?
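
For reference, the arithmetic behind those two numbers can be sketched in a few lines of Python. The library size (500 genes, 5 guides per gene) comes from the comment above; the function names are illustrative only.

```python
# Library size quoted in the thread: 500 genes x 5 guides per gene.
N_GENES = 500
GUIDES_PER_GENE = 5
N_GUIDES = N_GENES * GUIDES_PER_GENE  # 2,500 guides in total

def cells_for_representation(fold, n_guides=N_GUIDES):
    """Cells needed so that each guide is covered `fold` times on average."""
    return fold * n_guides

def representation(n_cells, n_guides=N_GUIDES):
    """Average number of cells per guide for a given cell count."""
    return n_cells / n_guides

low = cells_for_representation(10)     # proposed minimum
high = cells_for_representation(1000)  # proposed maximum
print(low, high)
```

With 2,500 guides, 10-fold and 1,000-fold representation correspond to 25,000 and 2,500,000 cells respectively.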

tlnagy commented 8 years ago

My initial choice was really similar: I picked 2e4 to 2e6 as my range, but 2.5e4 to 2.5e6 is better because it maps more cleanly onto the number of guides (2,500).

tlnagy commented 8 years ago

Example heatmap with the mean AUROC computed across 10 runs.
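
One way to compute a value like the one in each heatmap cell, sketched in Python: AUROC is computed here with the rank-based (Mann-Whitney) formulation and then averaged over runs. This is an assumption for illustration; the thread does not say how Crispulator.jl computes AUROC internally, and the `scores`/`labels` inputs are hypothetical.

```python
from statistics import mean

def auroc(scores, labels):
    """Rank-based AUROC.

    Equals the probability that a randomly chosen true hit gets a higher
    score than a randomly chosen non-hit, with ties counted as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_auroc(runs):
    """Average AUROC over an iterable of (scores, labels) pairs, one per run."""
    return mean(auroc(scores, labels) for scores, labels in runs)
```

The pairwise comparison is quadratic in the number of genes, which is fine for a 500-gene library but would be worth replacing with a sort-based version for much larger ones.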

[Image: heatmap]

tlnagy commented 8 years ago

@martinkampmann Here are the plots for the 3 different noise levels. The trend is straightforward: as the noise of the readout increases, the minimum number of cells per bin matters more, while the dependence of the AUROC on representation stays roughly the same across noise levels.

[Image: heatmap]

tlnagy commented 8 years ago

This issue is superseded by #29.