Katsevich-Lab / sceptre

An R package for single-cell CRISPR screen data analysis emphasizing statistical rigor, massive scalability, and ease of use.
https://katsevich-lab.github.io/sceptre/
GNU General Public License v3.0

Control cell definition #70

Closed: redbybeing closed this issue 8 months ago

redbybeing commented 8 months ago

Hi Tim,

I moved the thread here.

My original question:

I wonder whether you could add an option to define control cells as those with 0 gRNA reads for the target. When running assign_grnas(), I use the thresholding method with threshold = 3. But I want the control cells to be cells with absolutely 0 gRNA reads for the target, not just fewer than 3, because I'm worried that even 1 or 2 gRNA reads might still have a mild effect on the cell (especially with CRISPR KO rather than CRISPRi).

Tim:

There are somewhat hacky ways to do this within the framework of the sceptre package, but I am not sure it is worthwhile to go down this route on your data. Let me try to convince you that what you're currently doing is reasonable. Consider a given gRNA. Suppose this gRNA has a UMI count of >= 3 in 100 cells, a UMI count of 1-2 in 100 cells, and a UMI count of zero in ~80,000 cells. The cells with a UMI count of 1-2 are going to exert a negligible impact because they are "swamped" in number by the 80,000 cells with a UMI count of zero. Thus, removing the cells with a UMI count of 1-2 from the control group will have essentially zero impact on the p-value. (I've spent some time looking into this kind of phenomenon on other datasets.)
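A quick back-of-the-envelope check of the "swamping" argument, using the hypothetical counts from the paragraph above. This is only a sketch of the arithmetic in Python (sceptre itself is an R package); the variable names are illustrative, and it assumes the control group for a gRNA is every cell below the threshold.

```python
# Hypothetical counts from the example above (not real data).
n_perturbed = 100   # cells with a gRNA UMI count >= 3
n_ambiguous = 100   # cells with a gRNA UMI count of 1 or 2
n_zero = 80_000     # cells with a gRNA UMI count of 0 (approximate)

# Assumption: every cell below the threshold is treated as a control,
# so the control group holds both the zero-count and the 1-2 count cells.
n_control = n_zero + n_ambiguous

# Share of the control group made up of cells with 1-2 UMIs.
ambiguous_share = n_ambiguous / n_control
print(f"{ambiguous_share:.4%} of control cells have 1-2 UMIs")
```

In this hypothetical, the cells with 1-2 UMIs make up roughly 0.12% of the control group, which is why dropping them would be expected to change the result very little.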

The hacky way to do this within sceptre would involve looping over gRNAs, rerunning QC separately for each gRNA. We could discuss this approach in more detail if you like, but to me this seems pretty low priority in comparison to some of the other analysis tasks that remain. Just my two cents!

I attached an Excel sheet where I computed the number of control and perturbed cells using my criteria:

Control: 0 gRNA reads for the target gene.
Perturbed: at least 3 reads of either of the two gRNAs for the target gene. (At least one gRNA must have 3 or more reads on its own; 1 + 2 = 3 doesn't count.)

If you look at columns K-O, 1st row (Foxg1), for example, there are 69621 control cells and 2637 perturbed cells. But the original input was 88183 cells, which means 15925 cells still have 1 or 2 gRNA reads. Isn't that (15925) still a lot compared to the number of perturbed cells, enough to affect the results?
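To make the criteria concrete, here is a minimal Python sketch of the three-way split they imply, followed by a check of the Foxg1 arithmetic. The function name and the "ambiguous" label are illustrative (they are not part of sceptre), and the toy cells are made up.

```python
def classify_cell(grna_counts, threshold=3):
    """Label one cell from the UMI counts of the gRNAs targeting one gene."""
    # Perturbed: at least one gRNA reaches the threshold on its own.
    if any(c >= threshold for c in grna_counts):
        return "perturbed"
    # Control: no reads at all for any gRNA of this target.
    if all(c == 0 for c in grna_counts):
        return "control"
    # Everything else has some reads but no gRNA at the threshold.
    return "ambiguous"

# Toy cells, each with counts for the two gRNAs of one target.
examples = [(0, 0), (3, 0), (1, 2), (0, 5), (2, 0)]
labels = [classify_cell(c) for c in examples]
print(labels)

# Foxg1 numbers quoted above.
n_total, n_control, n_perturbed = 88183, 69621, 2637
n_ambiguous = n_total - n_control - n_perturbed
# Share of all non-perturbed cells that carry some reads for the target.
share = n_ambiguous / (n_control + n_ambiguous)
print(n_ambiguous, f"{share:.1%}")
```

Note that (1, 2) is labeled ambiguous rather than perturbed, matching the "1 + 2 = 3 doesn't count" rule. For Foxg1, the 15925 ambiguous cells are about 18.6% of all non-perturbed cells.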

However, another piece of good news is that the list of gRNAs with significant effects (which I found with my custom negative binomial GLM script) overlaps a lot with the sceptre results. I didn't do any sophisticated QC or permutation analysis like sceptre does, so I trust the sceptre results more. I did this before the new sceptre was officially released.

I also attached the sceptre output table and a slide in case they're helpful for discussion.

Thanks,
Jiseok

Attachments: A06_DE_glm_gRNAtargets_only1.xlsx, A08_pos_ctrl_pairs.xlsx, comparison.pdf

timothy-barry commented 8 months ago

Hi Jiseok,

Thanks for this information. As much as Gene and I would like to dig into these results, unfortunately we probably are not going to have the bandwidth to do so. The main purpose of GitHub issues is for users to point out bugs or ask technical questions about the software. Sadly, the development team is currently too small to have detailed discussions about specific sets of results. :(

That said, if you are interested in removing cells that contain a count of 1 or 2 (for a given target), I'd recommend proceeding as follows. Go target by target. First, call set_analysis_parameters(), setting the discovery pairs to the given target paired with all genes. Next, call run_qc(), passing the indices of the cells for which the given target has a UMI count of 1 or 2 to additional_cells_to_remove (these indices can be determined by examining the gRNA UMI count matrix). Next, carry out the standard association analysis. Finally, repeat this process for all targets, looping over the targets one by one.
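The index-finding step can be sketched as follows. This is a toy illustration in Python of the logic only, not sceptre code: the matrix layout (gRNAs as rows, cells as columns), the helper name, and the target-to-gRNA mapping are all assumptions, and "count of 1 or 2" is read here as "some reads for the target, but no single gRNA at the threshold".

```python
def ambiguous_cell_indices(grna_matrix, grna_rows, threshold=3):
    """Return 1-based indices of cells to exclude for one target.

    grna_matrix: list of rows (one per gRNA), each a list of per-cell UMI counts.
    grna_rows:   row indices of the gRNAs that belong to this target.
    """
    n_cells = len(grna_matrix[0])
    to_remove = []
    for cell in range(n_cells):
        counts = [grna_matrix[r][cell] for r in grna_rows]
        has_reads = any(c > 0 for c in counts)
        is_perturbed = any(c >= threshold for c in counts)
        if has_reads and not is_perturbed:
            to_remove.append(cell + 1)  # 1-based, since R indexes from 1
    return to_remove

# Toy gRNA-by-cell UMI count matrix: 3 gRNAs x 6 cells (made-up values).
grna_matrix = [
    [0, 1, 4, 0, 2, 0],  # gRNA 0, target A
    [0, 0, 0, 2, 1, 0],  # gRNA 1, target A
    [5, 0, 1, 0, 0, 0],  # gRNA 2, target B
]
targets = {"A": [0, 1], "B": [2]}  # hypothetical target-to-gRNA mapping

removal_by_target = {}
for target, rows in targets.items():
    removal_by_target[target] = ambiguous_cell_indices(grna_matrix, rows)
    # In the R workflow described above, these indices would be passed to
    # run_qc() via additional_cells_to_remove before analyzing this target.
print(removal_by_target)
```

Because the set of cells to remove differs for each target, QC and the association analysis have to be rerun once per target, which is what makes this approach slower than a single pass.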

I hope that this is at least somewhat helpful.

redbybeing commented 8 months ago

Hi Tim!

No worries! I totally understand :-) And I understood your suggestion for removing cells with a gRNA count of 1 or 2. I will try this approach :) Thank you very much for your time and help. I, as well as other lab members doing CRISPR screening, now use sceptre routinely, and I am very happy that I came across this software just around the time I had collected enough data. Happy winter holidays!

Best wishes,
Jiseok

timothy-barry commented 8 months ago

Good to hear! Same to you. :)

And if you run into any additional bugs or problems with the software, please do let us know. Your issues have been quite helpful in making the package more reliable!