keenhl opened 1 year ago
Good question. At least compared to the original version of MAGeCK (there have been a few updates and I haven't kept up to date), I'd say there are three differences:
sgRNA counts -> phenotype scores: The approach MAGeCK uses is more sophisticated but may be harder to interpret directly. It models dispersion to correct for noise at low-count sgRNAs, similar to DESeq if you're familiar with RNA-seq analysis, whereas this pipeline just measures log2 fold-enrichment without that correction and applies a counts threshold to exclude sgRNAs with very low representation.
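To make the simpler approach concrete, here is a minimal sketch of log2 fold-enrichment with a counts threshold. It is not the pipeline's actual code: the function name, the normalization to total reads, the pseudocount, the default `min_count`, and the rule of dropping an sgRNA only when it is below threshold in both samples are all assumptions chosen for illustration.

```python
import math

def log2_enrichment(counts_before, counts_after, min_count=50, pseudocount=1):
    """Return {sgRNA: log2 fold-enrichment}, skipping low-count sgRNAs.

    All parameter names and defaults here are hypothetical.
    """
    total_before = sum(counts_before.values())
    total_after = sum(counts_after.values())
    scores = {}
    for sgrna, before in counts_before.items():
        after = counts_after.get(sgrna, 0)
        # Assumed threshold rule: drop sgRNAs poorly represented in both samples
        if before < min_count and after < min_count:
            continue
        # Normalize to sequencing depth so the two samples are comparable
        freq_before = (before + pseudocount) / total_before
        freq_after = (after + pseudocount) / total_after
        scores[sgrna] = math.log2(freq_after / freq_before)
    return scores
```

Note that nothing here accounts for the higher variance of low-count sgRNAs beyond the hard cutoff, which is the part MAGeCK handles with its dispersion model.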
sgRNA-level -> gene-level phenotypes: This pipeline uses two partially orthogonal metrics to score genes based on the sgRNAs targeting that gene:
MAGeCK uses just a rank-based p-value, which in my opinion is less interpretable. But there is no reason you couldn't take the MAGeCK results from sgRNA counts->phenotypes and apply whatever statistical tests you prefer to get gene scores.
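As an illustration of applying your own test to sgRNA-level phenotypes (from either tool), the sketch below scores each gene by the mean of its sgRNA scores and estimates a p-value by resampling from negative-control sgRNAs. This is just one possible choice, not the method used by MAGeCK or by this pipeline. The function name, the `negative_control` label, and the assumption that your library contains negative-control sgRNAs are all hypothetical.

```python
import random
from statistics import mean

def gene_scores(sgrna_scores, sgrna_to_gene,
                control_label="negative_control", n_resample=2000, seed=0):
    """Return {gene: (mean sgRNA score, empirical two-sided p-value)}.

    Assumes control sgRNAs are mapped to `control_label` (hypothetical name).
    """
    rng = random.Random(seed)
    by_gene = {}
    for sgrna, score in sgrna_scores.items():
        by_gene.setdefault(sgrna_to_gene[sgrna], []).append(score)
    controls = by_gene.pop(control_label)
    results = {}
    for gene, scores in by_gene.items():
        observed = mean(scores)
        # Null distribution: mean of the same number of randomly drawn control sgRNAs
        hits = sum(
            abs(mean(rng.choices(controls, k=len(scores)))) >= abs(observed)
            for _ in range(n_resample)
        )
        # Add-one correction so the p-value is never exactly zero
        results[gene] = (observed, (hits + 1) / (n_resample + 1))
    return results
```

The same structure works with any other summary statistic or test in place of the mean; the p-values would still need multiple-testing correction across genes.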
You can certainly try both and see what is easiest to implement (my pipeline may not be the most user-friendly) and what gives you results that are interpretable and can be functionally validated.
Thanks for making this tool available. It was recommended to me by a colleague. I'm new to this kind of analysis and was just curious about the advantages and disadvantages of this tool compared to the MAGeCK software.
Thanks for the help.