nf-core / crisprseq

A pipeline for the analysis of CRISPR edited data. It allows the evaluation of the quality of gene editing experiments using targeted next generation sequencing (NGS) data (`targeted`) as well as the discovery of important genes from knock-out or activation CRISPR-Cas9 screens using CRISPR pooled DNA (`screening`).
https://nf-co.re/crisprseq
MIT License

Integrate Chronos for screen data #112

Open j-andrews7 opened 5 months ago

j-andrews7 commented 5 months ago

Description of feature

Chronos would be an enticing addition to this pipeline, given that the ever-expanding DepMap project also uses it.

LaurenceKuhl commented 4 months ago

Hi @j-andrews7, yes, thank you, it was in the backlog in my mind! Would you have a small script where you've used it yourself? Since I've never used it, that would help me start planning out what the inputs/variables are, etc. Thanks! Laurence

j-andrews7 commented 4 months ago

Honestly, their GitHub README lays it out more cleanly than I could, but I'll summarize here. In short, it really only needs three pandas DataFrames: readcounts, sequence_map, and guide_gene_map.

It is also recommended to provide a list of negative_control_sgrnas.
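
For anyone who has not seen these inputs before, here is a rough sketch of what toy versions might look like. The layout (samples as rows and sgRNAs as columns in the read count matrix, and the column names `sequence_ID`, `cell_line_name`, `pDNA_batch`, `days`, `sgrna`, `gene`) is my recollection of the Chronos README and should be checked against it. Every sample, guide, and gene name below is made up for illustration.

```python
import pandas as pd

# Toy read counts. Assumed layout: one row per sequenced sample
# (pDNA or late time point), one column per sgRNA.
readcounts = pd.DataFrame(
    {
        "sgRNA_A1": [500, 480, 20, 25],
        "sgRNA_A2": [450, 470, 60, 55],
        "sgRNA_ctrl1": [520, 510, 530, 500],
    },
    index=["pDNA_rep1", "pDNA_rep2", "cellX_rep1", "cellX_rep2"],
)

# One row per sequenced sample. Column names are assumed from the README;
# pDNA samples are assumed to be labelled "pDNA" in cell_line_name.
sequence_map = pd.DataFrame(
    {
        "sequence_ID": ["pDNA_rep1", "pDNA_rep2", "cellX_rep1", "cellX_rep2"],
        "cell_line_name": ["pDNA", "pDNA", "cellX", "cellX"],
        "pDNA_batch": ["batch1", "batch1", "batch1", "batch1"],
        "days": [0, 0, 21, 21],
    }
)

# Maps each targeting sgRNA to its gene (hypothetical gene name).
guide_gene_map = pd.DataFrame(
    {
        "sgrna": ["sgRNA_A1", "sgRNA_A2"],
        "gene": ["GENE_A", "GENE_A"],
    }
)

# Guides expected to have no effect on fitness (hypothetical name).
negative_control_sgrnas = ["sgRNA_ctrl1"]

# Basic consistency checks before handing anything to Chronos:
# every mapped sample and every listed guide should exist in readcounts.
missing_samples = set(sequence_map["sequence_ID"]) - set(readcounts.index)
missing_guides = (
    set(guide_gene_map["sgrna"]) | set(negative_control_sgrnas)
) - set(readcounts.columns)

print("samples missing from readcounts:", missing_samples)
print("guides missing from readcounts:", missing_guides)
```

In a pipeline these tables would presumably be read from a count matrix and a sample sheet rather than typed in by hand, so the two checks at the end are the part most likely to be worth keeping.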

Then running is just:

import chronos
# This removes clonal outgrowths that are seemingly unrelated to the perturbation
chronos.nan_outgrowths(readcounts=readcounts, sequence_map=sequence_map, guide_gene_map=guide_gene_map)

model = chronos.Chronos(
    readcounts={'my_library': readcounts},
    sequence_map={'my_library': sequence_map},
    guide_gene_map={'my_library': guide_gene_map},
    negative_control_sgrnas={'my_library': negative_control_sgrnas}
)

model.train()

model.save("my_save_directory")

# Actual outputs people may be interested in.
gene_effect = model.gene_effect
guide_efficacy = model.guide_efficacy

They have a vignette with a more comprehensive example.