d3b-center / pbta-splicing

Splicing analysis across the PBTA

CRISPR intersection with CLK1 targets #393

Closed naqvia closed 3 months ago

naqvia commented 3 months ago

Intersect CLK1 targets with CRISPR dependency information from the Childhood Cancer Model Atlas (CCMA).

naqvia commented 3 months ago

There are some duplicates within the same sample_id, and I'm not sure what that means, but I essentially wrote out the list of genes that were highly significant and dependent (z-score < 1.5). I am not sure how to plot this (if we do), but the following genes made the cutoff:

> unique(crispr_filt$gene) 
 [1] "BCL2L1" "DDB1"   "DNMT1"  "EED"    "EIF3I"  "ETF1"   "EZH2"   "FGFR1"  "HDAC2"  "NACA" 
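
A minimal sketch of the filtering step described above, in base R. The column names (`sample_id`, `gene`, `z`) and the toy values are assumptions for illustration, not the actual CCMA table. The cutoff is copied verbatim from the comment; its sign would need to match the direction in which the dependency z-scores are reported.

```r
# Toy stand-in for the CRISPR dependency table (hypothetical columns and values).
crispr <- data.frame(
  sample_id = c("S1", "S1", "S1", "S2", "S2"),
  gene      = c("GENE_A", "GENE_A", "GENE_B", "GENE_A", "GENE_C"),
  z         = c(-2.1, -2.1, 2.4, -1.8, 1.9),
  stringsAsFactors = FALSE
)

# Cutoff as written in the comment (assumption: lower z means more dependent).
z_cutoff <- 1.5

# Collapse rows duplicated within the same sample_id/gene pair (first row kept),
# then keep the rows passing the cutoff.
crispr_dedup <- crispr[!duplicated(crispr[, c("sample_id", "gene")]), ]
crispr_filt  <- crispr_dedup[crispr_dedup$z < z_cutoff, ]

# Genes that pass in at least one sample.
dep_genes <- unique(crispr_filt$gene)
print(dep_genes)
```

Keeping the first row of each duplicated pair is only safe if the duplicates carry the same score; if they differ, summarizing per sample_id/gene (for example with the mean) may be a better choice.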

naqvia commented 3 months ago

To summarize the past few commits: I found a bug in the previous script (09). For expression, our negative fold-change filter was inaccurate, but that bug is now fixed. Originally, we were getting all significantly differentially expressed genes (adj. p-val < 0.05) regardless of fold-change. Now, with filters on both fold-change and adj. p-val, we do NOT get much DES vs DEX overlap. That prompted us to intersect genes that are just DEX (regardless of splicing), since we want to query CCMA gene dependencies, and we identify SRC and JUN. Of these, SRC is up in CLK1 high exon 4 cells, which suggests CLK1 exon 4 splicing is promoting up-regulation of SRC. I also redid the Venn diagram to include DEX, CRISPR, and functional cancer DES.

cc @jharenza
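
For reference, a hedged sketch of a two-sided expression filter of the kind described above, in base R. The column names (`gene`, `log2FC`, `padj`), the placeholder gene names, and the fold-change cutoff are assumptions; only the adjusted p-value cutoff of 0.05 comes from the thread, and this is not the code from script 09.

```r
# Toy differential expression results (hypothetical columns and values).
de <- data.frame(
  gene   = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  log2FC = c(1.8, -1.6, -0.2, 2.5),
  padj   = c(0.001, 0.01, 0.0001, 0.20),
  stringsAsFactors = FALSE
)

padj_cutoff   <- 0.05  # adjusted p-value cutoff mentioned in the thread
log2fc_cutoff <- 1     # assumed fold-change cutoff, not stated in the thread

# Apply the fold-change filter symmetrically so down-regulated genes are
# held to the same threshold as up-regulated ones.
is_sig <- de$padj < padj_cutoff & abs(de$log2FC) >= log2fc_cutoff
dex <- de[is_sig, ]
dex$direction <- ifelse(dex$log2FC > 0, "up", "down")
print(dex)
```

Using `abs()` on the fold change avoids writing separate conditions for the positive and negative directions, which is one place a sign error can creep in.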

jharenza commented 3 months ago

@naqvia this is completely updated and rerun now after our huddle. Feel free to rerun the module, make sure you are getting the same results, then merge. I will work on updating the figures for the MS.

naqvia commented 3 months ago

Ok, I made some minor changes (comments, printed an extra column for sign-genes indicating preference). I changed the bedtools script to also filter on p-value for functional sites (because in the other scripts, we filter on p-val AND FDR). I was a little confused, but I think we're good now. So, if it's a positive dPSI, it should be included in high exon 4. Checking CLK1 as a sanity check:

> psi_comb%>% filter(gene=="CLK1")
# A tibble: 3 × 7
  SpliceID                  dPSI Uniprot      Type  gene  Preference Uniprot_wrapped
  <chr>                    <dbl> <chr>        <chr> <chr> <chr>      <chr>          
1 CLK1:200860124-200860215 0.525 Modification SE    CLK1  Inclusion  Modification

So this means that in high exon 4 cells (untreated) it is more included (dPSI = 0.525, i.e. 52.5%). I think that's how we want to frame/visualize/explain things. If so, I made those changes. I reviewed every script in this module! It also runs to completion and all plots look identical!
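
A small sketch of how the sign of dPSI could map to the `Preference` column, in base R. The direction of the subtraction and the "Skipping" label are assumptions; the single row is modeled on the CLK1 output above.

```r
# One row modeled on the CLK1 sanity check in the thread.
psi <- data.frame(
  SpliceID = "CLK1:200860124-200860215",
  dPSI     = 0.525,
  gene     = "CLK1",
  stringsAsFactors = FALSE
)

# Assumption: dPSI is computed as (high exon 4) minus the comparison group,
# so a positive value means more inclusion in high exon 4 cells.
# "Skipping" is an assumed label for the negative case.
psi$Preference <- ifelse(psi$dPSI > 0, "Inclusion", "Skipping")
print(psi[, c("gene", "dPSI", "Preference")])
```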