williamritchie / IRFinder

Detecting intron retention from RNA-Seq experiments

DESeqDataSetFromIRFinder error #149

Open ojziff opened 3 years ago

ojziff commented 3 years ago

Hi @dg520

I ran into a couple of errors in the column naming of the DESeqDataSetFromIRFinder function (lines 31-35 of https://github.com/williamritchie/IRFinder/blob/master/bin/DESeq2Constructor.R), so I rewrote it with the tidyverse, which may be of use to others. read_tsv is also much faster than read.table.

library(tidyverse)  # read_tsv, map, bind_rows, mutate, pivot_wider, column_to_rownames, drop_na
library(DESeq2)     # DESeqDataSetFromMatrix, sizeFactors

DESeqDataSetFromIRFinder = function(filePaths, designMatrix, designFormula){
  # Read each IRFinder result file and label it with its sample name
  irfinder.tsv = filePaths %>% map(read_tsv)
  names(irfinder.tsv) = designMatrix$SampleNames
  # Stack all samples into one long table; round depths to whole counts for DESeq2
  irtab = bind_rows(irfinder.tsv, .id = "SampleNames") %>%
    mutate(IntronDepth = round(IntronDepth), SpliceExact = round(SpliceExact),
           MaxSplice = round(pmax(SpliceLeft, SpliceRight)),
           irnames = paste0(Name, "/", Chr, ":", Start, "-", End, ":", Strand))
  # One wide table per measure: introns as row names, one column per sample (id_cols is named explicitly)
  IntronDepth = irtab %>% pivot_wider(id_cols = irnames, names_from = SampleNames, values_from = IntronDepth, names_glue = "intronDepth.{SampleNames}") %>% column_to_rownames("irnames")
  SpliceExact = irtab %>% pivot_wider(id_cols = irnames, names_from = SampleNames, values_from = SpliceExact, names_glue = "totalSplice.{SampleNames}") %>% column_to_rownames("irnames")
  MaxSplice = irtab %>% pivot_wider(id_cols = irnames, names_from = SampleNames, values_from = MaxSplice, names_glue = "maxSplice.{SampleNames}") %>% column_to_rownames("irnames")
  # Duplicate the design: first block of rows for the IR columns, second block for the Splice columns
  group = bind_rows(designMatrix, designMatrix) %>% mutate(IRFinder = factor(c(rep("IR", nrow(designMatrix)), rep("Splice", nrow(designMatrix))), levels = c("Splice", "IR")))
  # Intron depth and max splice counts side by side; keep only introns with values in every sample
  counts.IRFinder = bind_cols(IntronDepth, MaxSplice) %>% drop_na()

  # Build the DESeq2 object and fix all size factors at 1
  dd = DESeqDataSetFromMatrix(countData = counts.IRFinder, colData = group, design = designFormula)
  sizeFactors(dd) = rep(1, nrow(group))
  final = list(dd, IntronDepth, SpliceExact, MaxSplice)
  names(final) = c("DESeq2Object", "IntronDepth", "SpliceDepth", "MaxSplice")
  return(final)
}
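
In case it helps to see what the pivot_wider step is doing, here is a minimal sketch on made-up data. The two sample names, the two intron names and the depth values are all hypothetical; only the column names (SampleNames, irnames, IntronDepth) and the names_glue pattern come from the function above. It reshapes the long per-sample table into one column per sample, with introns as row names.

```r
library(tidyr)   # pivot_wider (also re-exports %>%)
library(tibble)  # tibble, column_to_rownames

# Hypothetical long-format table: one row per intron per sample
irtab <- tibble(
  SampleNames = c("S1", "S1", "S2", "S2"),
  irnames     = c("geneA/chr1:100-200:+", "geneB/chr2:300-400:-",
                  "geneA/chr1:100-200:+", "geneB/chr2:300-400:-"),
  IntronDepth = c(5, 0, 7, 2)
)

# Wide format: one row per intron, one "intronDepth.<sample>" column per sample
IntronDepth <- irtab %>%
  pivot_wider(id_cols = irnames, names_from = SampleNames,
              values_from = IntronDepth,
              names_glue = "intronDepth.{SampleNames}") %>%
  column_to_rownames("irnames")

print(IntronDepth)
```

Naming id_cols explicitly rather than passing irnames positionally is a choice made for this sketch, so that it does not depend on the argument order of pivot_wider.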

Best wishes, Oliver