williamritchie / IRFinder

Detecting intron retention from RNA-Seq experiments

differential IR analysis #173

Open Ananya1006 opened 1 year ago

Ananya1006 commented 1 year ago

Hey! First of all, thanks for the amazing work on IRFinder. I have used the tool and successfully obtained the IRFinder output files. I want to run a differential intron expression analysis on my dataset, which has three conditions of mice (fed, fast and refed) with two replicates each. I created the files as described in the manual, but I am getting the error below. Could you look into what might be wrong here?

metaList=DESeqDataSetFromIRFinder(filePaths=paths, designMatrix=experiment, designFormula=~1)
[1] "processing file 1 at C:/Users/itsan/OneDrive/Desktop/Irfinder/SRR23565903_fed_irfinder/IRFinder-IR-dir.txt"
[1] "processing file 2 at C:/Users/itsan/OneDrive/Desktop/Irfinder/SRR23565905_fed_irfinder/IRFinder-IR-dir.txt"
[1] "processing file 3 at C:/Users/itsan/OneDrive/Desktop/Irfinder/SRR23565909_refed_irfinder/IRFinder-IR-dir.txt"
[1] "processing file 4 at C:/Users/itsan/OneDrive/Desktop/Irfinder/SRR23565910_refed_irfinder/IRFinder-IR-dir.txt"
converting counts to integer mode
dds = metaList$DESeq2Object
colData(dds)
DataFrame with 8 rows and 4 columns
                        SampleNames Condition IRFinder sizeFactor
intronDepth.SRR23565903 SRR23565903        NA       IR          1
intronDepth.SRR23565905 SRR23565905        NA       IR          1
intronDepth.SRR23565909 SRR23565909        NA       IR          1
intronDepth.SRR23565910 SRR23565910        NA       IR          1
maxSplice.SRR23565903   SRR23565903        NA   Splice          1
maxSplice.SRR23565905   SRR23565905        NA   Splice          1
maxSplice.SRR23565909   SRR23565909        NA   Splice          1
maxSplice.SRR23565910   SRR23565910        NA   Splice          1
design(dds) = ~ Condition + Condition:IRFinder
dds = DESeq(dds)
Error in designAndArgChecker(object, betaPrior) : full model matrix is less than full rank
resultsNames(dds)
character(0)

dg520 commented 1 year ago

@Ananya1006 There are two layers of problems/questions to think about.

The first layer is about coding and data integrity, and I can see at least the following two issues:
a) Missing data: you said you had three conditions, each with two replicates. That means your design matrix experiment should have six rows, and colData(dds) should in turn have 12 rows (one intronDepth row and one maxSplice row per sample). That was not the case: colData(dds) has 8 rows and only four files were processed (two fed, two refed), so your design matrix experiment seems to have only four rows and the fast samples are missing.
b) Values read wrongly into R: the "Condition" column of your colData(dds) was not built correctly and contains only NAs. Any model that uses "Condition" as an independent variable will fail on this. Please check whether the "Condition" column was read correctly in your design matrix experiment in the first place.
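
Both checks can be run before calling DESeqDataSetFromIRFinder. The following is a minimal sketch in base R, not part of IRFinder: the helper check_design, the sample names and the file paths are all hypothetical placeholders. It also shows one common way an all-NA factor arises in R, namely calling factor() with levels that do not match the values in the column; whether that is what happened here cannot be told from the output above.

```r
# Hypothetical design table for the six-sample layout described above.
# Sample names and paths are placeholders, not real accessions or files.
experiment <- data.frame(
  SampleNames = c("fed_1", "fed_2", "fast_1", "fast_2", "refed_1", "refed_2"),
  Condition   = c("fed", "fed", "fast", "fast", "refed", "refed"),
  stringsAsFactors = FALSE
)
paths <- file.path("results", experiment$SampleNames, "IRFinder-IR-dir.txt")

# Hypothetical helper: returns a character vector of problems (empty if none).
check_design <- function(experiment, paths, expected_n) {
  problems <- character(0)
  if (nrow(experiment) != expected_n) {
    problems <- c(problems,
                  sprintf("expected %d samples, found %d",
                          expected_n, nrow(experiment)))
  }
  if (length(paths) != nrow(experiment)) {
    problems <- c(problems, "number of file paths differs from design rows")
  }
  if (!"Condition" %in% colnames(experiment)) {
    problems <- c(problems, "no column named 'Condition'")
  } else if (anyNA(experiment$Condition)) {
    problems <- c(problems, "'Condition' contains NA values")
  }
  problems
}

# Levels that do not occur in the data turn every entry into NA.
mismatched <- factor(experiment$Condition, levels = c("WT", "KO"))
all(is.na(mismatched))  # TRUE

# Levels that match the data keep the values intact.
experiment$Condition <- factor(experiment$Condition,
                               levels = c("fed", "fast", "refed"))
check_design(experiment, paths, expected_n = 6)  # character(0)
```

If check_design returns anything other than an empty vector, fixing the input files first is likely to be more productive than changing the design formula.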

The second layer is about statistics. Two replicates per condition are too few to robustly fit a linear model for IR. You can still try it, but I would strongly recommend an alternative statistical test. IRFinder includes the Audic-Claverie test for this situation, so you may want to start from there.
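
For orientation, the Audic-Claverie test in its general published form compares a count x observed in one library of size N_1 with a count y observed in a second library of size N_2, under the null hypothesis that both come from the same underlying rate. How IRFinder maps intron and splice counts onto these symbols is not stated in this thread, so the symbols below are generic and should be read as an assumption-laden sketch:

```latex
p(y \mid x) \;=\; \left(\frac{N_2}{N_1}\right)^{y}
\frac{(x+y)!}{x!\,y!\,\left(1+\dfrac{N_2}{N_1}\right)^{x+y+1}}
```

Because the probability is computed from a single pair of counts, the test does not need replicates to estimate variance, which is why it is a natural starting point when replicates are scarce.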

Feel free to let me know if you have further questions.