hartleys / JunctionSeq

The JunctionSeq R package is a powerful tool for testing and visualizing differential usage of exons and splice junctions in next-generation, high-throughput RNA-Seq experiments.
27 stars 16 forks source link

weird dispersion plot #54

Open scseekers opened 3 years ago

scseekers commented 3 years ago

Hi @hartleys

I have RNASeq data for 4 tissue types A,B,C,D. I wish to perform a pairwise comparison of these samples (A-B, B-C, C-D, B-D, A-D) and therefore used the following code:

library(JunctionSeq)
gtf.file <- "./withNovel.forJunctionSeq.gff.gz"
samplesheet <- read.csv("samplesheet.csv", header = T, stringsAsFactors=FALSE)
jscs <- runJunctionSeqAnalyses(sample.files = countFiles,
                               sample.names = samplesheet$sample.names,
                               condition=factor(samplesheet$condition),
                               flat.gff.file = gtf.file,
                               nCores = 12,
                               analysis.type = "junctionsAndExons"
)

Also, in results, I get LFC values of merely A-B, A-C, A-D in results. Is there a way to set constrasts like in DESeq2. However, I get a weird Dispersion plot. What could be the possible reason? image

When I run two samples at a time like (B-D) or say (A-B) the exons and junctions follows the trend line: image

scseekers commented 3 years ago

Well! I also observed one weird thing. On comparing all samples together (A, B, C, D), I get a comparison (AvsB, AvsC, AvsD), and a single "genewiseResults.txt.gz" containing 213 Differentially Used Features (exons+junctions) at FDR of 0.01. However, when I run the Junctionseq to achieve, all vs all comparison using a FOR loop as shown below, I don't get any significant results at FDR of 0.05 or 0.01. Why such a discrepancy? My samples_dxd looks like (.csv file) Arep1,A Arep2,A Arep3,A Brep1,B Brep2,B Brep3,B Crep1,C Crep2,C Crep3,C Drep1,D Drep2,D Drep3,D

and my jseq_group looks like (.csv file) pair,group1,group2 A_B,A,B A_C,A,B C_B,C,B C_D,C,D B_D,B,D

for (i in 1:nrow(jseq_group)) {
samples_dxd <- subset(samplesheet, grepl(jseq_group$group1[i],samplesheet$condition) | grepl(jseq_group$group2[i], samplesheet$condition))
countFiles <- paste0("F:/",samples_dxd$sample.names,"/QC.spliceJunctionAndExonCounts.withNovel.forJunctionSeq.txt.gz")
jscs <- runJunctionSeqAnalyses(sample.files = countFiles,
                               sample.names = samples_dxd$sample.names,
                               condition=factor(samples_dxd$condition),
                               flat.gff.file = gtf.file,
                               nCores = 1,
                               analysis.type = "junctionsAndExons")
save(jscs, file = paste0(jseq_group$Pair[i],".Rdata"))
rm(jscs)
}

for (i in 1:nrow(jseq_group)) {
load(paste0(jseq_group$Pair[i],".Rdata"))
     writeCompleteResults(jscs,
                     outfile.prefix=jseq_group$Pair[i],
                     save.jscs = TRUE, FDR.threshold = 0.05)
     rm(jscs)
}