alexchwong / SpliceWiz

SpliceWiz is an R package for exploring differential alternative splicing events in splice-aware alignment BAM files.
Other
13 stars 8 forks source link

Performing differential analysis #32

Closed rezarahman12 closed 1 year ago

rezarahman12 commented 1 year ago

Dear Alex, Happy New Year 2023.

I have performed differential alternative splicing analysis using ASE_limma in SpliceWiz. However, the output table gives only intron retention (IR) results as the event type. I have given my code below for your kind consideration. I'll appreciate if you kindly take a look and suggest what I'm missing here. Thanks a lot. Best wishes, Reza

load required library

library(NxtIRFdata) library(SpliceWiz) library(limma)

vignette("SW_QuickStart", package = "SpliceWiz")

Building the SpliceWiz reference

gtf <- "gtf_fasta/Mus_musculus.GRCm39.108.gtf" genome <- "gtf_fasta/Mus_musculus.GRCm39.dna.primary_assembly.fa"

ref_path <-"ref_mg39"

buildRef( reference_path = ref_path, fasta = genome, gtf = gtf )

Process BAM files using SpliceWiz

bams <-findBAMS("bam_files/", level = 0)

pb_path <- "pb"

processBAM( bamfiles = bams$path, sample_names = bams$sample, reference_path = ref_path, output_path = pb_path, n_threads = 4 )

Collate the experiment

expr <- findSpliceWizOutput(pb_path)

nxtse_path <-"NxtSE_output"

collateData( Experiment = expr, reference_path = ref_path, output_path = nxtse_path

)

Importing the experiment

se <- makeSE(nxtse_path)

subset_samples <- colnames(se)[c(3,7,11,1,5,9)]

df <- data.frame(sample = subset_samples)

se_small <- makeSE(nxtse_path, colData = df, RemoveOverlapping = TRUE)

Differential analysis

colData(se)$condition <- rep(c("WT_ctr", "KD_ctr"), each = 3)

se.filtered <- se[applyFilters(se),]

Performing differential analysis

require("limma") res_limma <- ASE_limma( se = se.filtered, test_factor = "condition", test_nom = "WT_ctr", test_denom = "KD_ctr", )

write.table(res_limma,"splicedwiz_WT_ctr_KD_ctr_all.txt",quote=F,sep="\t",col.names = T,row.names = F)

alexchwong commented 1 year ago

hi Reza,

I'm not sure I understand what you are doing with se_small.

What is the output of table(rowData(se)$EventType) ? You should see a healthy mix of all alternative splicing types. If not, this indicates (for some reason) the genome doesn't have any non-IR alternative splicing events. You can verify this using viewASE(reference_path)

If you only get IR for table(rowData(se.filtered)$EventType) it means your filtering step has removed all the non-IR events for some reason.

rezarahman12 commented 1 year ago

Hi Alex Thank you so much for your quick response. I removed se_small from the script now.

I checked table(rowData(se)$EventType), and yes it has plenty mix of AS events. When I check table(rowData(se.filtered)$EventType), I also found mix of different AS events.

I have included the screenshot of the above two functions for your consideration. However, after differential analysis, the output table has only IR events, 125 rows only. Could you please advise me is it okay? image

alexchwong commented 1 year ago

I'm not certain why all non-IR events are removed. It may be possible that ASE_limma is removing events for which only 1 isoform is expressed (i.e. there is no alternative splicing happening).

I suggest you re-run the analysis using the unfiltered NxtSE object via edgeR:

require(edgeR)
res_edgeR <- ASE_edgeR(
se = se,
test_factor = "condition",
test_nom = "WT_ctr",
test_denom = "KD_ctr",
)

Please use the latest devel version of SpliceWiz - install this either from Bioconductor (it won't let you unless you are using Bioc devel in R 4.3) or from the github repo (main branch)

devtools::install_github("alexchwong/SpliceWiz")
rezarahman12 commented 1 year ago

Thanks very much, Alex. It worked for me.