MauerLab / DJExpress

Normalisation before analysis? #1

Open Cortalak opened 2 years ago

Cortalak commented 2 years ago

Hi, thanks for the wonderful workflow. I normally use leafcutter for junction analysis, but this package complements it really well.

I just wanted to ask whether the junction counts should be normalized beforehand with, let's say, a simple library-size normalisation, or whether the package normalises internally in one of its steps?

Thank you for your help

Cheers

Akira

MauerLab commented 2 years ago

Hi Akira,

My apologies for the late response; for some reason we missed the notification of your message. Regarding your question: during step 4 (Test for Differential Junction Expression using DJEanalyze()), the function uses limma-based statistical methods to normalize the junction read counts, as explained in the methods of our publication:

DJExpress first tests for differential expression of genomic features (here splice junction regions) using an initial input matrix of read count values as rows and sample ids as columns. Count data is then transformed to log2-counts per million (logCPM), and observation-level weights based on mean-variance relationship are computed (using the voom function from limma). Users can decide at this point whether to keep the default expression threshold for filtering junctions prior to hypothesis testing (10 minimum of read count mean per junction) or to adjust the threshold based on the mean-variance trend. A linear model is then fit per junction using a provided experimental design, and empirical Bayes moderated t-statistics are implemented to assess the significance level of the observed expression changes.

As such, you don't need to apply any prior normalization: it happens internally when you run the DJEanalyze() function on the prep.out output object from DJEprepare().
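To illustrate why raw counts are enough, here is a minimal sketch of the logCPM transformation that limma's voom applies, written in plain Python rather than R so it can be read without the package installed. It is not DJExpress code: the function name `voom_style_log_cpm` and the toy matrix are made up for illustration, and it assumes library sizes are simply the column sums of the count matrix with no extra normalization factors.

```python
import math

def voom_style_log_cpm(counts):
    """Hypothetical helper: log2-counts per million in the style of limma's voom.

    counts: list of rows (one per junction), each a list of raw read counts
    (one per sample). Library size is assumed to be the column sum.
    """
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    # voom offsets counts by 0.5 and library sizes by 1 to avoid log2(0)
    return [
        [math.log2((row[j] + 0.5) / (lib_sizes[j] + 1.0) * 1e6)
         for j in range(n_samples)]
        for row in counts
    ]

# Toy data: sample 2 was sequenced twice as deeply as sample 1
raw = [
    [10, 20],
    [30, 60],
    [60, 120],
]
log_cpm = voom_style_log_cpm(raw)

# After the transformation the two samples are nearly identical per junction
for row in log_cpm:
    print([round(v, 3) for v in row])
```

Because every count is divided by its sample's library size, the twofold difference in sequencing depth between the two toy samples largely disappears, which is the effect a separate library-size normalisation would otherwise be used for.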

Cheers,

Lina.