jpromeror / EventPointer

R package for the identification and statistical analysis of alternative splicing events using junction arrays or RNASeq data

Question about multiple contrasts #2

Closed mforde84 closed 7 years ago

mforde84 commented 7 years ago

Hi Juan,

I'm trying to figure out how to set up multiple contrasts. For instance, say I have 4 different tissue types, and I'd like to compare 1vsOthers and all 1vs1 groups.

So given:

dmatrix<-cbind( Group1=c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Group2=c(0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0), Group3=c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0), Group4=c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1))

In this instance there isn't an intercept, and I want to contrast something like 1vsOthers. For example, would Group1 - (Group2+Group3+Group4)/3 be equal to the following contrast matrix:

Cmatrix<-t(t(c(1,-0.33,-0.33,-0.33)))

Also I would like to do all pairwise contrast for 1vs1 (eg. Group1 - Group2) as well. Would this be equal to:

Cmatrix<-t(t(c(1,-1,0,0)))

I tested what I thought might be the intercept for each group:

Cmatrix<-t(t(c(1,0,0,0)))

However, the results are not significant but should be, since all of the data points are above a mean of 0. So my thought is that this isn't the intercept, and I'm not entirely sure that the input for your wrapper is consistent with the style of input for contrasts.fit().

Again any guidance on this is much appreciated,

Marty

jpromeror commented 7 years ago

Hello Martin,

The algorithm can handle multiple contrasts. If you give a contrast matrix with more than one column as input, the output is a list in which each element corresponds to one column of that matrix.
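
As a minimal sketch of what such a multi-column contrast matrix could look like for the four-group design in the question, the following base R builds one column per one-vs-others contrast and one per pairwise contrast. The column names are illustrative only, and exact -1/3 weights are used instead of -0.33 so that every column sums to zero.

```r
# Group labels matching the columns of the design matrix in the question
grp <- paste0("Group", 1:4)

# One-vs-others: weight 1 for the group of interest, -1/3 for each other group
# (diag(4) * 4/3 - 1/3 gives 1 on the diagonal and -1/3 elsewhere)
one_vs_rest <- diag(4) * (4 / 3) - 1 / 3
dimnames(one_vs_rest) <- list(grp, paste0(grp, "_vs_rest"))

# All pairwise contrasts: +1 for the first group of the pair, -1 for the second
pairs <- combn(4, 2)
pairwise <- apply(pairs, 2, function(p) {
  v <- numeric(4)
  v[p[1]] <- 1
  v[p[2]] <- -1
  v
})
dimnames(pairwise) <- list(grp, paste0(grp[pairs[1, ]], "_vs_", grp[pairs[2, ]]))

# 4 rows (groups) x 10 columns (4 one-vs-rest + 6 pairwise)
Cmatrix <- cbind(one_vs_rest, pairwise)
```

Given the behaviour described above, passing a matrix like this as the contrast should give back a list with one element per column, in the same column order.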

Regarding the contrast Cmatrix<-t(t(c(1,0,0,0))): just to be sure, you are using the standard statistical test, right? I will look into this issue further. Thanks for the comments.

Juan Pablo

jpromeror commented 7 years ago

Martin,

The reason the contrast t(t(c(1,0,0,0))) isn't significant lies in the statistical testing EventPointer does. I will try to explain what I mean.

In a simple linear model, with the Design and Contrast matrices you provide, the contrast t(t(c(1,0,0,0))) just tests whether the expression in Group 1 is different from zero. As you state, every value is positive and above a mean of 0, so the results should be significant. However, EventPointer applies a series of transformations before performing the statistical tests.

What EventPointer tests is that the isoform mapped to Path 1 and the isoform mapped to Path 2 both have a large fold change AND that the changes are in opposite directions; this way we can say that there is differential splicing between conditions.

After transforming the contrast you provide, what we are internally testing is the expression of Path 1 in Group 1 and the expression of Path 2 in Group 1. Again, both values are > 0, and I've tried this out: the individual results are significant. However, when we summarize the p-values from both paths we require that they have opposite directions, so since both have positive values, the summarization yields a non-significant p-value, meaning that there is no alternative splicing.
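
A toy illustration of that point, using simulated numbers and plain t-tests rather than EventPointer's actual statistic: when both paths have positive expression in Group 1, each one is individually significant against zero, but the two estimates point in the same direction, which is what rules out a splicing call. All values and variable names below are made up for illustration.

```r
set.seed(1)

# Simulated expression of each path in the 4 samples of Group 1
# (hypothetical values, not real data)
path1 <- rnorm(4, mean = 8, sd = 0.5)
path2 <- rnorm(4, mean = 7, sd = 0.5)

# Each path on its own is clearly different from zero
p1 <- t.test(path1, mu = 0)$p.value
p2 <- t.test(path2, mu = 0)$p.value

# Both effects share the same sign, so they are not in opposite directions
same_direction <- sign(mean(path1)) == sign(mean(path2))
```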

I would recommend reading the EventPointer paper, http://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-016-2816-x, where the statistical testing is described, so you can get a better understanding.

I hope this clears up any doubts.

Juan Pablo

mforde84 commented 7 years ago

Great. Thank you for clarifying that issue for me. I should have noticed the answer in the paper, so sorry for the silly question.

Now if I understand correctly, to generate the design and contrast matrices we can essentially use the same methodology used for limma, and the issue I was having is just a misunderstanding of the actual modelling. So in theory the following would perform all of the planned contrasts for (Grp vs All others):


groups <- c(rep("liver",4),rep("muscle",4),rep("spleen",4),rep("testes",4))
Dmatrix <- model.matrix(~0 + groups)
colnames(Dmatrix) <- c("liver","muscle","spleen","testes")
Cmatrix <- makeContrasts(
    liver=liver-(muscle+spleen+testes)/3, 
    muscle=muscle-(liver+spleen+testes)/3, 
    spleen=spleen-(muscle+liver+testes)/3,
    testes=testes-(muscle+spleen+liver)/3,
    levels=Dmatrix)
Events<-EventPointer(Design=Dmatrix,
    Contrast=Cmatrix,
    ExFit=ExFit,
    Eventstxt=EventsFound,
    Filter=FALSE,
    Qn=0.25,
    Statistic="LogFC",
    PSI=TRUE)
coef1_liver = Events[1] 
coef2_muscle = Events[2] 
coef3_spleen = Events[3] 
coef4_testes = Events[4]

jpromeror commented 7 years ago

Yes, the design and contrast matrices are done in exactly the same way as in limma. In fact, internally EventPointer uses limma to perform the analysis.

The code you provide will perform the analysis for the 4 contrasts. Just take into account that the output is a list, so you access the different elements like this:

coef1_liver = Events[[1]] 
coef2_muscle = Events[[2]] 
coef3_spleen = Events[[3]] 
coef4_testes = Events[[4]] 

Use double brackets to access the individual elements.
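
To show the difference between the two kinds of brackets, here is a small base R sketch. The `Events` object below is only a stand-in; the assumption that each element is a data frame, and the `Pvalue` column name, are placeholders rather than the documented output format.

```r
# Stand-in for the EventPointer output: a list with one element per contrast
# (placeholder contents, not real results)
Events <- list(
  data.frame(Pvalue = c(0.01, 0.20)),
  data.frame(Pvalue = c(0.50, 0.03))
)

sub_list <- Events[1]    # single brackets: still a list, of length 1
element  <- Events[[1]]  # double brackets: the element itself

class(sub_list)  # "list"
class(element)   # "data.frame"
```

With single brackets, anything that expects the result table directly (for example `head()` on its rows or `$Pvalue`) would be working on a one-element list instead.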

Thanks for the comments!

I hope the algorithm helps you in the projects you are doing.

mforde84 commented 7 years ago

Thanks!