Open abhisheksinghnl opened 8 years ago
You're interested in identifying features which are over-expressed at particular time points relative to the other time points?
Yes, indeed.
On Tue, Sep 20, 2016 at 1:05 PM, Ernest Turro notifications@github.com wrote:
You're interested in identifying features which are over-expressed at particular time points relative to the other time points?
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/eturro/mmseq/issues/17#issuecomment-248270595, or mute the thread https://github.com/notifications/unsubscribe-auth/AUAA575OO_QLi7zodT97ICOFbZLzE-P2ks5qr73ggaJpZM4KBa0d .
OK, in that case what I suggest is that you compare each time point to all the others in a standard differential expression analysis and then do a polytomous model comparison. To simplify, suppose you had t1, t2, t3 with 3 reps in each. You would have to create three different matrices files:
## matrices_file1:
# M; no. of rows = no. of observations
1 0 0
1 0 0
1 0 0
0 1 0
0 1 0
0 1 0
0 0 1
0 0 1
0 0 1
# C; no. of rows = no. of observations and no. of columns = 2 (one for each model)
0 0
0 0
0 0
0 1
0 1
0 1
0 1
0 1
0 1
# P0(collapsed); no. of rows = no. of classes for model 0
1
# P1(collapsed); no. of rows = no. of classes for model 1
.5
-.5
## matrices_file2:
# M; no. of rows = no. of observations
1 0 0
1 0 0
1 0 0
0 1 0
0 1 0
0 1 0
0 0 1
0 0 1
0 0 1
# C; no. of rows = no. of observations and no. of columns = 2 (one for each model)
0 1
0 1
0 1
0 0
0 0
0 0
0 1
0 1
0 1
# P0(collapsed); no. of rows = no. of classes for model 0
1
# P1(collapsed); no. of rows = no. of classes for model 1
.5
-.5
## matrices_file3:
# M; no. of rows = no. of observations
1 0 0
1 0 0
1 0 0
0 1 0
0 1 0
0 1 0
0 0 1
0 0 1
0 0 1
# C; no. of rows = no. of observations and no. of columns = 2 (one for each model)
0 1
0 1
0 1
0 1
0 1
0 1
0 0
0 0
0 0
# P0(collapsed); no. of rows = no. of classes for model 0
1
# P1(collapsed); no. of rows = no. of classes for model 1
.5
-.5
You would run mmdiff three times.
mmdiff -m matrices_file1 cond1_rep1.mmseq cond1_rep2.mmseq cond1_rep3.mmseq cond2_rep1.mmseq cond2_rep2.mmseq cond2_rep3.mmseq cond3_rep1.mmseq cond3_rep2.mmseq cond3_rep3.mmseq > cond1vsOthers.mmdiff
mmdiff -m matrices_file2 cond1_rep1.mmseq cond1_rep2.mmseq cond1_rep3.mmseq cond2_rep1.mmseq cond2_rep2.mmseq cond2_rep3.mmseq cond3_rep1.mmseq cond3_rep2.mmseq cond3_rep3.mmseq > cond2vsOthers.mmdiff
mmdiff -m matrices_file3 cond1_rep1.mmseq cond1_rep2.mmseq cond1_rep3.mmseq cond2_rep1.mmseq cond2_rep2.mmseq cond2_rep3.mmseq cond3_rep1.mmseq cond3_rep2.mmseq cond3_rep3.mmseq > cond3vsOthers.mmdiff
Once you're done, you can do a polytomous analysis in R:
source("/path/to/mmseq/src/mmseq.R")
res=polyclass(files=c("cond1vsOthers.mmdiff", "cond2vsOthers.mmdiff","cond3vsOthers.mmdiff"), prior=c(0.7,0.1,0.1,0.1)) # this would set the baseline model prior probability to 0.7 and that of the other models to 0.1
Now the res
object will contain everything you need, including the posterior probability for each model and the estimated differential expression "eta" parameters (to distinguish between up and down-regulation at each time point).
Hi,
Thank you, I will try to adopt these for 7 time points.
May I bother you in future incase I run into trouble. :D
Thank you
On Tue, Sep 20, 2016 at 1:16 PM, Ernest Turro notifications@github.com wrote:
OK, in that case what I suggest is that you compare each time point to all the others in a standard differential expression analysis and then do a polytomous model comparison. To simplify, suppose you had t1, t2, t3 with 3 reps in each. You would have to create three different matrices files: matrices_file1: M; no. of rows = no. of observations
1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 C; no. of rows = no. of observations and no. of columns = 2 (one for each model)
0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 1 P0(collapsed); no. of rows = no. of classes for model 0
1 P1(collapsed); no. of rows = no. of classes for model 1
.5 -.5 matrices_file2: M; no. of rows = no. of observations
1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 C; no. of rows = no. of observations and no. of columns = 2 (one for each model)
0 1 0 1 0 1 0 0 0 0 0 0 0 1 0 1 0 1 P0(collapsed); no. of rows = no. of classes for model 0
1 P1(collapsed); no. of rows = no. of classes for model 1
.5 -.5 matrices_file3: M; no. of rows = no. of observations
1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 C; no. of rows = no. of observations and no. of columns = 2 (one for each model)
0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 P0(collapsed); no. of rows = no. of classes for model 0
1 P1(collapsed); no. of rows = no. of classes for model 1
.5 -.5
You would run mmdiff three times.
mmdiff -m matrices_file1 cond1_rep1.mmseq cond1_rep2.mmseq cond1_rep3.mmseq cond2_rep1.mmseq cond2_rep2.mmseq cond2_rep3.mmseq cond3_rep1.mmseq cond3_rep2.mmseq cond3_rep3.mmseq > cond1vsOthers.mmdiff
mmdiff -m matrices_file2 cond1_rep1.mmseq cond1_rep2.mmseq cond1_rep3.mmseq cond2_rep1.mmseq cond2_rep2.mmseq cond2_rep3.mmseq cond3_rep1.mmseq cond3_rep2.mmseq cond3_rep3.mmseq > cond2vsOthers.mmdiff
mmdiff -m matrices_file3 cond1_rep1.mmseq cond1_rep2.mmseq cond1_rep3.mmseq cond2_rep1.mmseq cond2_rep2.mmseq cond2_rep3.mmseq cond3_rep1.mmseq cond3_rep2.mmseq cond3_rep3.mmseq > cond3vsOthers.mmdiff
Once you're done, you can do a polytomous analysis in R:
source("/path/to/mmseq/src/mmseq.R")
res=polyclass(files=c("cond1vsOthers.mmdiff", "cond2vsOthers.mmdiff","cond3vsOthers.mmdiff"), prior=c(0.7,0.1,0.1,0.1)) # this would set the baseline model prior probability to 0.7 and that of the other models to 0.1
Now the res object will contain everything you need, including the posterior probability for each model and the estimated differential expression "eta" parameters (to distinguish between up and down-regulation at each time point).
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/eturro/mmseq/issues/17#issuecomment-248272692, or mute the thread https://github.com/notifications/unsubscribe-auth/AUAA58dc2RA17aeCCeo-bRmFk51XWCd7ks5qr8CGgaJpZM4KBa0d .
Hi,
I have one more question, as their a way to control the number of processors that mmdiff uses? by default it is using all the available threads.
Thank you
Best regards Abhishek
On Tue, Sep 20, 2016 at 1:22 PM, Abhishek Singh abhisheksinghnl@gmail.com wrote:
Hi,
Thank you, I will try to adopt these for 7 time points.
May I bother you in future incase I run into trouble. :D
Thank you
On Tue, Sep 20, 2016 at 1:16 PM, Ernest Turro notifications@github.com wrote:
OK, in that case what I suggest is that you compare each time point to all the others in a standard differential expression analysis and then do a polytomous model comparison. To simplify, suppose you had t1, t2, t3 with 3 reps in each. You would have to create three different matrices files: matrices_file1: M; no. of rows = no. of observations
1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 C; no. of rows = no. of observations and no. of columns = 2 (one for each model)
0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 1 P0(collapsed); no. of rows = no. of classes for model 0
1 P1(collapsed); no. of rows = no. of classes for model 1
.5 -.5 matrices_file2: M; no. of rows = no. of observations
1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 C; no. of rows = no. of observations and no. of columns = 2 (one for each model)
0 1 0 1 0 1 0 0 0 0 0 0 0 1 0 1 0 1 P0(collapsed); no. of rows = no. of classes for model 0
1 P1(collapsed); no. of rows = no. of classes for model 1
.5 -.5 matrices_file3: M; no. of rows = no. of observations
1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 C; no. of rows = no. of observations and no. of columns = 2 (one for each model)
0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 P0(collapsed); no. of rows = no. of classes for model 0
1 P1(collapsed); no. of rows = no. of classes for model 1
.5 -.5
You would run mmdiff three times.
mmdiff -m matrices_file1 cond1_rep1.mmseq cond1_rep2.mmseq cond1_rep3.mmseq cond2_rep1.mmseq cond2_rep2.mmseq cond2_rep3.mmseq cond3_rep1.mmseq cond3_rep2.mmseq cond3_rep3.mmseq > cond1vsOthers.mmdiff
mmdiff -m matrices_file2 cond1_rep1.mmseq cond1_rep2.mmseq cond1_rep3.mmseq cond2_rep1.mmseq cond2_rep2.mmseq cond2_rep3.mmseq cond3_rep1.mmseq cond3_rep2.mmseq cond3_rep3.mmseq > cond2vsOthers.mmdiff
mmdiff -m matrices_file3 cond1_rep1.mmseq cond1_rep2.mmseq cond1_rep3.mmseq cond2_rep1.mmseq cond2_rep2.mmseq cond2_rep3.mmseq cond3_rep1.mmseq cond3_rep2.mmseq cond3_rep3.mmseq > cond3vsOthers.mmdiff
Once you're done, you can do a polytomous analysis in R:
source("/path/to/mmseq/src/mmseq.R")
res=polyclass(files=c("cond1vsOthers.mmdiff", "cond2vsOthers.mmdiff","cond3vsOthers.mmdiff"), prior=c(0.7,0.1,0.1,0.1)) # this would set the baseline model prior probability to 0.7 and that of the other models to 0.1
Now the res object will contain everything you need, including the posterior probability for each model and the estimated differential expression "eta" parameters (to distinguish between up and down-regulation at each time point).
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/eturro/mmseq/issues/17#issuecomment-248272692, or mute the thread https://github.com/notifications/unsubscribe-auth/AUAA58dc2RA17aeCCeo-bRmFk51XWCd7ks5qr8CGgaJpZM4KBa0d .
Hi,
I got it, it is OMP_NUM_THREADS
Best Abhishek
On Thu, Sep 22, 2016 at 9:26 AM, Abhishek Singh abhisheksinghnl@gmail.com wrote:
Hi,
I have one more question, as their a way to control the number of processors that mmdiff uses? by default it is using all the available threads.
Thank you
Best regards Abhishek
On Tue, Sep 20, 2016 at 1:22 PM, Abhishek Singh <abhisheksinghnl@gmail.com
wrote:
Hi,
Thank you, I will try to adopt these for 7 time points.
May I bother you in future incase I run into trouble. :D
Thank you
On Tue, Sep 20, 2016 at 1:16 PM, Ernest Turro notifications@github.com wrote:
OK, in that case what I suggest is that you compare each time point to all the others in a standard differential expression analysis and then do a polytomous model comparison. To simplify, suppose you had t1, t2, t3 with 3 reps in each. You would have to create three different matrices files: matrices_file1: M; no. of rows = no. of observations
1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 C; no. of rows = no. of observations and no. of columns = 2 (one for each model)
0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 1 P0(collapsed); no. of rows = no. of classes for model 0
1 P1(collapsed); no. of rows = no. of classes for model 1
.5 -.5 matrices_file2: M; no. of rows = no. of observations
1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 C; no. of rows = no. of observations and no. of columns = 2 (one for each model)
0 1 0 1 0 1 0 0 0 0 0 0 0 1 0 1 0 1 P0(collapsed); no. of rows = no. of classes for model 0
1 P1(collapsed); no. of rows = no. of classes for model 1
.5 -.5 matrices_file3: M; no. of rows = no. of observations
1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 C; no. of rows = no. of observations and no. of columns = 2 (one for each model)
0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 P0(collapsed); no. of rows = no. of classes for model 0
1 P1(collapsed); no. of rows = no. of classes for model 1
.5 -.5
You would run mmdiff three times.
mmdiff -m matrices_file1 cond1_rep1.mmseq cond1_rep2.mmseq cond1_rep3.mmseq cond2_rep1.mmseq cond2_rep2.mmseq cond2_rep3.mmseq cond3_rep1.mmseq cond3_rep2.mmseq cond3_rep3.mmseq > cond1vsOthers.mmdiff
mmdiff -m matrices_file2 cond1_rep1.mmseq cond1_rep2.mmseq cond1_rep3.mmseq cond2_rep1.mmseq cond2_rep2.mmseq cond2_rep3.mmseq cond3_rep1.mmseq cond3_rep2.mmseq cond3_rep3.mmseq > cond2vsOthers.mmdiff
mmdiff -m matrices_file3 cond1_rep1.mmseq cond1_rep2.mmseq cond1_rep3.mmseq cond2_rep1.mmseq cond2_rep2.mmseq cond2_rep3.mmseq cond3_rep1.mmseq cond3_rep2.mmseq cond3_rep3.mmseq > cond3vsOthers.mmdiff
Once you're done, you can do a polytomous analysis in R:
source("/path/to/mmseq/src/mmseq.R")
res=polyclass(files=c("cond1vsOthers.mmdiff", "cond2vsOthers.mmdiff","cond3vsOthers.mmdiff"), prior=c(0.7,0.1,0.1,0.1)) # this would set the baseline model prior probability to 0.7 and that of the other models to 0.1
Now the res object will contain everything you need, including the posterior probability for each model and the estimated differential expression "eta" parameters (to distinguish between up and down-regulation at each time point).
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/eturro/mmseq/issues/17#issuecomment-248272692, or mute the thread https://github.com/notifications/unsubscribe-auth/AUAA58dc2RA17aeCCeo-bRmFk51XWCd7ks5qr8CGgaJpZM4KBa0d .
Hi,
I am getting this error (below), while running mmdiff for 3 groups each with 3 replicates.
Design matrix for model 0 ([1|M]): Error: collinearity in combined matrix of intercept and covariates for model 0
The command that I am using is :
~/Softwares/MMseq/mmseq/bin/mmdiff-linux -m mat1.txt /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/E18/E18_1.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/E18/E18_2.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/E18/E18_3.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P1/P1_1.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P1/P1_2.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P1/P1_3.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P3/P3_1.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P3/P3_2.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P3/P3_3.namesorted.bam.hits.gene.mmseq > E18vsOthers.mmdiff
and this is what happens:
Min unique hits fraction for normalisation: 1
No. threads: 16
Parsing /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/E18/E18_1.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/E18/E18_2.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/E18/E18_3.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P1/P1_1.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P1/P1_2.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P1/P1_3.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P3/P3_1.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P3/P3_2.namesorted.bam.hits.gene.mmseq /media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P3/P3_3.namesorted.bam.hits.gene.mmseq
Analysing 32637 features
Using 12943/32637 features for normalisation.
Log scale normalisation factors:
/media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/E18/E18_1.namesorted.bam.hits.gene.mmseq -0.0922577
/media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/E18/E18_2.namesorted.bam.hits.gene.mmseq 0.157586
/media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/E18/E18_3.namesorted.bam.hits.gene.mmseq -0.309913
/media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P1/P1_1.namesorted.bam.hits.gene.mmseq 0.370167
/media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P1/P1_2.namesorted.bam.hits.gene.mmseq 0.127327
/media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P1/P1_3.namesorted.bam.hits.gene.mmseq -0.515323
/media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P3/P3_1.namesorted.bam.hits.gene.mmseq 0.04334
/media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P3/P3_2.namesorted.bam.hits.gene.mmseq 0.1999
/media/data/u0106457/FASTQ/Rat/Ageing/AlignedBowtie1/P3/P3_3.namesorted.bam.hits.gene.mmseq 0.153414
Design matrix for model 0 ([1|M]):
Error: collinearity in combined matrix of intercept and covariates for model 0
Could you please let me know, how to fix this.
Thank you
Best regards Abhishek
Hi @eturro,
I went into this post hoping that it specifies a design matrix for a estimating the effect of time on expression levels rather than all pair-wise certain time-point vs. baseline time-point analyses. But perhaps more generally, can you think of a way to translate an R syntax lm or model matrix into an mmdiff design matrix?
Thanks a lot
If you specify the linear model you're after I can help you with the mmdiff matrices file.
For this specific question I'm interested in two cases:
Is this matrix (for 3 samples from 3 time points) the correct one:
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 0 1 0 1 0 2 0 2 0 2
1
.5 -.5
And then the second case is having two groups: treatment and control and, n samples from each per each of the m from time points, and I'm interested in estimating effect of time on the treatment vs. control fold-change, which I guess is simply the treatment x time interaction term.
Hello The number of rows of the P matrix must always correspond to the number of levels in the corresponding C column, so what you have shown is not valid (two rows for P1 but four levels for C[,2]).
Do you want to assume that that the change in expression between time point 0 and time point 1 is the same as between time point 1 and time point 2?
In that case you could use the following and eta1_0 would represent the log fold change between time point 0 and time point 2 (twice the log fold changes between time points 0 and 1 or 1 and 2):
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 0 1 0 1 0 2 0 2 0 2
1 P1(collapsed); no. of rows = no. of classes for model 1 -.5 0 .5
If you would like to introduce a treatment and a treatment x time interaction, then I would compare multiple models and use polyclass() to perform polytomous model selection. Let's assume you have two reps in each group.
Baseline model (model 0): global mean.
Comparison with model 1: time effect ( as above ):
0 # untreated t0 0 # untreated t0 0 # untreated t1 0 # untreated t1 0 # untreated t2 0 # untreated t2 0 # treated t0 0 # treated t0 0 # treated t1 0 # treated t1 0 # treated t2 0 # treated t2
0 0 0 0 0 1 0 1 0 2 0 2 0 0 0 0 0 1 0 1 0 2 0 2
1 P1(collapsed); no. of rows = no. of classes for model 1 -.5 0 .5
Comparison with model 2: time effect and a treatment effect:
0 # untreated t0 0 # untreated t0 0 # untreated t1 0 # untreated t1 0 # untreated t2 0 # untreated t2 0 # treated t0 0 # treated t0 0 # treated t1 0 # treated t1 0 # treated t2 0 # treated t2
0 0 0 0 0 1 0 1 0 2 0 2 0 3 0 3 0 4 0 4 0 5 0 5
1 P1(collapsed); no. of rows = no. of classes for model 1 -.5 -.5 0 -.5 .5 -.5 -.5 .5 0 .5 .5 .5
Comparison with model 3: treatment-specific time effect and a treatment effect:
0 # untreated t0 0 # untreated t0 0 # untreated t1 0 # untreated t1 0 # untreated t2 0 # untreated t2 0 # treated t0 0 # treated t0 0 # treated t1 0 # treated t1 0 # treated t2 0 # treated t2
0 0 0 0 0 1 0 1 0 2 0 2 0 3 0 3 0 4 0 4 0 5 0 5
1 P1(collapsed); no. of rows = no. of classes for model 1 -.5 0 -.5 0 0 -.5 .5 0 -.5 0 -.5 .5 0 0 .5 0 .5 .5
I hope the interpretations of the estimated eta1_* coefficients is straightforward.
best wishes Ernest
On 18 Jul 2017, at 17:45, ndrubins notifications@github.com wrote:
For this specific question I'm interested in two cases:
I have n samples for m time points (not from them same subjects hence no autocorrelation), and I want to estimate the effect of time on expression. Is this matrix (for 3 samples from 3 time points) the correct one:
M; no. of rows = no. of observations
0 0 0 0 0 0 0 0 0
C; no. of rows = no. of observations and no. of columns = 2 (one for each model)
0 0 0 0 0 0 0 1 0 1 0 1 0 2 0 2 0 3
P0(collapsed); no. of rows = no. of classes for model 0
1
P1(collapsed); no. of rows = no. of classes for model 1
.5 -.5
And then the second case is having two groups: treatment and control and, n samples from each per each of the m from time points, and I'm interested in estimating effect of time on the treatment vs. control fold-change, which I guess is simply the treatment x time interaction term.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/eturro/mmseq/issues/17#issuecomment-316124755, or mute the thread https://github.com/notifications/unsubscribe-auth/AEM2EOEgmkVRQXFQBRcIvXkplJSzdYtwks5sPOESgaJpZM4KBa0d.
Thanks a lot
On Wed, Jul 19, 2017 at 8:43 AM, Ernest Turro notifications@github.com wrote:
Hello The number of rows of the P matrix must always correspond to the number of levels in the corresponding C column, so what you have shown is not valid (two rows for P1 but four levels for C[,2]).
Do you want to assume that that the change in expression between time point 0 and time point 1 is the same as between time point 1 and time point 2?
In that case you could use the following and eta1_0 would represent the log fold change between time point 0 and time point 2 (twice the log fold changes between time points 0 and 1 or 1 and 2):
M; no. of rows = no. of observations
0 0 0 0 0 0 0 0 0
C; no. of rows = no. of observations and no. of columns = 2 (one for
each model) 0 0 0 0 0 0 0 1 0 1 0 1 0 2 0 2 0 2
P0(collapsed); no. of rows = no. of classes for model 0
1 P1(collapsed); no. of rows = no. of classes for model 1 -.5 0 .5
If you would like to introduce a treatment and a treatment x time interaction, then I would compare multiple models and use polyclass() to perform polytomous model selection. Let's assume you have two reps in each group.
Baseline model (model 0): global mean.
Comparison with model 1: time effect ( as above ):
M; no. of rows = no. of observations
0 # untreated t0 0 # untreated t0 0 # untreated t1 0 # untreated t1 0 # untreated t2 0 # untreated t2 0 # treated t0 0 # treated t0 0 # treated t1 0 # treated t1 0 # treated t2 0 # treated t2
C; no. of rows = no. of observations and no. of columns = 2 (one for
each model) 0 0 0 0 0 1 0 1 0 2 0 2 0 0 0 0 0 1 0 1 0 2 0 2
P0(collapsed); no. of rows = no. of classes for model 0
1 P1(collapsed); no. of rows = no. of classes for model 1 -.5 0 .5
Comparison with model 2: time effect and a treatment effect:
M; no. of rows = no. of observations
0 # untreated t0 0 # untreated t0 0 # untreated t1 0 # untreated t1 0 # untreated t2 0 # untreated t2 0 # treated t0 0 # treated t0 0 # treated t1 0 # treated t1 0 # treated t2 0 # treated t2
C; no. of rows = no. of observations and no. of columns = 2 (one for
each model) 0 0 0 0 0 1 0 1 0 2 0 2 0 3 0 3 0 4 0 4 0 5 0 5
P0(collapsed); no. of rows = no. of classes for model 0
1 P1(collapsed); no. of rows = no. of classes for model 1 -.5 -.5 0 -.5 .5 -.5 -.5 .5 0 .5 .5 .5
Comparison with model 3: treatment-specific time effect and a treatment effect:
M; no. of rows = no. of observations
0 # untreated t0 0 # untreated t0 0 # untreated t1 0 # untreated t1 0 # untreated t2 0 # untreated t2 0 # treated t0 0 # treated t0 0 # treated t1 0 # treated t1 0 # treated t2 0 # treated t2
C; no. of rows = no. of observations and no. of columns = 2 (one for
each model) 0 0 0 0 0 1 0 1 0 2 0 2 0 3 0 3 0 4 0 4 0 5 0 5
P0(collapsed); no. of rows = no. of classes for model 0
1 P1(collapsed); no. of rows = no. of classes for model 1 -.5 0 -.5 0 0 -.5 .5 0 -.5 0 -.5 .5 0 0 .5 0 .5 .5
I hope the interpretations of the estimated eta1_* coefficients is straightforward.
best wishes Ernest
On 18 Jul 2017, at 17:45, ndrubins notifications@github.com wrote:
For this specific question I'm interested in two cases:
I have n samples for m time points (not from them same subjects hence no autocorrelation), and I want to estimate the effect of time on expression. Is this matrix (for 3 samples from 3 time points) the correct one:
M; no. of rows = no. of observations
0 0 0 0 0 0 0 0 0
C; no. of rows = no. of observations and no. of columns = 2 (one for each model)
0 0 0 0 0 0 0 1 0 1 0 1 0 2 0 2 0 3
P0(collapsed); no. of rows = no. of classes for model 0
1
P1(collapsed); no. of rows = no. of classes for model 1
.5 -.5
And then the second case is having two groups: treatment and control and, n samples from each per each of the m from time points, and I'm interested in estimating effect of time on the treatment vs. control fold-change, which I guess is simply the treatment x time interaction term.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub < https://github.com/eturro/mmseq/issues/17#issuecomment-316124755>, or mute the thread https://github.com/notifications/unsubscribe-auth/ AEM2EOEgmkVRQXFQBRcIvXkplJSzdYtwks5sPOESgaJpZM4KBa0d.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/eturro/mmseq/issues/17#issuecomment-316428822, or mute the thread https://github.com/notifications/unsubscribe-auth/ABGsHsDBJ66JrhNWKV6gGhw17ZeR7e0Fks5sPiQ-gaJpZM4KBa0d .
Hi,
I have same sample for 7 different time points and each with replicates. I would like to find out gene clusters for each time point that define that time point (gene signatures).
Here is the table of replicates and time points:
The matrix that I am using is:
The command that I am using with mmdiff is :
`~/Softwares/MMseq/mmseq-latest/bin/mmdiff-linux -fixalpha -p 0 -T0_1.namesorted.bam.hits.gene.mmseq T0_2.namesorted.bam.hits.gene.mmseq T0_3.namesorted.bam.hits.gene.mmseq T0_4.namesorted.bam.hits.gene.mmseq T1_1.namesorted.bam.hits.gene.mmseq T1_2.namesorted.bam.hits.gene.mmseq T1_3.namesorted.bam.hits.gene.mmseq T2_1.namesorted.bam.hits.gene.mmseq T2_2.namesorted.bam.hits.gene.mmseq T2_3.namesorted.bam.hits.gene.mmseq T3_1.namesorted.bam.hits.gene.mmseq T3_2.namesorted.bam.hits.gene.mmseq T3_3.namesorted.bam.hits.gene.mmseq T4_1.namesorted.bam.hits.gene.mmseq T4_2.namesorted.bam.hits.gene.mmseq T4_3.namesorted.bam.hits.gene.mmseq T4_4.namesorted.bam.hits.gene.mmseq T5_1.namesort.bam.hits.gene.mmseq T5_2.namesort.bam.hits.gene.mmseq T5_3.namesort.bam.hits.gene.mmseq T5_4.namesort.bam.hits.gene.mmseq T6_1.namesort.bam.hits.gene.mmseq T6_2.namesort.bam.hits.gene.mmseq T6_3.namesort.bam.hits.gene.mmseq T6_4.namesort.bam.hits.gene.mmseq T6_5.namesort.bam.hits.gene.mmseq T6_6.namesort.bam.hits.gene.mmseq T6_7.namesort.bam.hits.gene.mmseq > out.mmdiff_p0_fixalpha
` My first question, is everything correct till here ? if not could you please edit.
Second question: Once I have the output file from mmdiff, how to proceed to get clusters of time stage specific genes? I mean, how can I filter them out from the mmdiff output file or is it already done?
Could you please find sometime to answer these and guide me in my analysis.
Thank you.
Best regards Abhishek