eturro / mmseq

Haplotype, isoform and gene level expression analysis using multi-mapping RNA-seq reads
GNU General Public License v2.0

Pairing samples #27

Open tataderby opened 6 years ago

tataderby commented 6 years ago

Hello,

I have ribosome profiling data from two conditions: treatment and control. For each condition I have 3 RNA-seq libraries (quantified by MMSEQ) of the ribosome RNA footprints and 3 samples of the total RNA. The footprints and total RNA are matched with respect to the animals they came from: each tissue sample is split in two, one part for the ribosome profiling protocol and the other for the total RNA protocol, so the samples are paired.

I would like to test whether ribosome occupancy differs between treatment and control. From the examples in the README, the appropriate case seems to be the one that assesses whether the log fold change between group A and group B is different from the log fold change between group C and group D, where A and B would be the ribosome footprints and total RNA from the treatment group, respectively, and C and D the same for the control group. The only thing this doesn't seem to account for is the fact that the samples in A and B (and in C and D) are paired.

Should that information be encoded in the M matrix (each animal is given an integer ranging from 0 to the number of animals minus one)? Or is there another way to utilize the sample pairing?

Thanks a lot Tata

eturro commented 6 years ago

Dear Tata

I think what you're after is a "random slope" model, whereby each animal has its own log fold change centred on a different mean for treatment and control.

You cannot model this with mmdiff. What you are proposing to do is to model a fixed effect representing a constant difference in log fold change due to treatment. This may be slightly underpowered but I think it is the closest you can get to the ideal "random slope" model. If the variability between mice within fraction and treatment is small relative to the difference in log fold change due to treatment, then this model should work relatively well.
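As a rough illustration of why the fixed-effect model is unbiased but may lose some power, here is a minimal Python sketch for a single gene. It does not use mmdiff or its matrix format; the log expression values are made up, and the function names are hypothetical. It compares the difference in log fold changes computed from group means (ignoring pairing) with the same quantity computed from per-animal footprint minus total differences (using pairing).

```python
import numpy as np

# Made-up log expression values for one gene (not MMSEQ output).
# Position i in the two arrays of a condition refers to the same animal.
treat_footprint = np.array([5.1, 5.6, 4.9])  # group A
treat_total = np.array([4.0, 4.6, 3.8])      # group B
ctrl_footprint = np.array([4.2, 4.8, 4.5])   # group C
ctrl_total = np.array([3.9, 4.4, 4.2])       # group D


def interaction_unpaired(a, b, c, d):
    """(mean A - mean B) - (mean C - mean D), ignoring which animal is which."""
    return (a.mean() - b.mean()) - (c.mean() - d.mean())


def interaction_paired(a, b, c, d):
    """Mean per-animal log ratio in treatment minus the same in control."""
    return (a - b).mean() - (c - d).mean()


def se_unpaired(a, b, c, d):
    """Standard error treating all four groups as independent samples."""
    return np.sqrt(sum(x.var(ddof=1) / len(x) for x in (a, b, c, d)))


def se_paired(a, b, c, d):
    """Standard error based on the per-animal differences only."""
    diffs = (a - b, c - d)
    return np.sqrt(sum(x.var(ddof=1) / len(x) for x in diffs))


groups = (treat_footprint, treat_total, ctrl_footprint, ctrl_total)
est_unpaired = interaction_unpaired(*groups)
est_paired = interaction_paired(*groups)
print("estimate (unpaired):", est_unpaired)
print("estimate (paired):  ", est_paired)
print("SE (unpaired):", se_unpaired(*groups))
print("SE (paired):  ", se_paired(*groups))
```

With balanced groups the two point estimates coincide, so ignoring the pairing does not change the estimated difference in log fold change. In this toy data the animal-to-animal variability cancels in the paired differences, which is why the unpaired standard error comes out larger; when that variability is small relative to the treatment effect, the two approaches should behave similarly.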

Best wishes Ernest
