Differential Expression for SLAM-seq

aksenovavy commented 4 years ago

For differential expression analysis, do you use TcReadCount/ReadCount or ConversionsOnTs/CoverageOnTs as input? How would the resulting fold changes differ between these two inputs?

t-neumann commented 4 years ago

Hi,

For now we use the plain TcReadCount values as input for DE analysis, e.g. with DESeq2. The only difference from a standard workflow is that you calculate the sizeFactors on the total ReadCount values to account for global effects.

aksenovavy commented 4 years ago

Thank you, Tobias. Do I understand correctly that the T>C reads will then be normalized to the total ReadCounts (T>C plus non-T>C reads) rather than to the total T>C counts per sample? You mentioned that your collaborator performed the mRNA turnover estimation. Do you know whether they deposited the code, or who would be the best person to contact?

aksenovavy commented 4 years ago

Could you please also specify which files can be used to make plots similar to Fig. 3 or Suppl. Fig. 11 from Herzog et al., 2017? We want to plot our SLAM-seq data as well. Were they made from the sorted BAM files or from bedGraph files? Many thanks!

t-neumann commented 4 years ago

Hi @aksenovavy, sorry for the late reply.

Yes, we use the TcReadCount column for the DESeq2 analysis and the ReadCount column only for the global normalization when calculating the sizeFactors. I do not think the code was deposited, but the main author, Veronika Herzog, has already answered this question a couple of times. I will simply paste her reply here; maybe that already helps.

Thanks for your interest in SLAMseq. For the half-life calculations, we fitted a simple exponential decay model using the Levenberg-Marquardt algorithm. In R you can do this with the minpack.lm package. y0 was set to 1 (as all the data points were normalised to the timepoint of the chase onset for each gene, as you describe yourself) and the plateau was set to 0; see the formula below:

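library(minpack.lm)  # provides nlsLM() and nls.lm.control()

# y: per-gene signal normalised to the chase onset; timepoints: the matching chase times.
# Both are looked up in the calling environment because no `data` argument is given.
model <- nlsLM(y ~ Plat + (y0 - Plat) * exp(-k * timepoints),
               start = list(y0 = 1, Plat = 0, k = 0.5),
               # bounds are in the order y0, Plat, k: y0 is pinned to 1, Plat to 0, and k >= 0
               upper = c(1, 0, Inf),
               lower = c(1, 0, 0),
               control = nls.lm.control(maxiter = 1000),
               na.action = na.omit)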

I hope this will help you in your analysis. Good luck with your experiments.
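With y0 fixed to 1 and the plateau fixed to 0, the model above reduces to a one-parameter decay, and the half-life follows from the fitted rate constant k. The relation below is the standard one for exponential decay rather than something stated in the quoted reply, and the symbol t_{1/2} is notation introduced here:

```latex
y(t) = \mathrm{Plat} + (y_0 - \mathrm{Plat})\, e^{-kt}
\quad\xrightarrow{\; y_0 = 1,\ \mathrm{Plat} = 0 \;}\quad
y(t) = e^{-kt},
\qquad
t_{1/2} = \frac{\ln 2}{k}
```

Assuming the fit succeeded, this would be `log(2) / coef(model)[["k"]]` in R, expressed in the same time units as `timepoints`.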

For making plots like Fig3 or Suppl Fig 11 you can use alleyoop read-separator and then create bigwig files from the resulting bam files.

EllieDuan commented 3 years ago

Hello,

Thank you for the nice tool SLAM-dunk, it runs very smoothly!

I'm wondering if you have example code showing how you use DESeq2 with the ReadCount column for the sizeFactors and the TcReadCount column for the analysis. I'm following the standard DESeq2 guidelines and am only able to input one matrix containing all the TcReadCount data, so I'm wondering how you input both. Thank you for your help!

I also followed steps 56-60 of your Nature Protocols paper: https://www.nature.com/articles/s41596-019-0179-x#Sec32, but this might be an older version, and step 61 does not use DESeq2 to call differential expression.

Thanks~!

Best, Ellie

EllieDuan commented 3 years ago

I think I figured this out: I ran DESeq2 on both ReadCount and TcReadCount, then assigned the size factors from the total ReadCount object to the TcReadCount object, something like this: sizeFactors(TcReadCount) <- sizeFactors(ReadCount)

t-neumann commented 3 years ago

Hi @EllieDuan, yes, that's exactly what we do on our end as well.
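The approach confirmed above can be sketched in R roughly as follows. This is a minimal sketch under stated assumptions, not the authors' actual script: the toy matrices `read_counts` and `tc_counts` stand in for the ReadCount and TcReadCount columns collected across samples from the slamdunk count tables, and the binomial thinning used to fake T>C counts is purely illustrative. It uses `estimateSizeFactorsForMatrix()` so that only one DESeq2 object has to be built.

```r
library(DESeq2)

set.seed(1)

# Toy stand-in for the total ReadCount matrix (genes x samples).
# In practice, build this from the ReadCount column of each sample's count table.
example_dds <- makeExampleDESeqDataSet(n = 1000, m = 6)
read_counts <- counts(example_dds)
sample_info <- as.data.frame(colData(example_dds))  # contains a 'condition' factor

# Toy stand-in for the TcReadCount matrix: a random ~30% subset of each total count.
tc_counts <- matrix(rbinom(length(read_counts), size = read_counts, prob = 0.3),
                    nrow = nrow(read_counts), dimnames = dimnames(read_counts))

# Size factors estimated from the total read counts (median-of-ratios).
sf_total <- estimateSizeFactorsForMatrix(read_counts)

# DE analysis on the T>C counts, with the total-read size factors assigned up front.
dds_tc <- DESeqDataSetFromMatrix(countData = tc_counts,
                                 colData = sample_info,
                                 design = ~ condition)
sizeFactors(dds_tc) <- sf_total
dds_tc <- DESeq(dds_tc)  # uses the pre-set size factors instead of re-estimating them
res <- results(dds_tc)
```

Because the size factors are already set when `DESeq()` is called, the T>C counts are scaled by total library depth rather than by their own distribution, which is what keeps a global shift in T>C signal visible.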

zsun89 commented 3 years ago

Hi, I incorporated ERCC spike-in RNA in my samples to normalize against global effects on transcription. What is the recommended way to take the ERCC spike-in into account during the DESeq2 differential expression analysis? Would it be appropriate to generate size factors using read counts exclusively from the ERCC spike-in, and then supply them as size factors to the analysis of the tcReadCounts? Thanks, Zhen

t-neumann commented 3 years ago

Hi @zsun89,

If you are looking at global changes in transcriptional output, then you should use the total read counts as size factors, as described by Muhar et al. I don't really see an application for ERCC spike-ins here unless you want to look at global steady-state changes.

zsun89 commented 3 years ago

Hi @t-neumann, thank you for your quick reply. Besides using the approach described by Muhar et al., I'm also interested in calculating nascent RNA production on a per-cell basis, with ERCC RNA added in proportion to the input cell number.

My understanding is that the Muhar approach is accurate as long as there are no global changes in steady-state RNA, since the steady-state (total) reads are what the size factors are calculated from. However, I am concerned that by the time of analysis (RNA isolation), a global reduction in new mRNA synthesis may already have reduced total steady-state mRNA levels. This would presumably be the case at later time points after BRD4 degradation (in the paper the authors focused on very early time points after BRD4 loss, when total RNA had likely not yet changed despite reduced nascent transcription).

If steady-state RNA is indeed reduced, is the Muhar approach still sufficient to accurately calculate differential nascent RNA levels? Or might it underestimate the degree of global reduction in nascent RNA, which would call for an additional normalization for steady-state RNA, e.g. using the ERCC spike-in? Please correct me if my understanding of the global normalization is wrong. Thanks, Zhen

t-neumann commented 3 years ago

Hi Zhen,

I think your reasoning makes sense. In this situation you probably have to calculate one set of size factors based on the ERCC spike-ins, a second set based on your steady-state read counts, and then combine them. I don't know offhand what range those size factors have; combining them is probably a simple multiplication. What do you think?

zsun89 commented 3 years ago

Hi @t-neumann, thank you, Tobias, for helping to look into this. The size factors calculated from the ERCC ReadCounts alone and from the steady-state ReadCounts alone are actually similar, all close to 1 (~0.9-1.1), suggesting no dramatic global changes in steady-state RNA. I guess in this case I could combine the two sets of size factors simply by multiplying the matching ones? Thanks, Zhen

t-neumann commented 3 years ago

Yes, I would agree.
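The mechanics of the combination agreed on above could look like this in R. It is only a sketch of the multiplication discussed in this thread, not a validated normalization scheme: the toy matrices `ercc_counts`, `read_counts` and `tc_counts` are invented stand-ins for the spike-in read counts, the total ReadCount values and the TcReadCount values, and whether the product is the appropriate combination for a given experiment is left open here.

```r
library(DESeq2)

set.seed(1)
samples <- paste0("sample", 1:6)
sample_info <- data.frame(condition = factor(rep(c("ctrl", "treated"), each = 3)),
                          row.names = samples)

# Hypothetical toy counts: 1000 genes (total and T>C reads) and 92 ERCC spike-ins.
read_counts <- matrix(rnbinom(1000 * 6, mu = 200, size = 5), ncol = 6,
                      dimnames = list(paste0("gene", 1:1000), samples))
tc_counts <- matrix(rbinom(length(read_counts), size = read_counts, prob = 0.3),
                    ncol = 6, dimnames = dimnames(read_counts))
ercc_counts <- matrix(rnbinom(92 * 6, mu = 500, size = 20), ncol = 6,
                      dimnames = list(paste0("ERCC", 1:92), samples))

# One set of size factors from the spike-ins, one from the total (steady-state) reads.
sf_ercc  <- estimateSizeFactorsForMatrix(ercc_counts)
sf_total <- estimateSizeFactorsForMatrix(read_counts)

# Combine matching samples by multiplication, as suggested in the thread.
sf_combined <- sf_ercc * sf_total

# Assign the combined factors to the T>C data set before running DESeq().
dds_tc <- DESeqDataSetFromMatrix(countData = tc_counts, colData = sample_info,
                                 design = ~ condition)
sizeFactors(dds_tc) <- sf_combined
```

From here, `DESeq(dds_tc)` and `results()` would be run as in a standard workflow; comparing the results against a run with `sf_total` alone is one way to see how much the spike-in term changes the outcome.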

zsun89 commented 3 years ago

Thank you. I really appreciate your feedback!

t-neumann / slamdunk

Differential Expression for SLAM-seq #70