Determine independence of primer bias

Summary

In PCR1, we use a multiplex reaction of 20 forward and 13 reverse primers. Each primer has a different amplification rate during the reaction. Is the amplification rate of a particular forward primer dependent on the identity of the reverse primer, or is it constant for all reverse primers?

Significance

If primer amplification bias is independent, we need to calculate only 33 ei factors (20 + 13, one per primer) rather than 260 (20 x 13, one per primer pair).
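
As a sketch of what independence would mean in symbols: the notation below is assumed for illustration and is not taken from the project. The efficiency of forward primer i paired with reverse primer j would factorize into a forward-only term and a reverse-only term:

```latex
e_{ij} = e^{F}_{i} \, e^{R}_{j}, \qquad i = 1, \dots, 20, \quad j = 1, \dots, 13
```

If this factorization holds, one factor per primer is enough; if it does not, each primer pair needs its own factor.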

To Do

1. Control Experiment
2. Determine independence via analysis (see Approach below)
3. Make recommendation based on results of (2).

Approach

Control Experiment (same as experiment in #7)
- Samples containing:
  - Constant spike concentration (To Do: what is this value?)
  - NO gDNA
- 20 total samples
Data Analysis and Interpretation
Per discussion on 25 May 2016, we will proceed with both a linear regression analysis and a chi-squared analysis. Once both analyses are complete, we will compare and discuss the results.
- Boxplots
  - grouped by V primer (20)
  - grouped by J primer (13)
- Linear Regression
  - additive vs. multiplicative model (test for interactions)
  - TapeStation dilution factor as co-variate
  - PCR batch as co-variate
  - Summarization of Linear Regression analysis results
- Chi-squared independence test (from Burcu's and Terry's interchange)
  - Chi-squared independence test on a V x J matrix (13 x 20). The matrix has 260 cells.
  - Examine the contribution of each cell by looking at the Pearson residuals as well. Start with a few samples for which the counts are reasonably large (relative to the whole data set), and then form the 13 x 20 array of counts for each sample separately. For each of these, calculate the expected count E under the null hypothesis of independence.
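
The "test for interactions" bullet could be sketched as below. This is a simplification of the planned regression, not the analysis itself: it is a balanced two-way ANOVA on log-scale counts that treats the samples as replicates and leaves out the TapeStation dilution factor and PCR batch co-variates. It rests on the reading that an additive V + J model on the log scale corresponds to independent, multiplicative primer effects on the count scale. The function name `interaction_f` and the input layout are hypothetical.

```python
def interaction_f(tables):
    """F statistic for the row x column interaction in a balanced two-way layout.

    tables: list of replicate tables (one per sample), each a list of rows of
    log-scale values, all with the same shape. Assumes at least two replicates
    and a non-zero within-cell error sum of squares.
    """
    n = len(tables)
    n_rows, n_cols = len(tables[0]), len(tables[0][0])

    # Cell means across replicates, then row, column and grand means.
    cell = [[sum(t[i][j] for t in tables) / n for j in range(n_cols)]
            for i in range(n_rows)]
    row = [sum(r) / n_cols for r in cell]
    col = [sum(c) / n_rows for c in zip(*cell)]
    grand = sum(row) / n_rows

    # Interaction: what is left of each cell mean after the additive fit.
    ss_int = n * sum((cell[i][j] - row[i] - col[j] + grand) ** 2
                     for i in range(n_rows) for j in range(n_cols))
    # Error: replicate scatter around each cell mean.
    ss_err = sum((t[i][j] - cell[i][j]) ** 2
                 for t in tables for i in range(n_rows) for j in range(n_cols))

    df_int = (n_rows - 1) * (n_cols - 1)
    df_err = n_rows * n_cols * (n - 1)
    f_stat = (ss_int / df_int) / (ss_err / df_err)
    return f_stat, df_int, df_err
```

A large F relative to an F(df_int, df_err) reference distribution would point toward a V x J interaction. A regression that includes the co-variates would be the more faithful version of the plan.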

Additional details on the chi-squared approach:

E for a cell = row total x column total / grand total.

Letting O be the observed count in a cell, the chi-squared statistic, with 12 x 19 = 228 degrees of freedom, is the sum over the 260 cells of (O-E)^2/E. We are less interested in this statistic than in the tables of Pearson residuals (O-E)/sqrt(E). Check whether there are patterns that suggest important deviations from independence. We don't really want to test the formal independence hypothesis; we want to know whether the deviations from independence are a) sufficiently large as to require paying attention, and b) interpretable, e.g. restricted to one or a few forward (F) or reverse (R) primers.
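
The calculation described above can be sketched in plain Python. The function name `pearson_table` and the toy counts are made up for illustration; a real input would be the 13 x 20 count array for one sample.

```python
import math

def pearson_table(observed):
    """Expected counts, Pearson residuals, chi-squared statistic and df
    for a two-way table of counts given as a list of rows.

    Assumes every row total and column total is positive, so that no
    expected count is zero.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # E for a cell = row total x column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Pearson residual = (O - E) / sqrt(E)
    residuals = [
        [(o - e) / math.sqrt(e) for o, e in zip(o_row, e_row)]
        for o_row, e_row in zip(observed, expected)
    ]

    # The chi-squared statistic is the sum of squared Pearson residuals.
    chi2 = sum(res ** 2 for row in residuals for res in row)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, residuals, chi2, df

# Toy 2 x 3 table of made-up counts: every expected count is 20,
# chi2 is 20 (up to floating-point rounding) and df is 2.
toy = [[30, 20, 10], [10, 20, 30]]
expected, residuals, chi2, df = pearson_table(toy)
```

For a 13 x 20 table the same function returns df = 228.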

We are hoping to see the pattern of residuals in one sample at a time, maybe even just the signs. Filling in the 260 boxes (for one sample) with colours as in a heat diagram would be useful too. Definite clear patterns could emerge from looking at the residuals of several individual samples.
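
As a stand-in for the coloured heat diagram, the sign pattern of one sample's residuals could be printed as a text grid. This is a minimal sketch: the function name `sign_grid` and the optional `threshold` parameter are hypothetical, and the default threshold of 0 shows the plain signs.

```python
def sign_grid(residuals, threshold=0.0):
    """Render a table of Pearson residuals as one string per row.

    '+' marks residuals above threshold, '-' marks residuals below
    -threshold, and '.' marks everything in between.
    """
    lines = []
    for row in residuals:
        symbols = []
        for r in row:
            if r > threshold:
                symbols.append("+")
            elif r < -threshold:
                symbols.append("-")
            else:
                symbols.append(".")
        lines.append("".join(symbols))
    return lines

# Made-up residuals for a 2 x 3 table.
for line in sign_grid([[2.5, -0.1, 0.0], [-3.0, 0.4, 1.0]]):
    print(line)
```

Raising `threshold` hides small residuals, so that only the larger deviations stand out when several samples are compared side by side.
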
Some boxplots have been attempted (attached). I thought boxplots of (O-E)^2/E (log base 10 scale) as well as of the Pearson residuals (O-E)/sqrt(E) for every sample in the V*J matrix would be helpful (170 data points in each boxplot). From the boxplots of the squared Pearson residuals (O-E)^2/E, we could see whether a particular V or J is producing higher values in general. From the boxplots of the Pearson residuals (O-E)/sqrt(E), we could see the direction (negative or positive). Patterns could be evaluated by looking at a big printout. At first glance, tissue type heavily affects independence. In the controls, J2-7 in particular seems problematic: its row shows more deviation from 0 than the other rows.
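
One way to prepare the data behind such boxplots is to pool the residuals of several samples per row or per column of the table. This is a sketch under assumptions, not the code that produced the attached plots: the function names are hypothetical, and each table is assumed to have one row per J primer and one column per V primer.

```python
from statistics import median

def residuals_by_row(tables):
    """Pool residuals across samples, giving one list per row (J primer)."""
    pooled = [[] for _ in range(len(tables[0]))]
    for table in tables:
        for i, row in enumerate(table):
            pooled[i].extend(row)
    return pooled

def residuals_by_column(tables):
    """Pool residuals across samples, giving one list per column (V primer)."""
    transposed = [[list(col) for col in zip(*table)] for table in tables]
    return residuals_by_row(transposed)

def median_abs(values):
    """Median absolute residual, a rough measure of deviation from 0."""
    return median(abs(v) for v in values)

# Made-up residual tables for two samples (2 rows x 2 columns each).
tables = [[[1.0, -1.0], [4.0, -6.0]], [[0.5, -0.5], [-5.0, 3.0]]]
spread_per_row = [median_abs(values) for values in residuals_by_row(tables)]
```

Each pooled list can be passed to a plotting library as one box. A row or column whose spread is much larger than the rest is a candidate for closer inspection.
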
If there is primer dimerization, independence would be greatly affected! Dhaarini is repeating the experiment with a lower primer concentration and some additional precautions. The upcoming data should be much more reliable to work with.
Aforementioned boxplots in ChiSquare Independence #part:
