Closed JihedC closed 6 years ago
Hi Jihed,
Thanks for posting this question to the studygroup!
So what I understand from your post is that you want to make a new data.frame with one variable per primer pair. Each observation would then have one variable with the Ct (?) value for primer pair 1, another for primer pair 2, and so on.
Using spread to do this with a simplified example:
library(tidyr)  # spread() comes from tidyr
A <- data.frame(Name = c("Kees", "Leaf", "Kees", "Leaf"), Detector = c(1, 1, 10, 10), Ct = c(1, 2, 3, 4))
spread(A, Detector, Ct)
This will generate a data.frame of 2 observations and 3 variables (Name, 1 and 10). The first argument is the data.frame you want to transform, the second argument is the variable whose values become the new column names (the key), and the third argument is the variable the cell values are taken from. Note that spread only works if each combination of the untouched columns (here Name) and the key (here Detector) occurs at most once. If the same combination occurs several times, spread cannot tell which value belongs in the cell and stops with an error.
So, the following will not work:
A <- data.frame(Name = c("Kees", "Kees", "Leaf", "Leaf", "Kees", "Leaf"), Detector = c(1, 1, 1, 1, 10, 10), Ct = c(1, 2, 3, 4, 5, 6))
spread(A, Detector, Ct)
This is the case with the data you provided, so you cannot use spread directly: there is no way to know which "Ct" value of primer pair 1 belongs to which observation with the name "Kees" (there are multiple observations for "Kees" with primer pair 1).
Also, in most cases R prefers a data.frame in long format (as you already have it) rather than in wide format (what you are trying to create). Could you explain to us what you want to do after this step, so we can advise you on the appropriate steps to get to your goal?
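One possible way around this, sketched under the assumption that the repeated rows are technical replicates that may be averaged, is to collapse the duplicates first so that each Name/Detector pair occurs only once, and only then call spread. The column name mean_Ct is made up for this illustration and is not part of the original data.

```r
library(dplyr)
library(tidyr)

# Same data as in the failing example: "Kees" and "Leaf" each occur twice for Detector 1
A <- data.frame(
  Name     = c("Kees", "Kees", "Leaf", "Leaf", "Kees", "Leaf"),
  Detector = c(1, 1, 1, 1, 10, 10),
  Ct       = c(1, 2, 3, 4, 5, 6)
)

# Assumption: duplicated Name/Detector rows are technical replicates,
# so taking their mean is meaningful
A_mean <- A %>%
  group_by(Name, Detector) %>%
  summarise(mean_Ct = mean(Ct)) %>%
  ungroup()

# Every Name/Detector pair is now unique, so spread() can build
# one column per detector
A_wide <- spread(A_mean, Detector, mean_Ct)
A_wide
```

With this data, A_wide should hold one row per name and the columns Name, 1 and 10. Whether averaging is the right way to resolve the duplicates depends on what the repeated rows represent.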
I hope this helps you a bit.
Joeri
Hi Joeri,
Thanks for such a quick reply! I figured out just after posting that spread() is not going to work that easily.
I have 12 cDNA samples on a 96-well plate for qPCR. For example, the sample 'Kees' has 2 technical replicates for each set of primers, so in this first qPCR the 'Kees' sample is found 8 times. They are the same sample but treated with different primers.
My goal here would be to :
I would like to do this because I will have many more samples and replicates of the samples (observations) to run, and also many more genes to test. It would be much easier if I had a script that automatically does the calculations and makes the plot.
There are several scripts already written in R for this, but with my basic level I don't really know whether they can be adapted to the type of data we get from our machine, nor how to make them work.
My idea was to put the data in tidy format so the calculations can easily be done, but I might be wrong. Any input would be a great help, and if you want I can show you what I get when I spread the data.
Thanks for your help,
Jihed
Hey Jihed, I hope I understand all your problems correctly. As I see it, you have several problems rolled into one here. To be able to go from qPCR results to a plot, you need to do:
I would first try to solve the technical replicate problem. This will give you ideas and hints for the rest. Make a dummy data frame with only one primer pair first. For instance:
Well | Sample | Detector | Task | Ct | Std |
---|---|---|---|---|---|
A1 | Sample1 | GeneA | Unknown | 25.6 | 0.2 |
A2 | Sample2 | GeneA | Unknown | 26.6 | 0.2 |
A3 | Sample1 | GeneB | Unknown | 29.9 | 0.2 |
A4 | Sample2 | GeneB | Unknown | 31.0 | 0.2 |
Average the technical replicates for geneA and for geneB (using dplyr).
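A minimal sketch of that averaging step with dplyr could look like the following. The dummy data below is hypothetical: it assumes two technical replicates per sample and gene, and the Ct values and the output column names (mean_Ct, sd_Ct, n_rep) are invented for illustration.

```r
library(dplyr)

# Hypothetical dummy data: two technical replicates per sample and gene
qpcr <- data.frame(
  Well     = c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"),
  Sample   = c("Sample1", "Sample1", "Sample2", "Sample2",
               "Sample1", "Sample1", "Sample2", "Sample2"),
  Detector = c("GeneA", "GeneA", "GeneA", "GeneA",
               "GeneB", "GeneB", "GeneB", "GeneB"),
  Ct       = c(25.6, 25.8, 26.6, 26.4, 29.9, 30.1, 31.0, 31.2)
)

# One row per Sample/Detector pair, with the mean and spread of its replicates
ct_means <- qpcr %>%
  group_by(Sample, Detector) %>%
  summarise(
    mean_Ct = mean(Ct),  # average of the technical replicates
    sd_Ct   = sd(Ct),    # spread between the replicates
    n_rep   = n()        # number of replicates that were averaged
  ) %>%
  ungroup()

ct_means
```

Keeping n_rep in the result is a cheap check that every sample and gene really has the expected number of replicates.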
If you can do that, then the rest should be easier. You'll need to add extra columns to identify your reference gene (add a column with "target" and "reference" in there).
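As a rough sketch of that extra column, assuming purely for illustration that GeneB is the reference gene (the thread does not say which gene is the reference), something like this could work. The column names mean_Ct and Role and the vector reference_genes are hypothetical.

```r
library(dplyr)

# Hypothetical per-sample Ct values, one row per sample and gene
ct_means <- data.frame(
  Sample   = c("Sample1", "Sample2", "Sample1", "Sample2"),
  Detector = c("GeneA", "GeneA", "GeneB", "GeneB"),
  mean_Ct  = c(25.6, 26.6, 29.9, 31.0)
)

# Assumption for illustration only: GeneB is the reference gene
reference_genes <- c("GeneB")

# Label every row as "reference" or "target" depending on its detector
ct_labelled <- ct_means %>%
  mutate(Role = ifelse(Detector %in% reference_genes, "reference", "target"))

ct_labelled
```

Keeping the reference genes in a separate vector means the script does not have to change when other primers are used; only that vector does.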
Let me know how it goes,
Marc
PS: thanks for posting this issue here!
Hello everyone,
I would like to write a script in R to do the qPCR analysis. I've found some scripts on Bioconductor, but they are way too complex for what I want to do. I am already stuck at the very beginning :P
From the 7500 machine, I get a csv document
I have 4 different primer pairs here (24 wells each). The Detector variable can be used to identify the primers: Detector == '1' is the 1st primer pair, Detector == '10' is the second primer pair, ...
I am looking for a method to transform the data frame so that the different primers become variables for the different samples (I don't know if I am being clear here ...).
For now, I have tried filtering on the detector value to get a new data frame for each primer.
I then combined the 4 data frames by column. As expected, I get a big data frame of 24 observations and 32 variables, with most of the columns being useless copies of one another. I think this is the wrong approach, and it cannot be adapted automatically to different primers.
I would like to use the function spread() from tidyr, but I don't really understand it.
If anyone has any suggestion to help me with this first step, I would be really happy.
Thanks,
Jihed