Raw data for eCO2mean/sd in SoilR data?

bob-carpenter commented 9 years ago

I'd like to build a hierarchical model of the raw data that was aggregated into eCO2mean and eCO2sd in the SoilR data set eCO2.

- [x] Is the raw data available somewhere?
- [x] If so, can we share it?
- [x] If so, can you send it to me so I can upload it?
- [x] If we can't share it, can we use it internally and put it on a private repo?

Background

In a hierarchical model, each replicate i gets its own parameter vector theta[i] (such as initial carbon and initial mixture between compartments or even decomposition and transfer rates). The item-specific random effects can be given a simple multivariate normal prior:

theta[i] ~ multi_normal(mu_theta, Sigma_theta);

So mu_theta gives you the population average for the parameters and Sigma_theta the covariance among the parameters. This unfolds the mean/sd measurement error model based on the aggregated data into something more flexible.
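To make the link between replicate-level parameters and the aggregated columns concrete, here is a minimal simulation sketch. Everything specific in it is an assumption made for illustration and not taken from SoilR or the eCO2 data: a one-pool decay curve for the flux, two parameters per replicate (initial carbon and decomposition rate) drawn on the log scale so they stay positive, and made-up values for mu_theta, the scales, and the correlation. It omits measurement noise entirely.

```python
import math
import random
import statistics


def draw_theta(rng, mu, sd, rho):
    """Draw one 2-vector from a bivariate normal using its Cholesky factor."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x1 = mu[0] + sd[0] * z1
    x2 = mu[1] + sd[1] * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return x1, x2


def simulate(n_reps=5, times=(1.0, 5.0, 10.0, 20.0), seed=1):
    """Simulate per-replicate flux curves, then aggregate them per time point."""
    rng = random.Random(seed)
    # Hypothetical population values: log initial carbon and log decay rate.
    mu_theta = (math.log(100.0), math.log(0.05))
    sd_theta = (0.2, 0.3)   # hypothetical between-replicate scales
    rho = -0.4              # hypothetical correlation between the two parameters
    raw = []
    for _ in range(n_reps):
        log_c0, log_k = draw_theta(rng, mu_theta, sd_theta, rho)
        c0, k = math.exp(log_c0), math.exp(log_k)
        # Assumed one-pool model: flux at time t is k * C0 * exp(-k * t).
        raw.append([k * c0 * math.exp(-k * t) for t in times])
    by_time = list(zip(*raw))
    means = [statistics.mean(col) for col in by_time]   # analogue of eCO2mean
    sds = [statistics.stdev(col) for col in by_time]    # analogue of eCO2sd
    return raw, means, sds
```

Calling `simulate()` returns the replicate-level curves together with their per-time mean and standard deviation. The aggregated model only sees the last two, while the hierarchical model would be fit to the first.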

We'd put an informative prior on mu_theta, the mean population response, in the same way we set priors in the aggregated model. For the covariance Sigma_theta, we can put informative priors on the parameter scales based on what we expect them to be, and use Stan's LKJ prior on the correlation matrix to concentrate mass to a flexible degree around the identity matrix (that is, around no correlation among the parameters).
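The split of the covariance into scales and correlations can be sketched in a few lines. The helper names below are hypothetical and this is plain Python, not Stan code. It relies on two facts: a covariance matrix factors as diag(tau) * Omega * diag(tau) for a scale vector tau and a correlation matrix Omega, and the LKJ density with shape eta is proportional to det(Omega)^(eta - 1).

```python
import math


def cov_from_scales(tau, omega):
    """Build Sigma = diag(tau) * Omega * diag(tau) from scales and correlations."""
    n = len(tau)
    return [[tau[i] * omega[i][j] * tau[j] for j in range(n)] for i in range(n)]


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return d


def lkj_log_kernel(omega, eta):
    """Unnormalized LKJ log density: (eta - 1) * log det(Omega)."""
    return (eta - 1.0) * math.log(det(omega))
```

With eta = 1 the kernel is flat over correlation matrices. With eta > 1 it is largest at the identity, where the determinant reaches its maximum of 1, and it falls off as correlations grow, which is the "flexible degree" of concentration.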

crlsierra commented 9 years ago

See my answer to the previous issue.

Although I understand that it is better to use each replicate separately, we don't have the data in this format. I do have a large number of files and scripts that take the raw data and reproduce the values of eCO2mean and eCO2sd, but it may take more than a day of work to figure them out. The work was done by a student who has already left our lab.

I would suggest trying to find an alternative dataset for implementing this hierarchical model. Not only is my dataset difficult to transform into individual measurements, but I also have concerns about the leaks we had in our jars. Maybe Charlotte or someone else from the workshop has a more useful dataset of incubation data.

bob-carpenter commented 9 years ago

OK. Thanks. That answers my main question about whether this came from replicated experiments.

Leaks in jars would be a great kind of measurement error to model: you could fit a mixture model of leaky and non-leaky jars. Anyway, it's more of a stats geek problem than one that will matter going forward.
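For the curious, a two-component mixture of this kind might look roughly like the sketch below. The setup is an assumption made purely for illustration: a leaky jar reads lower than a tight jar by a fixed amount leak_shift, both kinds share the same normal measurement noise, and each jar leaks with probability leak_prob. The latent leaky/tight indicator is summed out on the log scale.

```python
import math


def normal_lpdf(y, mu, sigma):
    """Log density of a normal distribution at y."""
    z = (y - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


def log_sum_exp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def jar_log_terms(y, mu, sigma, leak_prob, leak_shift):
    """Joint log probabilities of (leaky, y) and (tight, y)."""
    lp_leaky = math.log(leak_prob) + normal_lpdf(y, mu - leak_shift, sigma)
    lp_tight = math.log1p(-leak_prob) + normal_lpdf(y, mu, sigma)
    return lp_leaky, lp_tight


def jar_log_lik(y, mu, sigma, leak_prob, leak_shift):
    """Log likelihood of one reading with the leak indicator marginalized out."""
    return log_sum_exp(*jar_log_terms(y, mu, sigma, leak_prob, leak_shift))


def prob_leaky(y, mu, sigma, leak_prob, leak_shift):
    """Posterior probability that the jar behind reading y was leaky."""
    lp_leaky, lp_tight = jar_log_terms(y, mu, sigma, leak_prob, leak_shift)
    return math.exp(lp_leaky - log_sum_exp(lp_leaky, lp_tight))
```

A reading near mu gives a small `prob_leaky`, and a reading near mu - leak_shift gives a large one. With only the aggregated mean and sd there is nothing for such a model to work with, which is another reason the replicate-level data would be needed.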

Bob

In reply to Carlos A. Sierra's comment above (Nov 27, 2014, 6:18 AM).

soil-metamodel / stan

Raw data for eCO2mean/sd in SoilR data? #6
