Open JamesHWade opened 2 years ago
My plan for now is to use a role of "descriptor" for variables that describe or otherwise identify the samples. In many cases, these could be outcome variables to model. For experimental variables, I went with "conditions". These would be pretty easy to change at this point, and I'm very open to feedback on them.
From the RStudio Community post...
For these types of data, the rows in the data set are not going to be independent. The independent experimental unit will be something like the sample of material and there will be many other types of variables.
I think that the important part of this project is to distinguish the different classes of variables.
Some terminology I just made up:
technical variables: associated with the type of raw data coming off of the instrument, such as the wavelength, time, etc.
sample-based columns/identifiers: these define the subset of data that should be processed together. Examples might be patient, day, aliquot/subsample, etc.
experimental conditions: these might affect preprocessing or might just be lumped into the sample-based variables. They reflect assay conditions such as fractionation identifiers, (HPLC) column, reagents, etc.
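One way to make the three classes concrete is to attach a role to each column and then select columns by role. The following is a minimal Python sketch of that idea, not the package's API: the column names, the role labels, and the `columns_with_role` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical column-to-role mapping; every name here is illustrative only.
ROLES = {
    "wavelength": "technical",
    "time": "technical",
    "patient": "sample",
    "day": "sample",
    "aliquot": "sample",
    "hplc_column": "condition",
    "reagent": "condition",
}


def columns_with_role(roles, role):
    """Return the column names assigned to `role`, in mapping order."""
    return [name for name, assigned in roles.items() if assigned == role]


technical = columns_with_role(ROLES, "technical")
sample_ids = columns_with_role(ROLES, "sample")
conditions = columns_with_role(ROLES, "condition")
```

Keeping the mapping in one place means a role name can be changed later without touching the code that selects columns.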
I think that the most help we need is on identifying the technical variables for different types of assays.
Here's an example with Raman spectroscopy:
technical variables: intensity (the assay measurement) and wavelength.
sample variables: day.
experimental variables: reactor size.
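For the Raman example, the sample and experimental variables identify which rows belong to the same spectrum, while the technical variables hold the spectrum itself. The sketch below assumes long-format data with one row per wavelength; the values are made up, and `group_spectra` is a hypothetical helper rather than part of any existing package.

```python
from collections import defaultdict

# Made-up long-format rows: one row per (sample, wavelength) measurement.
rows = [
    {"day": 1, "reactor_size": "small", "wavelength": 401, "intensity": 0.15},
    {"day": 1, "reactor_size": "small", "wavelength": 400, "intensity": 0.12},
    {"day": 2, "reactor_size": "large", "wavelength": 400, "intensity": 0.20},
    {"day": 2, "reactor_size": "large", "wavelength": 401, "intensity": 0.22},
]

SAMPLE_VARS = ("day",)                 # sample variables
EXPERIMENTAL_VARS = ("reactor_size",)  # experimental variables


def group_spectra(rows, id_vars):
    """Collect (wavelength, intensity) pairs per unique combination of id_vars."""
    spectra = defaultdict(list)
    for row in rows:
        key = tuple(row[var] for var in id_vars)
        spectra[key].append((row["wavelength"], row["intensity"]))
    # Sort each spectrum by wavelength so downstream steps see ordered data.
    return {key: sorted(pairs) for key, pairs in spectra.items()}


spectra = group_spectra(rows, SAMPLE_VARS + EXPERIMENTAL_VARS)
```

Each key in `spectra` is then one independent experimental unit, which is the level at which preprocessing would be applied.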
Once we have an idea of the technical variables, the actual recipe parts are pretty straightforward (as are the preprocessing methods).