pveber / morse

Companion R package for MOSAIC website

code and documentation mismatch in survDataCheck.R #267

Closed konkam closed 3 years ago

konkam commented 6 years ago

The help of the function states:

duplicateID there are two identical (replicate, conc, time) triplets

However, the code tests whether there are two identical (replicate, time) pairs:

    ID <- idCreate(data) # ID vector
    if (any(duplicated(ID))) {
      msg <- paste("The (replicate, time) pair ", ID[duplicated(ID)],
                   " is duplicated.", sep = "")
      errors <- msgTableAdd(errors, "duplicatedID", msg)
    }

The check for triplets may be more meaningful than the check for pairs, as some users may define replicate rather loosely as "batch of experiments" on a given day, or a given arm of the lab. Therefore they might have replicates with different concentrations.
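To make the difference concrete, here is a minimal sketch in base R. It does not use morse's internal `idCreate`; the toy data frame and the variable names `toy`, `pair_dup` and `triplet_dup` are assumptions made up for illustration. It shows a dataset where "replicate" is used loosely as a batch label shared by two concentrations.

```r
# Hypothetical toy data: replicate 1 is reused for two concentrations,
# as a user might do when "replicate" means "batch of experiments".
toy <- data.frame(
  replicate = c(1, 1, 1, 1),
  conc      = c(0, 0, 10, 10),
  time      = c(0, 1, 0, 1)
)

# Pair check: a row is flagged if an earlier row has the same (replicate, time).
pair_dup <- duplicated(toy[, c("replicate", "time")])

# Triplet check: a row is flagged only if an earlier row has the same
# (replicate, conc, time).
triplet_dup <- duplicated(toy[, c("replicate", "conc", "time")])

any(pair_dup)     # the pair check reports duplicates on this data
any(triplet_dup)  # the triplet check does not
```

With the pair check this dataset would be rejected, whereas the triplet check would accept it, which is the behaviour difference discussed above.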

However this is open to discussion. I think the current definition of replicate in the help of function survData precludes the ambiguity I described above:

replicate: a vector of class integer or factor for replicate identification. A given replicate value should identify the same group of individuals followed in time

virgile-baudrot commented 6 years ago

Thank you for noticing this. Commit c4a3f20 fixes the mismatch: I kept the code and changed the documentation.

About 'replicate', you are totally right: with the new changes the name 'replicate' is misleading (every time series in a dataset is a replicate, whatever its concentration profile). This comes from the fact that time series were previously identified by the pair (concentration, replicate). Now that we also handle varying exposure profiles, where the concentration may change within a single time series, we need a label for every time series, and that label is 'replicate'. Also, the 'replicate' column was previously required for pooling 'true replicates'; this pooling is now automated.
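As a rough illustration of why the (concentration, replicate) pair no longer identifies a time series, consider the sketch below. The data frame `varying` and the counts computed from it are hypothetical and are not taken from the morse code base; series "A" has a concentration that changes over time, while series "B" is constant.

```r
# Hypothetical data: two time series, labelled by 'replicate'.
# Series "A" follows a varying exposure profile; series "B" is constant.
varying <- data.frame(
  replicate = c("A", "A", "A", "B", "B", "B"),
  conc      = c(10, 0, 10, 5, 5, 5),
  time      = c(0, 1, 2, 0, 1, 2)
)

# Identifying series by 'replicate' alone recovers the two series.
n_by_replicate <- length(unique(varying$replicate))

# Identifying series by the (conc, replicate) pair splits series "A"
# into one group per distinct concentration.
n_by_pair <- nrow(unique(varying[, c("conc", "replicate")]))

n_by_replicate
n_by_pair
```

Under these assumptions the pair-based identifier yields more groups than there are time series, which is why a single per-series label is needed once concentrations can vary.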