Closed jakobnissen closed 3 years ago
Thanks for asking for clarification, I'm always interested in improving the docs! It's too bad 0.7.0 isn't ready in time for your students 😕.
The method in question can be thought of as replacing the pool of pop names. This isn't the best method because it's contingent on `meta` being sorted, which isn't always the case. So, if `unique(pdata.meta.population)` gives you, say, 3 elements, this method has you input a vector of length 3 to replace those pops. I'm on my phone, so I can't check it fully, but I believe internally the vector is used to create a dictionary of old => new and run the dictionary method, which is the preferred approach. The docs will be amended for 0.7.0 to reflect this. I'll keep this issue open until the release.
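A rough sketch of the old => new idea in plain Julia, using an ordinary vector as a stand-in for `pdata.meta.population`. The function name `rename_unique` is made up for illustration and this is not PopGen.jl's actual code:

```julia
# Hypothetical stand-in for pdata.meta.population (not a real PopData object).
population = ["A", "A", "B", "C", "C", "B"]

# Replace each unique population name with the matching entry of `newnames`.
# `unique` keeps values in order of first appearance, so the result depends
# on how the samples happen to be ordered.
function rename_unique(pops::Vector{String}, newnames::Vector{String})
    old = unique(pops)
    length(old) == length(newnames) ||
        throw(ArgumentError("expected $(length(old)) names, got $(length(newnames))"))
    lookup = Dict(old .=> newnames)   # old => new
    return [lookup[p] for p in pops]
end

renamed = rename_unique(population, ["north", "south", "east"])
```

Here `"A"` becomes `"north"`, `"B"` becomes `"south"`, and `"C"` becomes `"east"` only because that is the order in which they first appear, which is why passing the old => new pairs explicitly is less error-prone.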
If your students find anything else, please open more issues!
Not that it's helpful now, but 0.7.0 checks the length of the input vector for that method and decides whether to replace the unique values (like I explain above) or the values per sample (as you've described). I'll make sure to exhaustively document that.
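To illustrate the length check, here is a sketch in plain Julia with ordinary vectors. The name `replace_pops` and the order of the checks are my assumptions, not necessarily how 0.7.0 implements it:

```julia
# Decide from the input length whether `newnames` holds one name per sample
# or one name per unique population. Hypothetical helper, not PopGen.jl code.
function replace_pops(pops::Vector{String}, newnames::Vector{String})
    old = unique(pops)
    if length(newnames) == length(pops)
        # One entry per sample: the n'th entry is the n'th sample's population.
        return copy(newnames)
    elseif length(newnames) == length(old)
        # One entry per unique population: map old => new.
        lookup = Dict(old .=> newnames)
        return [lookup[p] for p in pops]
    else
        throw(ArgumentError(
            "input has length $(length(newnames)); expected $(length(pops)) " *
            "(per sample) or $(length(old)) (per unique population)"))
    end
end

pops = ["A", "A", "B", "B"]
per_unique = replace_pops(pops, ["north", "south"])
per_sample = replace_pops(pops, ["w", "x", "y", "z"])
```

If every sample already has its own population, the two lengths coincide, and both interpretations give the same result.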
To replace or ignore `missing`, the recommended way is here, which is `populations!(PopData, samplenames, samplepops)`. This method takes a vector of sample names and a vector of their new population IDs.
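The per-sample idea can be sketched in plain Julia without a PopData object. Everything here (`assign_pops!`, `sample_ids`, the example data) is hypothetical and only mimics the shape of the `populations!(PopData, samplenames, samplepops)` call:

```julia
# Hypothetical stand-ins for the sample names and populations in the metadata.
sample_ids = ["s1", "s2", "s3", "s4"]
pops = Union{String, Missing}["A", missing, "B", missing]

# Assign a new population to each named sample, leaving all others untouched.
function assign_pops!(pops, sample_ids, samplenames, samplepops)
    length(samplenames) == length(samplepops) ||
        throw(ArgumentError("samplenames and samplepops must have equal length"))
    for (name, pop) in zip(samplenames, samplepops)
        idx = findfirst(==(name), sample_ids)
        idx === nothing && throw(ArgumentError("sample $name not found"))
        pops[idx] = pop
    end
    return pops
end

# Fill in only the samples whose population was missing.
assign_pops!(pops, sample_ids, ["s2", "s4"], ["B", "C"])
```

Because samples are matched by name, this does not depend on how the metadata is sorted.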
`0.7.0` is pending release, and as soon as it's merged into the General Registry, I will rebuild the docs with the updates we've discussed. Thanks :smile:
It's quite unclear what it actually does. Currently, it says:
, which all of my students took to mean that the input should be a vector of the same length as the dataframe, such that the n'th entry in the vector became the name of the n'th sample in the dataframe.
Two suggestions for making it better:
Thanks for otherwise great docs!