Closed: hlapp closed this 5 years ago
Yup, that's correct (https://github.com/ropensci/RNeXML/blob/master/R/get_characters.R#L115); sorry, I thought it was in the docs for get_characters, but it isn't. I'd better fix the docs.
Looks like this behavior is reasonable and documented, so closing now.
After an hour of hunting around and debugging, it seems that the get_characters() method will use the IDs of the <otu/> and <char/> elements for the row and column labels of the matrix, respectively, if the respective labels are not unique. Is that correct? Here is an example NeXML file from the Phenoscape API.
In principle this of course makes sense - R wants labels to be unique, or otherwise any kind of subsetting will yield confusing or undesired results. But I'm wondering whether this is documented somewhere prominently - it took me by surprise, and at first, upon seeing the IDs all over the place, I thought that something must have gone wrong. (And it might have on the data side - see phenoscape/phenoscape-kb-services#35 and phenoscape/phenoscape-kb-services#36 - but that's a different story.)
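For anyone else surprised by this, the fallback described above can be sketched in a few lines of base R. This is only an illustration of the behavior as described in this thread, not RNeXML's actual code: the helper pick_dimnames() and the example labels and IDs are made up for the purpose.

```r
# Hypothetical helper (NOT part of RNeXML): use the human-readable labels
# as dimnames when they are all unique, otherwise fall back to the IDs.
pick_dimnames <- function(labels, ids) {
  # anyDuplicated() returns 0 when every element is unique
  if (anyDuplicated(labels) == 0) labels else ids
}

# Made-up example data: two OTUs share the same label
otu_labels  <- c("Danio rerio", "Danio rerio", "Ictalurus punctatus")
otu_ids     <- c("otu1", "otu2", "otu3")
char_labels <- c("fin shape", "fin size")
char_ids    <- c("char1", "char2")

m <- matrix(
  c(0, 1, 1, 1, 0, 0),
  nrow = 3,
  dimnames = list(
    pick_dimnames(otu_labels, otu_ids),    # duplicated labels, so IDs are used
    pick_dimnames(char_labels, char_ids)   # unique labels, so labels are kept
  )
)

rownames(m)  # the OTU IDs
colnames(m)  # the character labels
```

With this input the rows end up named by ID while the columns keep their labels, which matches the "IDs all over the place" effect when the labels in a file are not unique.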