Closed cjyetman closed 2 years ago
On line 63, we force the sourcevar vector to be in all caps to achieve case-insensitive matching, but that's with the assumption/insistence that all non-regex origin codes in codelist are in all caps, because we don't enforce that in the code for countrycode. All non-regex origin codes up until now have been in all caps, but the cctld origin code probably is more technically correct in lower case (though in most environments I've seen, host names are always converted to lowercase anyway).
One proposal would be to change line 201
matchidxs <- match(levels(sourcefctr), dict[[origin]])
to
matchidxs <- match(levels(sourcefctr), toupper(dict[[origin]]))
to ensure that the selected origin code in the dictionary is also converted to all caps when matching, though not modifying it otherwise so the return values remain in the same case as the original.
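A minimal sketch of the effect of that change, assuming a toy three-row dictionary in place of the real codelist (the dict data frame and the sample inputs here are hypothetical; only the two match() lines come from the proposal above):

```r
# Toy stand-in for codelist: lower-case cctld codes, upper-case iso2c codes.
dict <- data.frame(
  cctld = c(".de", ".fr", ".us"),
  iso2c = c("DE", "FR", "US"),
  stringsAsFactors = FALSE
)
origin <- "cctld"
destination <- "iso2c"

# The source vector is upper-cased for case-insensitive matching.
sourcevar <- c(".de", ".FR", ".de", ".xx")
sourcefctr <- factor(toupper(sourcevar))

# Current line: upper-cased levels never match the lower-case dictionary codes.
before <- match(levels(sourcefctr), dict[[origin]])

# Proposed line: upper-case the dictionary column for the comparison only.
matchidxs <- match(levels(sourcefctr), toupper(dict[[origin]]))

# Map each level's match back onto the original vector positions.
result <- dict[[destination]][matchidxs][as.integer(sourcefctr)]
```

Because toupper() is applied only inside match(), dict itself is left untouched, so a conversion with cctld as the destination would still return the lower-case codes.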
I've roughly tested this and it fixes this issue, but I haven't actually made the code change and run it through the testing system to see if it breaks anything else.
Eventually I may get around to making a PR, but feel free to implement this or a similar solution in the meantime if anyone else has the time.
That makes sense to me.
It is currently modelsummary week, so I probably won't get around to this for a little while. But when I find the time I can give it a shot if you haven't already by that time.
BTW, do we want a pkgdown website like the one I just built for modelsummary?
wrt pkgdown, I think that would be cool, and it's pretty easy to set up/manage.
Turns out doing just upper breaks some country name conversions, but we can work around this with cctld-specific if{}else{} logic.
Thanks for the report and solution.
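One way the cctld-specific branch could look is sketched below. This is an illustration under stated assumptions, not the code that was committed: match_origin() is a hypothetical helper, the toy dict is invented, and the sketch assumes that every origin other than cctld should be compared exactly as stored.

```r
# Hypothetical helper: upper-case the dictionary column only for cctld.
match_origin <- function(lvls, dict, origin) {
  if (origin == "cctld") {
    # cctld codes are stored in lower case, so compare in upper case
    match(lvls, toupper(dict[[origin]]))
  } else {
    # all other origins are compared unchanged (assumption of this sketch)
    match(lvls, dict[[origin]])
  }
}

# Toy dictionary with one lower-case and one mixed-case column.
dict <- data.frame(
  cctld = c(".de", ".fr"),
  country.name.en = c("Germany", "France"),
  stringsAsFactors = FALSE
)

cctld_idx <- match_origin(c(".DE", ".FR"), dict, "cctld")
name_idx <- match_origin(c("France", "Germany"), dict, "country.name.en")
```

Keeping the branch inside a single function limits the special case to one place, so other origins keep their existing behaviour.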
First, the help file for codelist lists "ccTLD" as an origin and destination code, but it is named "cctld" in the actual codelist object. Second, I haven't fully investigated this yet, but something is wrong with the matching on that code...
My first guess would be a toupper in countrycode since the error message says it couldn't match ".DE", but I suppose it could be related to the "." in the string, but that shouldn't matter because it shouldn't be running regex on it, right?
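The two guesses can be told apart with a small base R check. This is only a sketch: dict_codes is a hypothetical stand-in for the cctld column, and it assumes the lookup goes through match(), as in the line quoted elsewhere in this thread.

```r
# Hypothetical lower-case dictionary column.
dict_codes <- c(".de", ".fr")

# Upper-casing the input produces the ".DE" seen in the error message.
src <- toupper(".de")

# match() compares strings exactly, so the case difference alone causes a miss.
exact <- match(src, dict_codes)

# match() treats "." literally: "xde" does not match ".de".
dot_literal <- match("xde", dict_codes)

# A regex, by contrast, treats "." as a wildcard.
dot_regex <- grepl(".de", "xde")

# Upper-casing both sides restores the match.
case_fixed <- match(src, toupper(dict_codes))
```

If the lookup is a plain match(), the dot is not the problem and the case mismatch is enough to explain the failure.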