Closed geotheory closed 2 years ago
Obviously not to be confused with Namibia..
NA
by itself in R reduces to a logical vector
class(NA)
#> [1] "logical"
If you use the more specific NA_character_
or include the NA
in a character vector, it works as expected
countrycode::countrycode(NA_character_, origin = 'iso2c', destination = 'country.name')
#> [1] NA
countrycode::countrycode(c(NA, "US"), origin = 'iso2c', destination = 'country.name')
#> [1] NA "United States"
I would argue that countrycode::countrycode(NA, origin = 'iso2c', destination = 'country.name')
should be an error, because NA
is not an expected vector to be passed as an argument.
Ah R's NA quirk strikes again. I hear what you're saying. No strong view though I suspect the approach will mean some some users encounter rare failures e.g. when an automated workflow encounters a single NA input (maybe parsed from JSON) - which in practice can be harder to debug in production.
Ah yeah, this is weird and annoying. I don't have an especially strong view either, but am inclined to defer to @cjyetman, as he always seems to fall on the "correct" side of these debates.
How about we just improve the error message to mention this corner case?
I have a strong feeling that it should be the responsibility of the web scraper (or whatever is generating the data) to return expected objects, e.g. if the web scraper is building a character vector because that's what it will usually find on the page, then it should deal with weird cases like no data or blanks or unfound elements etc. in an appropriate way, i.e. return a character vector with NA
if necessary.
That being said, I'm struggling to think of an edge case where countrycode()
auto returning NA
if sourcevar
is precisely NA
(logical and only one element). It does then open the door though for questions like:
sourcevar
argument to the expected class [I think this is a bad idea]NA
s with length > 1 [getting a bit dicier]NA
of a different class, e.g. a numeric NA
when a character vector is expected [also dicier]So, I don't see any harm in dealing with this very precise case in an automatic way, other than it possibly necessitates justification for why we don't deal with other similar but more complicated cases in an automatic way, which could then lead to creating and maintaining very sophisticated auto-fixing NA
problems code, all to to deal with something that really just shouldn't be happening.
Improved error message would be good - esp if it can be tailored to this failure case.
coercion in R is typically uni-directional, e.g. from "higher" classes to "lower" classes, but not the other way around
fe1c4a6c0a
Is it not desirable to accept NA values as inputs and simply return NAs for them? I think this would help users to avoid having to create a wrapper function.