vincentarelbundock / countrycode

R package: Convert country names and country codes. Assigns region descriptors.
https://vincentarelbundock.github.io/countrycode
GNU General Public License v3.0

FAO: China vs. China mainland #266

Closed: vincentarelbundock closed this issue 2 years ago

vincentarelbundock commented 3 years ago

Potential issue reported via email:

I think I have spotted a problem. I have been working with a database downloaded from FAOSTAT in which China (code 351) is an aggregate: the sum of China, mainland + Taiwan + Macao + Hong Kong (codes 41, 214, 128 and 96, respectively). When I add an iso2c column to the database using countrycode with the option origin = "fao", the aggregate China (code 351) gets "CN", China, mainland gets NA, and Taiwan, Macao and Hong Kong get their respective iso2c codes. As a result, totals summed over the coded rows are too high, because China (code 351) already includes the values for Taiwan, Macao and Hong Kong. The solution would be to assign iso2c = "CN" to China, mainland (code 41) instead of China (code 351).

Given that the main purpose of countrycode is to convert from one code to another, I think the most salient question is:

Do the majority of other codes use "China/CHN/etc." to refer to mainland China only, or to some kind of aggregate?

My impressionistic sense is that it is mainland China. In that case, we should make the change suggested by the emailer. This would also be consistent with us having separate regexes for Taiwan and Hong Kong, for example.
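
To make the double counting in the report concrete, here is a minimal Python sketch. It is not countrycode itself (which is an R package) and it does not query the package: the values are invented for illustration, and the two mappings simply encode the behaviour described in the email and the fix the emailer proposes. The helper name `total_over_coded_rows` is hypothetical.

```python
# Made-up values keyed by FAO area code (illustration only, not real FAOSTAT data).
values = {
    41: 100.0,   # China, mainland
    214: 10.0,   # Taiwan
    128: 1.0,    # Macao
    96: 5.0,     # Hong Kong
}
COMPONENTS = tuple(values)  # the four areas that make up the aggregate

# FAO code 351 ("China") is the sum of the four areas above.
values[351] = sum(values[c] for c in COMPONENTS)

# Mapping as described in the email: the aggregate gets "CN", mainland gets nothing.
reported = {351: "CN", 41: None, 214: "TW", 128: "MO", 96: "HK"}
# Mapping proposed in the email: mainland gets "CN", the aggregate gets nothing.
proposed = {351: None, 41: "CN", 214: "TW", 128: "MO", 96: "HK"}


def total_over_coded_rows(values, mapping):
    """Sum the values of every row that received an iso2c code."""
    return sum(v for code, v in values.items() if mapping.get(code) is not None)


true_total = sum(values[c] for c in COMPONENTS)
total_reported = total_over_coded_rows(values, reported)
total_proposed = total_over_coded_rows(values, proposed)

print(true_total, total_reported, total_proposed)
```

Under the reported mapping the total overshoots by exactly the Taiwan, Macao and Hong Kong values, since they are counted once inside the aggregate and once on their own. Under the proposed mapping the total over coded rows matches the sum of the four components.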