Snehal-Rajwar closed this issue 7 months ago
Thank you for opening this issue @Snehal-Rajwar
1) What dataset were you trying to label exactly? I assumed it was nrg_cb_oil
and tried to replicate your error - everything seemed to work fine for me:
> nrg <- get_eurostat("nrg_cb_oil")
trying URL 'https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/data/nrg_cb_oil?format=TSV&compressed=true'
downloaded 3.6 MB
Table nrg_cb_oil cached at /var/folders/f4/h_r3y60n0nn0qm6qx5hnx1s00000gn/T//Rtmp3ypzad/eurostat/4ef7e150263290f09e117c721782a317.rds
> nrg_l <- label_eurostat(nrg)
All countries (geo codes) labeled correctly:
> unique(nrg_l$geo)
[1] "Albania"
[2] "Austria"
[3] "Bosnia and Herzegovina"
[4] "Belgium"
[5] "Bulgaria"
[6] "Cyprus"
[7] "Czechia"
[8] "Germany"
[9] "Denmark"
[10] "Euro area – 20 countries (from 2023)"
[11] "Estonia"
[12] "Greece"
[13] "Spain"
[14] "European Union - 27 countries (from 2020)"
[15] "Finland"
[16] "France"
[17] "Georgia"
[18] "Croatia"
[19] "Hungary"
[20] "Ireland"
[21] "Iceland"
[22] "Italy"
[23] "Liechtenstein"
[24] "Lithuania"
[25] "Luxembourg"
[26] "Latvia"
[27] "Moldova"
[28] "Montenegro"
[29] "North Macedonia"
[30] "Malta"
[31] "Netherlands"
[32] "Norway"
[33] "Poland"
[34] "Portugal"
[35] "Romania"
[36] "Serbia"
[37] "Sweden"
[38] "Slovenia"
[39] "Slovakia"
[40] "Türkiye"
[41] "Ukraine"
[42] "United Kingdom"
[43] "Kosovo*"
2) Are you running version 4.0.0 of the package? Could you post your sessionInfo()?
These are the packages. The columns with the labelling issue are partner and nrg_bal. Yes, I am running 4.0.0. Let me know what you see, and whether it is a problem with the package.
I tried downloading and labelling the first 2 datasets from your example and encountered no problems, warning messages or errors:
> nrg_cb_oil <- get_eurostat("nrg_cb_oilm")
trying URL 'https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/data/nrg_cb_oilm?format=TSV&compressed=true'
downloaded 2.0 MB
Table nrg_cb_oilm cached at /var/folders/f4/h_r3y60n0nn0qm6qx5hnx1s00000gn/T//Rtmpp1ZfUP/eurostat/66a8a5a0de29b6c28bc55a6fa8718dc5.rds
> nrg_cb_oil_l <- label_eurostat(nrg_cb_oil)
> nrg_cb_stk <- get_eurostat("nrg_stk_oilm")
trying URL 'https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/data/nrg_stk_oilm?format=TSV&compressed=true'
downloaded 3.5 MB
Table nrg_stk_oilm cached at /var/folders/f4/h_r3y60n0nn0qm6qx5hnx1s00000gn/T//Rtmpp1ZfUP/eurostat/fe4d35a1fe401f31c894e6af39de2f4d.rds
> nrg_cb_stk_l <- label_eurostat(nrg_cb_stk)
When comparing our sessionInfo outputs, I notice that you are running slightly older versions of R packages and a version of R that is 2 years older. Here's my sessionInfo for reference:
> sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.2.1
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/Helsinki
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] eurostat_4.0.0
loaded via a namespace (and not attached):
[1] tidyr_1.3.0 rappdirs_0.3.3 utf8_1.2.3 generics_0.1.3
[5] class_7.3-22 xml2_1.3.6 KernSmooth_2.23-22 stringi_1.7.12
[9] hms_1.1.3 digest_0.6.33 magrittr_2.0.3 countrycode_1.5.0
[13] timechange_0.2.0 ISOweek_0.6-2 cellranger_1.1.0 rprojroot_2.0.3
[17] plyr_1.8.9 jsonlite_1.8.7 e1071_1.7-13 backports_1.4.1
[21] httr_1.4.7 purrr_1.0.2 fansi_1.0.5 regions_0.1.8
[25] bibtex_0.5.1 httr2_0.2.3 cli_3.6.2 rlang_1.1.3
[29] tools_4.3.2 tzdb_0.4.0 dplyr_1.1.3 here_1.0.1
[33] curl_5.2.0 assertthat_0.2.1 vctrs_0.6.4 R6_2.5.1
[37] proxy_0.4-27 lifecycle_1.0.3 lubridate_1.9.3 classInt_0.4-10
[41] RefManageR_1.4.0 stringr_1.5.0 pkgconfig_2.0.3 pillar_1.9.0
[45] data.table_1.14.10 glue_1.6.2 Rcpp_1.0.11 tibble_3.2.1
[49] tidyselect_1.2.0 rstudioapi_0.15.0 readr_2.1.4 compiler_4.3.2
[53] readxl_1.4.3
From this information I can't say what the point of failure is when you're attempting to label things. Generally speaking, it may help to update your packages and/or R, although I'm not claiming that my own packages are as up to date as they could be.
Hi, thanks. I didn't realise my computer was updating it once I downloaded it. It did solve the problem for the first two datasets, but I am still having a labelling problem with the partner column in the others. Can you check those datasets as well?
My current session info. Really appreciate the help. Thanks!
Thank you for the update. I tried debugging the code, and it seems to me that the only items the function is not able to label are "NA" items. Internally, unlabelled codes are saved in variable x and labelled titles are saved in variable y, and from that we get (while in debug(label_eurostat)):
# The positions of NA items in y
head(which(is.na(y)))
[1] 318979 318980 318981 318982 318983 318984
# The number of NA items (length of which(is.na(y)) )
length(which(is.na(y)))
[1] 37665
# the contents of x indexes where y is NA
head(x[which(is.na(y))])
[1] "NA" "NA" "NA" "NA" "NA" "NA"
# unique items
unique(x[which(is.na(y))])
[1] "NA"
So to me it seems that while the warning message may be a bit alarming, the function mainly works as it should.
> unique(nrg_ti_trade$partner)
[1] "AD" "AE" "AFR_OTH" "AL"
[5] "AM" "AME_LAT" "AME_OTH" "AN"
[9] "AO" "AR" "ASI_NME" "ASI_NME_OTH"
[13] "ASI_OTH" "AT" "AU" "AW"
[17] "AZ" "BA" "BB" "BD"
[21] "BE" "BG" "BH" "BJ"
[25] "BN" "BO" "BR" "BS"
[29] "BY" "BZ" "CA" "CD"
[33] "CG" "CH" "CI" "CL"
[37] "CM" "CN" "CN_X_HK" "CO"
[41] "CR" "CU" "CV" "CW"
[45] "CY" "CZ" "DE" "DJ"
[49] "DK" "DO" "DZ" "EC"
[53] "EE" "EG" "EL" "ER"
[57] "ES" "ET" "EU27_2020" "EU28"
[61] "EUR_OTH" "EX_SU_OTH" "FI" "FR"
[65] "GA" "GE" "GH" "GI"
[69] "GQ" "GT" "GW" "HK"
[73] "HN" "HR" "HU" "ID"
[77] "IE" "IL" "IN" "IQ"
[81] "IR" "IS" "IT" "JM"
[85] "JO" "JP" "KE" "KG"
[89] "KH" "KP" "KR" "KW"
[93] "KZ" "LA" "LB" "LI"
[97] "LK" "LR" "LT" "LU"
[101] "LV" "LY" "MA" "MD"
[105] "ME" "MG" "MH" "MK"
[109] "MM" "MN" "MR" "MT"
[113] "MU" "MX" "MY" "MZ"
[117] "NA" "NC" "NE" "NG"
[121] "NL" "NO" "NP" "NSP"
[125] "NZ" "OM" "PA" "PE"
[129] "PG" "PH" "PK" "PL"
[133] "PT" "QA" "RO" "RS"
[137] "RU" "SA" "SD" "SE"
[141] "SG" "SI" "SK" "SL"
[145] "SN" "SS" "ST" "SY"
[149] "TG" "TH" "TJ" "TL"
[153] "TM" "TN" "TOTAL" "TR"
[157] "TT" "TW" "TZ" "UA"
[161] "UG" "UK" "US" "UY"
[165] "UZ" "VE" "VG" "VN"
[169] "XK" "YE" "ZA" "EX_YU_OTH"
> unique(nrg_ti_trade_l$partner)
[1] "Andorra"
[2] "United Arab Emirates"
[3] "Other African countries (aggregate changing according to the context)"
[4] "Albania"
[5] "Armenia"
[6] "Latin American countries"
[7] "Other American countries (aggregate changing according to the context)"
[8] "Netherlands Antilles"
[9] "Angola"
[10] "Argentina"
[11] "Near and Middle East Asia (aggregate changing according to the context)"
[12] "Other Near and Middle East Asian countries"
[13] "Other Asian countries (aggregate changing according to the context)"
[14] "Austria"
[15] "Australia"
[16] "Aruba"
[17] "Azerbaijan"
[18] "Bosnia and Herzegovina"
[19] "Barbados"
[20] "Bangladesh"
[21] "Belgium"
[22] "Bulgaria"
[23] "Bahrain"
[24] "Benin"
[25] "Brunei Darussalam"
[26] "Bolivia"
[27] "Brazil"
[28] "Bahamas"
[29] "Belarus"
[30] "Belize"
[31] "Canada"
[32] "Democratic Republic of the Congo"
[33] "Congo"
[34] "Switzerland"
[35] "Côte d’Ivoire"
[36] "Chile"
[37] "Cameroon"
[38] "China"
[39] "China except Hong Kong"
[40] "Colombia"
[41] "Costa Rica"
[42] "Cuba"
[43] "Cabo Verde"
[44] "Curaçao"
[45] "Cyprus"
[46] "Czechia"
[47] "Germany"
[48] "Djibouti"
[49] "Denmark"
[50] "Dominican Republic"
[51] "Algeria"
[52] "Ecuador"
[53] "Estonia"
[54] "Egypt"
[55] "Greece"
[56] "Eritrea"
[57] "Spain"
[58] "Ethiopia"
[59] "European Union - 27 countries (from 2020)"
[60] "European Union - 28 countries (2013-2020)"
[61] "Other European countries (aggregate changing according to the context)"
[62] "Other countries of former Soviet Union (before 1991)"
[63] "Finland"
[64] "France"
[65] "Gabon"
[66] "Georgia"
[67] "Ghana"
[68] "Gibraltar"
[69] "Equatorial Guinea"
[70] "Guatemala"
[71] "Guinea-Bissau"
[72] "Hong Kong"
[73] "Honduras"
[74] "Croatia"
[75] "Hungary"
[76] "Indonesia"
[77] "Ireland"
[78] "Israel"
[79] "India"
[80] "Iraq"
[81] "Iran"
[82] "Iceland"
[83] "Italy"
[84] "Jamaica"
[85] "Jordan"
[86] "Japan"
[87] "Kenya"
[88] "Kyrgyzstan"
[89] "Cambodia"
[90] "North Korea"
[91] "South Korea"
[92] "Kuwait"
[93] "Kazakhstan"
[94] "Laos"
[95] "Lebanon"
[96] "Liechtenstein"
[97] "Sri Lanka"
[98] "Liberia"
[99] "Lithuania"
[100] "Luxembourg"
[101] "Latvia"
[102] "Libya"
[103] "Morocco"
[104] "Moldova"
[105] "Montenegro"
[106] "Madagascar"
[107] "Marshall Islands"
[108] "North Macedonia"
[109] "Myanmar/Burma"
[110] "Mongolia"
[111] "Mauritania"
[112] "Malta"
[113] "Mauritius"
[114] "Mexico"
[115] "Malaysia"
[116] "Mozambique"
[117] NA
[118] "New Caledonia"
[119] "Niger"
[120] "Nigeria"
[121] "Netherlands"
[122] "Norway"
[123] "Nepal"
[124] "Not specified"
[125] "New Zealand"
[126] "Oman"
[127] "Panama"
[128] "Peru"
[129] "Papua New Guinea"
[130] "Philippines"
[131] "Pakistan"
[132] "Poland"
[133] "Portugal"
[134] "Qatar"
[135] "Romania"
[136] "Serbia"
[137] "Russia"
[138] "Saudi Arabia"
[139] "Sudan"
[140] "Sweden"
[141] "Singapore"
[142] "Slovenia"
[143] "Slovakia"
[144] "Sierra Leone"
[145] "Senegal"
[146] "South Sudan"
[147] "São Tomé and Príncipe"
[148] "Syria"
[149] "Togo"
[150] "Thailand"
[151] "Tajikistan"
[152] "Timor-Leste"
[153] "Turkmenistan"
[154] "Tunisia"
[155] "Total"
[156] "Türkiye"
[157] "Trinidad and Tobago"
[158] "Taiwan"
[159] "Tanzania"
[160] "Ukraine"
[161] "Uganda"
[162] "United Kingdom"
[163] "United States"
[164] "Uruguay"
[165] "Uzbekistan"
[166] "Venezuela"
[167] "British Virgin Islands"
[168] "Viet Nam"
[169] "Kosovo*"
[170] "Yemen"
[171] "South Africa"
[172] "Other countries of former Yugoslavia (before 1992)"
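For anyone who wants to repeat the check above without stepping into debug(label_eurostat), here is a minimal base R sketch of the same idea: look up each code in a dictionary and list the distinct codes that come back unlabelled. The vectors codes and dic are hypothetical stand-ins I made up for illustration; they are not the package's internal x and y, and not the real Eurostat dictionary. My assumption, not confirmed in this thread, is that the string "NA" in the partner column is the ISO country code for Namibia rather than a missing value.

```r
# Hypothetical partner codes; "NA" here is a character string, not a missing value.
codes <- c("MZ", "NA", "NC", "NA", "NE")

# Hypothetical code -> label lookup that lacks an entry for "NA"
# (an assumption for illustration, not the real Eurostat dictionary).
dic <- c(MZ = "Mozambique", NC = "New Caledonia", NE = "Niger")

# Label by name lookup; codes without a dictionary entry become NA.
labels <- unname(dic[codes])

# Distinct codes that failed to label, and the number of rows they cover.
unlabelled <- unique(codes[is.na(labels)])
n_unlabelled <- sum(is.na(labels))

print(unlabelled)
print(n_unlabelled)
```

If the only entry in unlabelled is "NA", the remaining codes were labelled and the warning concerns that single code.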
Thank you, that worked fine. Also, thanks for the help and the additional information on the package.
I have been going in circles about this: the new package has been continuously having issues with labelling. Multiple columns, such as partner and nrg_bal, and even geo (the location), do not seem to find all the matches. It's breaking existing code and sequences extensively. I was wondering whether there will be a resolution for this particular function soon. Let me know if I am missing something on my end, but I believe it's a simple function that should not cause this error.