ropensci / comtradr

Functions for Interacting with the UN Comtrade API
https://docs.ropensci.org/comtradr
64 stars 17 forks source link

Error after registration of API token #40

Closed dakilian closed 1 year ago

dakilian commented 2 years ago

When I use the ct_search function after the regestration of an API token (function: ct_register_token() ), I get the error message "Error: Comtrade API request failed, with status code [500]". The same query is working fine, if there a) no token is registerd or b) the registered token is invalid.

Maybe it's the same problem as mentioned in #34 as "issue 2". I spent several houres on it, but could not fix it or at least understand it.

BTW: I also tried to use the API directly via httr. So far I did not run into any restrictions (max. number of elements, more than 100 queries/h) - wether I used an token or not in the URL. But the ct_search functions stops with an error if there were more than 100 queries...

It would be nice if someone could help me and fix this problem, so that I don't have to write a new query function myself. Thanks.

zlfccnu commented 2 years ago

I have also encoutered the same error, please anyone can help

ChrisMuir commented 2 years ago

Hello, I will be able to give this attention early next week.

To start, can you load the comtradr pkg using library(comtradr), then share the output of sessionInfo()?

zlfccnu commented 2 years ago

Hello, I will be able to give this attention early next week.

To start, can you load the comtradr pkg using library(comtradr), then share the output of sessionInfo()?

R version 4.2.0 (2022-04-22 ucrt) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale: [1] LC_COLLATE=Chinese (Simplified)_China.utf8 [2] LC_CTYPE=Chinese (Simplified)_China.utf8
[3] LC_MONETARY=Chinese (Simplified)_China.utf8 [4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_China.utf8

attached base packages: [1] parallel stats graphics grDevices utils datasets [7] methods base

other attached packages: [1] ConnectednessApproach_1.0.0 RSiena_1.3.0.1
[3] multinet_4.0.1 RColorBrewer_1.1-3
[5] openxlsx_4.2.5 Spillover_0.1.0.1
[7] fastSOM_1.0.1 vars_1.5-6
[9] lmtest_0.9-40 urca_1.3-0
[11] strucchange_1.5-2 sandwich_3.0-1
[13] countrycode_1.4.0 archive_1.1.4
[15] haven_2.5.0 vroom_1.5.7
[17] comtradr_0.3.0 jsonlite_1.8.0
[19] echarts4r_0.4.3 patchwork_1.1.1
[21] econophysics_0.2.8 furrr_0.2.3
[23] future_1.25.0 DEoptim_2.2-6
[25] PortfolioAnalytics_1.1.0 PerformanceAnalytics_2.0.4 [27] foreach_1.5.2 xts_0.12.1
[29] zoo_1.8-10 Rlibeemd_1.4.2
[31] kernlab_0.9-30 moments_0.14
[33] fields_13.3 viridis_0.6.2
[35] viridisLite_0.4.0 spam_2.8-0
[37] quadprog_1.5-8 MASS_7.3-56
[39] RcppEigen_0.3.3.9.2 igraph_1.3.1
[41] BH_1.78.0-0 Rcpp_1.0.8.3
[43] data.table_1.14.2 forcats_0.5.1
[45] stringr_1.4.0 dplyr_1.0.9
[47] purrr_0.3.4 readr_2.1.2
[49] tidyr_1.2.0 tibble_3.1.6
[51] ggplot2_3.3.6 tidyverse_1.3.1
[53] tidygraph_1.2.1 pacman_0.5.1

loaded via a namespace (and not attached): [1] readxl_1.4.0 backports_1.4.1
[3] splines_4.2.0 gmp_0.6-5
[5] listenv_0.8.0 digest_0.6.29
[7] htmltools_0.5.2 rmgarch_1.3-9
[9] fansi_1.0.3 Rsolnp_1.16
[11] magrittr_2.0.3 ks_1.13.5
[13] tzdb_0.3.0 globals_0.14.0
[15] modelr_0.1.8 prettyunits_1.1.1
[17] colorspace_2.0-3 rvest_1.0.2
[19] xfun_0.31 crayon_1.5.1
[21] survival_3.3-1 iterators_1.0.14
[23] glue_1.6.2 gtable_0.3.0
[25] MatrixModels_0.5-0 car_3.0-13
[27] Rmpfr_0.8-7 shape_1.4.6
[29] maps_3.4.0 SparseM_1.81
[31] abind_1.4-5 scales_1.2.0
[33] mvtnorm_1.1-3 DBI_1.1.2
[35] GeneralizedHyperbolic_0.8-4 xtable_1.8-4
[37] progress_1.2.2 mclust_5.4.10
[39] spd_2.0-1 bit_4.0.4
[41] dotCall64_1.0-1 truncnorm_1.0-8
[43] glmnet_4.1-4 htmlwidgets_1.5.4
[45] httr_1.4.3 SkewHyperbolic_0.4-0
[47] ellipsis_0.3.2 ff_4.0.7
[49] pkgconfig_2.0.3 dbplyr_2.1.1
[51] utf8_1.2.2 tidyselect_1.1.2
[53] rlang_1.0.2 later_1.3.0
[55] munsell_0.5.0 cellranger_1.1.0
[57] tools_4.2.0 cli_3.3.0
[59] generics_0.1.2 frequencyConnectedness_0.2.3 [61] broom_0.8.0 fastmap_1.1.0
[63] knitr_1.39 bit64_4.0.5
[65] fs_1.5.2 zip_2.2.0
[67] pbapply_1.5-0 nlme_3.1-157
[69] quantreg_5.93 mime_0.12
[71] pracma_2.3.8 xml2_1.3.3
[73] Bessel_0.6-0 compiler_4.2.0
[75] rstudioapi_0.13 curl_4.3.2
[77] reprex_2.0.1 pcaPP_2.0-1
[79] stringi_1.7.6 lattice_0.20-45
[81] Matrix_1.4-1 vctrs_0.4.1
[83] pillar_1.7.0 lifecycle_1.0.1
[85] rugarch_1.4-8 corpcor_1.6.10
[87] httpuv_1.6.5 R6_2.5.1
[89] onewaytests_2.6 promises_1.2.0.1
[91] KernSmooth_2.23-20 gridExtra_2.3
[93] parallelly_1.31.1 codetools_0.2-18
[95] assertthat_0.2.1 nortest_1.0-4
[97] withr_2.5.0 hms_1.1.1
[99] grid_4.2.0 WeightedPortTest_1.0
[101] carData_3.0-5 DistributionUtils_0.6-0
[103] numDeriv_2016.8-1.1 shiny_1.7.1
[105] lubridate_1.8.0

Above is my sessioninfo. I have been working around for a while to check the possible error. It seems that the error code 500 is related to the potential worng api call for some parameter combination when using token. May the server side problem? Thanks @ChrisMuir

ChrisMuir commented 2 years ago

Thanks a bunch @zlfccnu . Yeah incorrect url parameter formatting was one potential that occurred to me. I'll be traveling tomorrow thru Sunday, I hope to jump on this Monday or Tuesday. Thanks.

zlfccnu commented 2 years ago

Really appreciate your help. If it is necessary, I can share my token.

dakilian commented 2 years ago

Thanks @ChrisMuir for taking on this issue! Below you will find also my sessionInfo() output.

sessionInfo() R version 4.1.2 (2021-11-01) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale: [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.8 purrr_0.3.4 readr_2.1.2 tidyr_1.2.0
[7] tibble_3.1.6 ggplot2_3.3.5 tidyverse_1.3.1 here_1.0.1 openxlsx_4.2.5 tradestatistics_4.2 [13] comtradr_0.3.0.09000

loaded via a namespace (and not attached): [1] Rcpp_1.0.8 lubridate_1.8.0 arrow_8.0.0 assertthat_0.2.1 rprojroot_2.0.2 digest_0.6.29 utf8_1.2.2 R6_2.5.1
[9] cellranger_1.1.0 backports_1.4.1 reprex_2.0.1 httr_1.4.3 pillar_1.7.0 rlang_1.0.2 curl_4.3.2 readxl_1.3.1
[17] rstudioapi_0.13 data.table_1.14.2 bit_4.0.4 munsell_0.5.0 broom_0.7.12 compiler_4.1.2 modelr_0.1.8 pkgconfig_2.0.3
[25] tidyselect_1.1.2 httpcode_0.3.0 fansi_1.0.2 crayon_1.5.0 tzdb_0.2.0 dbplyr_2.1.1 withr_2.5.0 crul_1.2.0
[33] grid_4.1.2 jsonlite_1.8.0 gtable_0.3.0 lifecycle_1.0.1 DBI_1.1.2 magrittr_2.0.2 scales_1.1.1 zip_2.2.0
[41] cli_3.2.0 stringi_1.7.6 cachem_1.0.6 fs_1.5.2 xml2_1.3.3 ellipsis_0.3.2 generics_0.1.2 vctrs_0.3.8
[49] tools_4.1.2 bit64_4.0.5 glue_1.6.2 hms_1.1.1 fastmap_1.1.0 colorspace_2.0-3 rvest_1.0.2 memoise_2.0.1
[57] haven_2.4.3

morrisseyj commented 2 years ago

I am having the same problem. Posting session info in the hope of confirming the issue.

sessionInfo()


R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7 purrr_0.3.4 readr_2.1.1 tidyr_1.1.4 tibble_3.1.6 ggplot2_3.3.5 tidyverse_1.3.1 comtradr_0.3.0

loaded via a namespace (and not attached): [1] Rcpp_1.0.7 cellranger_1.1.0 pillar_1.6.4 compiler_4.1.2 dbplyr_2.1.1 tools_4.1.2 jsonlite_1.7.2 lubridate_1.8.0 lifecycle_1.0.1 gtable_0.3.0
[11] pkgconfig_2.0.3 rlang_0.4.12 reprex_2.0.1 cli_3.1.0 rstudioapi_0.13 DBI_1.1.1 curl_4.3.2 haven_2.4.3 xml2_1.3.3 withr_2.4.3
[21] httr_1.4.2 fs_1.5.1 generics_0.1.1 vctrs_0.3.8 hms_1.1.1 grid_4.1.2 tidyselect_1.1.1 glue_1.5.1 R6_2.5.1 fansi_0.5.0
[31] readxl_1.3.1 tzdb_0.2.0 modelr_0.1.8 magrittr_2.0.1 backports_1.3.0 scales_1.1.1 ellipsis_0.3.2 rvest_1.0.2 assertthat_0.2.1 colorspace_2.0-2 [41] utf8_1.2.2 stringi_1.7.6 munsell_0.5.0 broom_0.7.10 crayon_1.4.2

bisgr commented 2 years ago

I have the same problem by using the ct_search function after the regestration of an API token. Below you will find also my sessionInfo() output:

library(comtradr) sessionInfo() R version 4.2.0 (2022-04-22 ucrt) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale: [1] LC_COLLATE=English_United Kingdom.utf8 LC_CTYPE=English_United Kingdom.utf8 LC_MONETARY=English_United Kingdom.utf8 [4] LC_NUMERIC=C LC_TIME=English_United Kingdom.utf8

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] comtradr_0.3.0.09000

loaded via a namespace (and not attached): [1] magrittr_2.0.3 usethis_2.1.6 devtools_2.4.3 pkgload_1.2.4 R6_2.5.1 rlang_1.0.2 fastmap_1.1.0
[8] httr_1.4.3 tools_4.2.0 pkgbuild_1.3.1 sessioninfo_1.2.2 cli_3.3.0 withr_2.5.0 ellipsis_0.3.2
[15] remotes_2.4.2 rprojroot_2.0.3 lifecycle_1.0.1 crayon_1.5.1 brio_1.1.3 processx_3.5.3 purrr_0.3.4
[22] callr_3.7.0 fs_1.5.2 ps_1.7.0 curl_4.3.2 testthat_3.1.4 memoise_2.0.1 glue_1.6.2
[29] cachem_1.0.6 compiler_4.2.0 desc_1.4.1 prettyunits_1.1.1 jsonlite_1.8.0

ChrisMuir commented 2 years ago

@zlfccnu Can you shoot me an email? My email address is listed in the DESCRIPTION file of this repo.

zlfccnu commented 2 years ago

@ChrisMuir token to you email, pls check it

ChrisMuir commented 2 years ago

Update on this, I've identified the issue, the max url param in the pkg code is getting set to 250000 when there's a valid token that's registered during the R session. Per the Comtrade API docs, the ceiling on the max param is 250000, but the API call is limited by the max associated to the token that's provided. The API will return 500's if the max param is set too high. Once I set the max param back down to 100000, it's working with a valid token.

I'll be pushing an update to the repo tomorrow, and will hopefully get a release out to CRAN this week.

Thank you to @zlfccnu for the help with this.

dakilian commented 2 years ago

Thanks @ChrisMuir for fixing the issue. I'am looking forward for the update of the repo.

ChrisMuir commented 2 years ago

Ok just pushed what I believe is the fix for this, in commit 9146cf2. I'm still seeing intermittent 500's from ct_search(), but they don't seem to be related to adding a token url parameter. Just seem to be getting more 500's than I remember from the Comtrade API.

@zlfccnu @dakilian @morrisseyj @bisgr If you all wouldn't mind, pulling the latest DEV version from this repo, kicking the tires and reporting back how it works for you, I'd really appreciate it. Curious to see if you're able to make API calls using a token, and if you still get some intermittent 500's. Thanks all!

zlfccnu commented 2 years ago

@ChrisMuir hi, thanks for you effort, But I still occassionally hit the 500 codes: image

ct_search(reporters ="Angola",partners="all", trade_direction ="all",freq = "annual",start_date = "2007",end_date = "2011",commod_codes = codes) the codes are codes=c(solar = "854140",hydro="8410",wind ="850231", biomass = "440130",coal = "2701", oil = "2709", lng ="271111",ng="271121")

after trying the above code for nearly 10 times, I get the result image

I have manually reinstall the pkg with devtools.

ChrisMuir commented 2 years ago

Hey, yeah that's similar to what I'm seeing. I've tested in a bunch of different R sessions, restarting each time. Each time, I run a ct_search() about 10 times. Sometimes they all run successfully with no errors, sometimes the first few will return 500's. It does feel like it's only the first 2-6 API calls that fail though, and once one is successful, then they all will start returning 200's. I don't know, I'm curious to see if anyone else is able to test this.

zlfccnu commented 2 years ago

Hey, yeah that's similar to what I'm seeing. I've tested in a bunch of different R sessions, restarting each time. Each time, I run a ct_search() about 10 times. Sometimes they all run successfully with no errors, sometimes the first few will return 500's. It does feel like it's only the first 2-6 API calls that fail though, and once one is successful, then they all will start returning 200's. I don't know, I'm curious to see if anyone else is able to test this.

yes, what I am seeing now on my side is almost the same, but I have not see the 200's now.

bisgr commented 2 years ago

Thanks @ChrisMuir for fixing the issue. I run the following:

q=ct_search(reporters = "all", partners = "all", trade_direction = "imports", start_date = 2019, end_date = 2019, commod_codes="1005")

I took 500's in 3 out of 10 times that I run the same command (in the rest 7 I took normal results).

dakilian commented 2 years ago

I played around with it a bit and was able to replicate the observations of @ChrisMuir and @bisgr. Sending a single query with ct_search() returns the 500 error more often than it succeeds. When I run a for-loop with the queries, it also fails frequently at the beginning. Mostly right at the very first query. But if the query is successful, more than 100 further queries are also carried out successfully without the loop being interrupted by a new error.

ChrisMuir commented 2 years ago

When I run a for-loop with the queries, it also fails frequently at the beginning. Mostly right at the very first query. But if the query is successful, more than 100 further queries are also carried out successfully without the loop being interrupted by a new error.

Yeah this is exactly what I'm seeing. With each new session, it's like it has to get through N number of 500's, then once it finally returns a 200, all subsequent calls return 200.

I don't quite know what to make of it.

dakilian commented 2 years ago

Thanks to @ChrisMuir for the support. However, the fact that get_comtrade returns an error is annoying. Two thoughts on this.

  1. If I understand the API documentation correctly, I am already authorised via my IP address because of the institutional access to comtrade. Registering a token is only necessary if I don't want to access with the institutional IP address (see API documentation). However, comtadr only allows 100 queries if no token is registered. Since the problems described above only occur after the token has been registered, it would be helpful, to be able to remove the restriction implemented in the comtradr package without registering a token.

  2. It is not nice - I know - but if there is no solution for the API error: What about repeating the query several times first if the 500 error occurs, before interrupting get_comtrade with an error?

morrisseyj commented 2 years ago

Thanks for your work on this @ChrisMuir. I ended up needing the data and so wrote an implementation of the API, not nicely wrapped as you have it here. To that end I have yet to download the updated version you submitted and try with that. However on the 500 error, I am seeing the same thing calling the API outside of the comtradr wrapper. I can generate a URL, and have it return data and then have the exact same URL return a 500 error one second later.

Look forward to testing the contradr package soon, but thought i would add this context.

morrisseyj commented 2 years ago

Hi all, i have been in touch with the people at COMTRADE regarding the 500 error. They indicated that its a result of the server being overloaded. I asked what they thought of calling the API repeatedly when receiving a 500 error, until a valid result came through - this is obviously a problem as it further stresses the server. He indicated to me that they have forwarded this question onto the IT technical team, and asked for a couple of days for them to resolve. Happy to keep people updated.

ChrisMuir commented 2 years ago

Oh wow, that's great that you heard back from them @morrisseyj , I've been submitting the issue via their bug tracker intake form, but haven't heard anything back.

To follow up on this, we've established that the issue is with the underlying API. I am 100% not inclined to add logic into this package to retry when it gets a 5XX response back. In my view, it's on the end user to create their own retry logic, if they choose to do so. I cannot account for the Comtrade API being buggy or having issues, all I can do is report those bugs to the UN Comtrade team when we know about them.

morrisseyj commented 2 years ago

@ChrisMuir agree with you on not trying to fix the server side issues at comtrade within the wrapper for the API. Thanks again for your work on this.

Some bad news from the people at comtrade. This is the response i got from the person who passed my issue to IT:

Quote:” not really sure what have changed in Windows servers lately that we have received many mails with this issue, and it only comes from premium users that are using a token in the API calls. while googling I found some responses where some people say there is a memory leak in the application but it is really strange since sometimes just ‘recycling’ the app pool in IIS solved the issue. I configured an automated recycling of the app pool for the API everyday around midnight NY time to reduce the number of these errors. Nothing has changed on the API since last year and it just happened that for the past 2-3 months this issue happened with premium users and they get fixed if we remove the token from the call and use IP authentication. We can discuss further if needed, but didn’t want to spend more time on this since we are moving to the new API.”, unquote.

Essentially they don't know what is going on. Also, looks like the new service is moving here: https://comtradeplus.azurewebsites.net/

I haven't looked closely enough to get a sense of whether this will mean the wrapper needs to be re-written.

ChrisMuir commented 2 years ago

Wow, thank you so much for passing the info along. Def not what we want to hear hah. I was unaware of the fact that they're planning to move the API. I will keep an eye on that, I'm sure that the API contract will end up changing some. Thanks again!

dakilian commented 2 years ago

@morrisseyj thank you for contacting comtrade. If their server is overloaded, there's no point in bombarding it with requests, that's for sure. They also wrote that the problem only occurs when using a token and can be solved with IP-based authentication. If I have understood it correctly, it is currently not possible to use the comtradrpackage for this, as the number of queries is limited if no token has been registered (what I have already explained above). What do you think of the idea of implementing a function that makes it possible to receive more than 100 responses per hour without a registered token? Thanks!

morrisseyj commented 2 years ago

@dakilian Perhaps I misunderstand your suggestion, but I don't believe that there is anything that can be done on the CRAN package side of this. This is a usage limit implemented by COMTRADE (see here (under Usage limits): https://comtrade.un.org/data/doc/api). No function can get around this (other than spoofing IPs or something which wouldn't be recommended). Let me know if I have misunderstood your suggestion.

dakilian commented 2 years ago

The usage limit of comtrade applies only to "guest". Authenticated users can do 10k requestes per hour. Authentication can be done by IP adress or by usage of a token (see here under Authentication: https://comtrade.un.org/data/doc/api). Comtrade writes in the API documentation "If you are an authenticated user and are accessing the API via an IP address listed on your account, your account should be recognized (and there is no need to use the authorization code method below)." Unfortunately this isn´t possible with the comtradr package, because you run in error of the package if you want to do more than 100 requests/h and there is no token registered. Due to the current bug with the comtrade servers, which only occurs if a token is used in the call, I reset in my script the queries_this_hour parameter in the ct_env of the comtradr package manually. The workaround enabels me to do more than 100 requests with the comtradr package with IP authentication. @morrisseyj and @ChrisMuir: My suggestion would therefore be to extend the function ct_register_token so that you can select IP authetication as an option and so creating a way to increase the limit in the package without using a token in the call.

datapumpernickel commented 1 year ago

Closing this issue, we are adressing changes to the API and relaunch in #45