MarkEdmondson1234 / searchConsoleR

R interface with Google Search Console API v3, including Search Analytics.
http://code.markedmondson.me/searchConsoleR/

HTML 404 Error Making Batch API call #60

Open vanderWin opened 3 years ago

vanderWin commented 3 years ago

What goes wrong

Upgraded to R 4.0.2 and RStudio 1.3.1056 today and the scripts I've been using for a while are now producing different results.

When requesting data I only get 1-5 days' worth before it hits an error. What's peculiar is that in the past an error such as an expired token wouldn't try to bring back an HTML page as it's doing here.

Steps to reproduce the problem

Expected output

The usual data frames or a standardised error.

Actual output

Fetching search analytics for url: xxxxxxxxxx.com dates: 2019-10-04 2020-01-30 dimensions: date query page dimensionFilterExp:  searchType: web aggregationType: auto
Batching data via method: byDate
Will fetch up to 25000 rows per day
i 2020-07-16 17:20:03 > Batch API limited to [ 1 ] calls at once.
i 2020-07-16 17:20:03 > Request #:  2019-10-04
i 2020-07-16 17:20:03 > Token exists.
i 2020-07-16 17:20:03 > Constructing batch request URL for:  /webmasters/v3/sites/sc-domain%3Aroyalcanin.com/searchAnalytics/query
i 2020-07-16 17:20:03 > Making Batch API call
i 2020-07-16 17:20:41 > 500 type error in response
i 2020-07-16 17:20:55 > Request #:  2019-10-05
i 2020-07-16 17:20:55 > Token exists.
i 2020-07-16 17:20:55 > Constructing batch request URL for:  /webmasters/v3/sites/sc-domain%3Aroyalcanin.com/searchAnalytics/query
i 2020-07-16 17:20:55 > Making Batch API call
i 2020-07-16 17:20:56 > Request Status Code:  404
Error : lexical error: invalid char in json text.
                                       <!DOCTYPE html> <html lang=en> 
                     (right here) ------^

<!DOCTYPE html>
<html lang=en>
  <meta charset=utf-8>
  <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">
  <title>Error 404 (Not Found)!!1</title>
  <style>
    *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}
  </style>
  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>
  <p><b>404.</b> <ins>That’s an error.</ins>
  <p>The requested URL <code>/batch/webmasters/v3</code> was not found on this server.  <ins>That’s all we know.</ins>
i 2020-07-16 17:20:56 > API returned error:  API error: returned web page that has been opened in your default browser if possible
i 2020-07-16 17:20:56 > No retry attempted:  API error: returned web page that has been opened in your default browser if possible
Error: Batch Request: 404 Not Found
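
The `lexical error: invalid char in json text` in the log above is what a JSON parser reports when it is handed an HTML page. As a minimal sketch of the kind of "standardised error" described under Expected output, a caller could check the body before parsing it. The helper names `looks_like_html` and `check_json_body` are hypothetical and are not part of searchConsoleR or googleAuthR.

```r
# Hypothetical helper: TRUE when a response body starts like an HTML page
# (for example Google's "Error 404 (Not Found)" page) rather than JSON.
looks_like_html <- function(body) {
  grepl("^\\s*<(!doctype\\s+html|html)", body, ignore.case = TRUE, perl = TRUE)
}

# Hypothetical helper: stop with a short, uniform message instead of letting
# the JSON parser fail on HTML; otherwise return the body unchanged.
check_json_body <- function(body, status) {
  if (looks_like_html(body)) {
    stop("API returned an HTML page (HTTP status ", status, ") instead of JSON",
         call. = FALSE)
  }
  body
}
```

Calling `check_json_body(body, 404)` on the page shown above would raise a one-line error, while a JSON body passes through untouched.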

Session Info


R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] beepr_1.3                 stringi_1.4.6             bigQueryR_0.5.0           tictoc_1.0                forcats_0.5.0             stringr_1.4.0             dplyr_1.0.0               purrr_0.3.4               readr_1.3.1               tidyr_1.1.0              
[11] tibble_3.0.3              ggplot2_3.3.2             tidyverse_1.3.0           searchConsoleR_0.4.0.9000 googleAuthR_1.3.0        

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5                lubridate_1.7.9           googleCloudStorageR_0.5.1 assertthat_0.2.1          digest_0.6.25             R6_2.4.1                  cellranger_1.1.0          backports_1.1.8           reprex_0.3.0              httr_1.4.1               
[11] pillar_1.4.6              rlang_0.4.7               curl_4.3                  readxl_1.3.1              rstudioapi_0.11           blob_1.2.1                munsell_0.5.0             tinytex_0.24              broom_0.7.0               compiler_4.0.2           
[21] modelr_0.1.8              xfun_0.15                 pkgconfig_2.0.3           askpass_1.1               openssl_1.4.2             tidyselect_1.1.0          audio_0.1-7               fansi_0.4.1               crayon_1.3.4              dbplyr_1.4.4             
[31] withr_2.2.0               grid_4.0.2                jsonlite_1.7.0            gtable_0.3.0              lifecycle_0.2.0           DBI_1.1.0                 magrittr_1.5              scales_1.1.1              zip_2.0.4                 cli_2.0.2                
[41] fs_1.4.2                  xml2_1.3.2                ellipsis_0.3.1            generics_0.0.2            vctrs_0.3.2               tools_4.0.2               glue_1.4.1                hms_0.5.3                 yaml_2.2.1                colorspace_1.4-1         
[51] gargle_0.5.0              rvest_0.3.5               memoise_1.1.0             haven_2.3.1              

MarkEdmondson1234 commented 3 years ago

I don’t think the update could have affected anything. It seems the API itself is being patchy since, as you say, the same request sometimes works. If you are making a lot of requests, increasing the gap between them might help, but I'm not sure there is much I can change in the code. Can you see whether other languages calling the API are also seeing an increase in 500 errors?
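
One way to increase the gap between requests from the calling script, without changing the package, is to wrap each call in a retry loop that waits longer after every failure. This is only a sketch: `with_retry` is a hypothetical helper, not a searchConsoleR or googleAuthR function, and the pause values are arbitrary starting points.

```r
# Hypothetical helper (not part of searchConsoleR): call `fetch` and, if it
# raises an error, wait and try again, doubling the pause after each failure.
# `sleep` is a parameter only so the waiting can be stubbed out in tests.
with_retry <- function(fetch, max_tries = 5, pause = 2, sleep = Sys.sleep) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fetch(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    if (attempt == max_tries) {
      stop("Giving up after ", max_tries, " attempts: ", conditionMessage(result))
    }
    message("Attempt ", attempt, " failed: ", conditionMessage(result),
            "; waiting ", pause, "s")
    sleep(pause)
    pause <- pause * 2
  }
}

# Usage sketch (needs authentication, so left commented out). The site URL
# and dates are placeholders.
# one_day <- with_retry(function() {
#   searchConsoleR::search_analytics("https://www.example.com",
#                                    "2019-10-04", "2019-10-04",
#                                    dimensions = c("date", "query", "page"))
# })
```

Fetching one day per call this way means a single failed day can be retried without restarting the whole date range.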



MarkEdmondson1234 commented 3 years ago

Are you also using your own client.id? I had a look at the default client.id to see if there was a recent uptick in 500 errors. There are none today, but on July 1st all requests briefly returned 500 errors; it looks like a new API version was rolled out at that time. So if your requests were made around July 1st, perhaps that is what you saw.

[Image: Screenshot 2020-07-17 at 10 47 52]

If you are using your own client.id, also check the quotas you have set up (https://console.cloud.google.com/apis/api/searchconsole.googleapis.com/quotas). The settings for the default project are 100M queries per day and 2000 queries per 100 seconds per user.
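
To put the per-user figure in concrete terms, 2000 queries per 100 seconds works out to an average of one request every 0.05 seconds. The sketch below computes that gap and paces a loop of per-day fetches accordingly. `fetch_paced` and `fetch_one` are hypothetical names used for illustration, and a project with different quotas would need its own numbers.

```r
# Quota figures quoted above for the default project.
queries_per_window <- 2000
window_seconds <- 100

# Smallest average gap between requests that stays within the per-user quota.
min_gap <- window_seconds / queries_per_window  # 0.05 seconds

# Hypothetical helper: call `fetch_one` once per date, pausing `gap` seconds
# between consecutive calls. `sleep` is a parameter so tests can stub it out.
fetch_paced <- function(dates, fetch_one, gap = min_gap, sleep = Sys.sleep) {
  results <- vector("list", length(dates))
  for (i in seq_along(dates)) {
    if (i > 1) sleep(gap)
    results[[i]] <- fetch_one(dates[[i]])
  }
  results
}
```

In the log excerpt above, consecutive requests are tens of seconds apart, so a script running at that pace would sit well below this per-user limit. A custom project with much lower quotas could still be hit.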