irmoodie closed this issue 2 years ago
@irmoodie If you re-install from GitHub, it should work now: I've URL-encoded the query.
@eblondel I've re-installed the package from GitHub, but I still get the same error with the code I posted above. I've also tried a fresh R install on a Linux machine, and the same error occurs whenever the search query contains a space. Let me know if there's something else I can try to troubleshoot this.
To reproduce:
install.packages("remotes")
remotes::install_github("eblondel/zen4R")
library(zen4R)
zenodo <- ZenodoManager$new()
my_zenodo_records <- zenodo$getRecords(q = "test search")
Returns:
Error: lexical error: invalid char in json text.
<html><body><h1>400 Bad request
(right here) ------^
Linux session info:
> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 LC_PAPER=C.UTF-8
[8] LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] zen4R_0.6
loaded via a namespace (and not attached):
[1] httr_1.4.2 compiler_4.1.3 keyring_1.3.0 assertthat_0.2.1 R6_2.5.1 tools_4.1.3 curl_4.3.2 remotes_2.4.2 xml2_1.3.3
[10] jsonlite_1.8.0
Windows session info:
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22000)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] zen4R_0.6
loaded via a namespace (and not attached):
[1] httr_1.4.2 compiler_4.1.2 keyring_1.3.0 assertthat_0.2.1 R6_2.5.1
[6] tools_4.1.2 curl_4.3.2 xml2_1.3.3 jsonlite_1.8.0
Can you enable the logger in ZenodoManager so we can see the request that is sent to Zenodo?
zenodo <- ZenodoManager$new(logger = "DEBUG")
@eblondel Sure, here's the output:
> zenodo <- ZenodoManager$new(logger = "DEBUG")
> my_zenodo_records <- zenodo$getRecords(q = "test search")
[zen4R][INFO] ZenodoRequest - Fetching https://zenodo.org/api/records/?q=test%20search&size=10&page=1
-> GET /api/records/?q=test%20search&size=10&page=1 HTTP/1.1
-> Host: zenodo.org
-> User-Agent: libcurl/7.64.1 r-curl/4.3.2 httr/1.4.2
-> Accept-Encoding: deflate, gzip
-> Accept: application/json, text/xml, application/xml, */*
-> Authorization: Bearer
->
<- HTTP/1.1 200 OK
<- Server: nginx
<- Date: Tue, 26 Apr 2022 14:34:03 GMT
<- Content-Type: application/json
<- Transfer-Encoding: chunked
<- Vary: Accept-Encoding
<- Link: <https://zenodo.org/api/records/?sort=bestmatch&q=test+search&page=1&size=10>; rel="self", <https://zenodo.org/api/records/?sort=bestmatch&q=test+search&page=2&size=10>; rel="next"
<- X-RateLimit-Limit: 60
<- X-RateLimit-Remaining: 59
<- X-RateLimit-Reset: 1650983704
<- Retry-After: 60
<- X-Frame-Options: sameorigin
<- X-XSS-Protection: 1; mode=block
<- X-Content-Type-Options: nosniff
<- Strict-Transport-Security: max-age=0
<- Referrer-Policy: strict-origin-when-cross-origin
<- Access-Control-Allow-Origin: *
<- Access-Control-Expose-Headers: Content-Type, ETag, Link, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset
<- X-Request-ID: dfb8bc9b84a6a86f324649fd1bbdaa6b
<- Content-Encoding: gzip
<-
[zen4R][INFO] ZenodoManager - Successfully fetched list of published records - page 1
[zen4R][INFO] ZenodoRequest - Fetching https://zenodo.org/api/records/?q=test search&size=10&page=2
-> GET /api/records/?q=test search&size=10&page=2 HTTP/1.1
-> Host: zenodo.org
-> User-Agent: libcurl/7.64.1 r-curl/4.3.2 httr/1.4.2
-> Accept-Encoding: deflate, gzip
-> Accept: application/json, text/xml, application/xml, */*
-> Authorization: Bearer
->
<- HTTP/1.0 400 Bad request
<- Cache-Control: no-cache
<- Connection: close
<- Content-Type: text/html
<-
Error: lexical error: invalid char in json text.
<html><body><h1>400 Bad request
(right here) ------^
>
OK, I see: I forgot to URL-encode the query when paging in getRecords, so only the first page request was encoded. Re-install now, it should work this time :-)
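For readers who hit the same symptom, here is a minimal sketch of the idea behind the fix: encode the query once and reuse the encoded form for every page, not only the first. The helper `build_records_url` is hypothetical and is not zen4R's actual implementation; the URL layout is copied from the debug log above, and `utils::URLencode` is base R.

```r
# Hypothetical helper (not part of zen4R): builds the search URL for one page.
build_records_url <- function(q, size = 10, page = 1) {
  # reserved = TRUE also escapes characters such as "&" and "=",
  # and a space is percent-encoded as %20.
  encoded_q <- utils::URLencode(q, reserved = TRUE)
  sprintf(
    "https://zenodo.org/api/records/?q=%s&size=%d&page=%d",
    encoded_q, as.integer(size), as.integer(page)
  )
}

# Every page goes through the same helper, so page 2 is encoded like page 1.
urls <- vapply(
  1:2,
  function(p) build_records_url("test search", page = p),
  character(1)
)
print(urls)
```

Routing every page through one URL builder is a design choice that makes it hard to forget the encoding on later pages, which is the situation the log shows for page 2.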
@eblondel Solved! Thank you for your help and for the package!
You are welcome!
Issue:
Using an Elasticsearch query with getRecords produces an error if the query string contains a space.
How to reproduce:
Returns:
If I instead search without spaces (e.g. q = "test"), I get the expected result: a list of ZenodoRecord objects. Is there something I'm missing here? Should the space be formatted differently?
I’ve included my session information below.
Thanks!