paleolimbot / rbbt

R Interface to the Better BiBTex Zotero Connector
142 stars 25 forks source link

bbt_write_bib() throws "Error: not found" when duplicate key exists in Zotero #11

Closed florisvdh closed 3 years ago

florisvdh commented 3 years ago

UPDATE: for this issue, please go to the next message in this thread, as the below error 'does not exist in current working directory' has moved to #18

I'm just trying rbbt package for the first time, so I report some things that I came by. Thanks for creating this interesting and promising package!

The documentation of bbt_write_bib() says "Write a bibliography file from Zotero citation keys", so I thought this should also create the file if non-existant.

> library(rbbt)
> bbt_write_bib("rbbt-bibliography.json")
Detecting citations in clipboard contents...
 Error: 'rbbt::bbt_write_bib("bibliography.json"' does not exist in current working directory ('....

Note the different spelling of the filename in the statement versus the error, that is at least a problem in the error display. Also, the error actually says that rbbt::bbt_write_bib("bibliography.json" does not exist, rather than showing a filename.

Still, when conforming to the default filename:

bbt_write_bib("bibliography.json")
Detecting citations in clipboard contents...
 Error: '"bibliography.json"' does not exist in current working directory ('.....

No real luck; note the error message has improved in this case.

OK, let's just create an empty file first:

> file.create("rbbt-bibliography.json")
[1] TRUE
> bbt_write_bib("rbbt-bibliography.json")
Error: Use `overwrite = TRUE` to overwrite file at 'rbbt-bibliography.json'

This error makes sense on its own, provided that the function would create the file if non-existant. So I would recommend to implement the latter. It can probably be done within the function, using file.create() and forcing overwrite=TRUE in that case.

paleolimbot commented 3 years ago

I totally believe you, but I can't replicate this behaviour on my computer. What does your devtools::session_info() say?

florisvdh commented 3 years ago

I'm having a hard time to reproduce it (should have prepared it better, you're absolutely right). I think it has to do with me downgrading openssl on my system today (in order to get the ODBC driver for ms sql server to work on Mint 20 (~Ubuntu 20.04) ......).

Right now, it says this (actually meaning, I can't use bbt_write_bib() right now :thinking: - although both rbbt RStudio addins still do work):

library(rbbt)
bbt_write_bib("rbbt-bibliography2.json", 
              keys = c("franklin_mapping_2009", "incollection-a3e3", "Kish_1965"))
#> Error:  not found

bbt_write_bib("bibliography.json", 
              keys = c("franklin_mapping_2009", "incollection-a3e3", "Kish_1965"))
#> Error:  not found

file.create("rbbt-bibliography2.json")
#> [1] TRUE
bbt_write_bib("rbbt-bibliography2.json", 
              keys = c("franklin_mapping_2009", "incollection-a3e3", "Kish_1965"),
              overwrite = TRUE)
#> Error:  not found

Created on 2020-09-07 by the reprex package (v0.3.0)

Session info ``` r devtools::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.0.2 (2020-06-22) #> os Linux Mint 20 #> system x86_64, linux-gnu #> ui X11 #> language nl_BE:nl #> collate nl_BE.UTF-8 #> ctype nl_BE.UTF-8 #> tz Europe/Brussels #> date 2020-09-07 #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date lib source #> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.2) #> backports 1.1.9 2020-08-24 [1] CRAN (R 4.0.2) #> callr 3.4.3 2020-03-28 [1] CRAN (R 4.0.2) #> cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.2) #> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.2) #> curl 4.3 2019-12-02 [1] CRAN (R 4.0.2) #> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.2) #> devtools 2.3.1 2020-07-21 [1] CRAN (R 4.0.2) #> digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.2) #> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.2) #> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.2) #> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.2) #> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2) #> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2) #> highr 0.8 2019-03-20 [1] CRAN (R 4.0.2) #> hms 0.5.3 2020-01-08 [1] CRAN (R 4.0.2) #> htmltools 0.5.0 2020-06-16 [1] CRAN (R 4.0.2) #> httr 1.4.2 2020-07-20 [1] CRAN (R 4.0.2) #> jsonlite 1.7.1 2020-09-07 [1] CRAN (R 4.0.2) #> knitr 1.29 2020-06-23 [1] CRAN (R 4.0.2) #> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.2) #> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.2) #> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.2) #> pillar 1.4.6 2020-07-10 [1] CRAN (R 4.0.2) #> pkgbuild 1.1.0 2020-07-13 [1] CRAN (R 4.0.2) #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.2) #> pkgload 1.1.0 2020-05-29 [1] CRAN (R 4.0.2) #> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.2) #> processx 3.4.4 2020-09-03 [1] CRAN (R 4.0.2) #> ps 1.3.4 2020-08-11 [1] CRAN (R 4.0.2) #> R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.2) #> rbbt * 0.0.0.9000 2020-09-07 [1] Github (paleolimbot/rbbt@dd2ac57) #> Rcpp 1.0.5 2020-07-06 [1] CRAN (R 4.0.2) #> readr 1.3.1 2018-12-21 [1] CRAN (R 4.0.2) #> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2) #> reprex 0.3.0 2019-05-16 [1] CRAN (R 4.0.2) #> rlang 0.4.7 2020-07-09 [1] CRAN (R 4.0.2) #> rmarkdown 2.3 2020-06-18 [1] CRAN (R 4.0.2) #> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.2) #> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.2) #> stringi 1.4.6 2020-02-17 [1] CRAN (R 4.0.2) #> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.2) #> testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.2) #> tibble 3.0.3 2020-07-10 [1] CRAN (R 4.0.2) #> usethis 1.6.1 2020-04-29 [1] CRAN (R 4.0.2) #> vctrs 0.3.4 2020-08-29 [1] CRAN (R 4.0.2) #> withr 2.2.0 2020-04-20 [1] CRAN (R 4.0.2) #> xfun 0.16 2020-07-24 [1] CRAN (R 4.0.2) #> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.2) #> #> [1] /home/floris/lib/R/library #> [2] /usr/local/lib/R/site-library #> [3] /usr/lib/R/site-library #> [4] /usr/lib/R/library ```

However, this works in the shell:

$ curl http://localhost:23119/better-bibtex/json-rpc -X POST -H "Content-Type: application/json" -H "Accept: application/json" --data-binary '{"jsonrpc": "2.0", "method": "item.search", "params": ["franklin_mapping_2009"] }'
{"jsonrpc":"2.0","result":[{"id":"http://zotero.org/users/local/bw0Kd5AN/items/CISIVHR7","type":"book","abstract":"A comprehensive summary of species distribution modeling methods integrating ecological and statistical models with spatial data, and a framework for implementation.","event-place":"Cambridge","note":"Citation Key: franklin_mapping_2009","publisher":"Cambridge University Press","publisher-place":"Cambridge","title":"Mapping Species Distributions: Spatial Inference and Prediction","title-short":"Mapping Species Distributions","URL":"http://www.alibris.com/Mapping-Species-Distributions-Spatial-Inference-and-Prediction-Janet-Franklin/book/12212704","author":[{"family":"Franklin","given":"Janet"}],"accessed":{"date-parts":[["2017",3,1]]},"issued":{"date-parts":[["2009"]]},"library":"My Library","citekey":"franklin_mapping_2009"}],"id":null}

But this however does not (used by bbt_bib(), hence bbt_write_bib()):

$ curl http://localhost:23119/better-bibtex/json-rpc -X POST -H "Content-Type: application/json" -H "Accept: application/json" --data-binary '{"jsonrpc": "2.0", "method": "item.export", "params": ["franklin_mapping_2009", "biblatex"] }'
{"jsonrpc":"2.0","error":{"code":-32603,"message":"TypeError: citekeys.filter is not a function"},"id":null}

The latter is also the error behind the 'Error: not found' above:

library(rbbt)
bbt_call_json_rpc(
  "item.export",
  as.list(unique(as.character(c("franklin_mapping_2009", 
                                "incollection-a3e3")))),
  translator =  getOption("rbbt.default.translator", "biblatex"),
  library_id = getOption("rbbt.default.library_id", 1)
)
#> $jsonrpc
#> [1] "2.0"
#> 
#> $error
#> $error$code
#> [1] -32603
#> 
#> $error$message
#> [1] "TypeError: b.indexOf is not a function"
#> 
#> 
#> $id
#> NULL

Created on 2020-09-07 by the reprex package (v0.3.0)

Session info ``` r devtools::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.0.2 (2020-06-22) #> os Linux Mint 20 #> system x86_64, linux-gnu #> ui X11 #> language nl_BE:nl #> collate nl_BE.UTF-8 #> ctype nl_BE.UTF-8 #> tz Europe/Brussels #> date 2020-09-07 #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date lib source #> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.2) #> backports 1.1.9 2020-08-24 [1] CRAN (R 4.0.2) #> callr 3.4.3 2020-03-28 [1] CRAN (R 4.0.2) #> cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.2) #> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.2) #> curl 4.3 2019-12-02 [1] CRAN (R 4.0.2) #> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.2) #> devtools 2.3.1 2020-07-21 [1] CRAN (R 4.0.2) #> digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.2) #> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.2) #> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.2) #> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.2) #> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2) #> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2) #> highr 0.8 2019-03-20 [1] CRAN (R 4.0.2) #> htmltools 0.5.0 2020-06-16 [1] CRAN (R 4.0.2) #> httr 1.4.2 2020-07-20 [1] CRAN (R 4.0.2) #> jsonlite 1.7.1 2020-09-07 [1] CRAN (R 4.0.2) #> knitr 1.29 2020-06-23 [1] CRAN (R 4.0.2) #> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.2) #> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.2) #> pkgbuild 1.1.0 2020-07-13 [1] CRAN (R 4.0.2) #> pkgload 1.1.0 2020-05-29 [1] CRAN (R 4.0.2) #> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.2) #> processx 3.4.4 2020-09-03 [1] CRAN (R 4.0.2) #> ps 1.3.4 2020-08-11 [1] CRAN (R 4.0.2) #> R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.2) #> rbbt * 0.0.0.9000 2020-09-07 [1] Github (paleolimbot/rbbt@dd2ac57) #> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2) #> reprex 0.3.0 2019-05-16 [1] CRAN (R 4.0.2) #> rlang 0.4.7 2020-07-09 [1] CRAN (R 4.0.2) #> rmarkdown 2.3 2020-06-18 [1] CRAN (R 4.0.2) #> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.2) #> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.2) #> stringi 1.4.6 2020-02-17 [1] CRAN (R 4.0.2) #> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.2) #> testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.2) #> usethis 1.6.1 2020-04-29 [1] CRAN (R 4.0.2) #> withr 2.2.0 2020-04-20 [1] CRAN (R 4.0.2) #> xfun 0.16 2020-07-24 [1] CRAN (R 4.0.2) #> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.2) #> #> [1] /home/floris/lib/R/library #> [2] /usr/local/lib/R/site-library #> [3] /usr/lib/R/site-library #> [4] /usr/lib/R/library ```

Any idea? I may need to search further.

paleolimbot commented 3 years ago

I wonder if your ID of "My Library" isn't the default (I assume it's 1 in the code). You can run rbbt::bbt_call_json_rpc("user.groups") to see what the default library ID is (and you can set this with options(rbbt.default.library_id = ...).

florisvdh commented 3 years ago

This seems to be normal?

rbbt::bbt_call_json_rpc("user.groups")
#> $jsonrpc
#> [1] "2.0"
#> 
#> $result
#> $result[[1]]
#> $result[[1]]$id
#> [1] 1
#> 
#> $result[[1]]$name
#> [1] "My Library"
#> 
#> 
#> 
#> $id
#> NULL

Created on 2020-09-08 by the reprex package (v0.3.0)

Session info ``` r devtools::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.0.2 (2020-06-22) #> os Linux Mint 20 #> system x86_64, linux-gnu #> ui X11 #> language nl_BE:nl #> collate nl_BE.UTF-8 #> ctype nl_BE.UTF-8 #> tz Europe/Brussels #> date 2020-09-08 #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date lib source #> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.2) #> backports 1.1.9 2020-08-24 [1] CRAN (R 4.0.2) #> callr 3.4.3 2020-03-28 [1] CRAN (R 4.0.2) #> cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.2) #> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.2) #> curl 4.3 2019-12-02 [1] CRAN (R 4.0.2) #> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.2) #> devtools 2.3.1 2020-07-21 [1] CRAN (R 4.0.2) #> digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.2) #> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.2) #> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.2) #> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.2) #> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2) #> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2) #> highr 0.8 2019-03-20 [1] CRAN (R 4.0.2) #> htmltools 0.5.0 2020-06-16 [1] CRAN (R 4.0.2) #> httr 1.4.2 2020-07-20 [1] CRAN (R 4.0.2) #> jsonlite 1.7.1 2020-09-07 [1] CRAN (R 4.0.2) #> knitr 1.29 2020-06-23 [1] CRAN (R 4.0.2) #> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.2) #> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.2) #> pkgbuild 1.1.0 2020-07-13 [1] CRAN (R 4.0.2) #> pkgload 1.1.0 2020-05-29 [1] CRAN (R 4.0.2) #> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.2) #> processx 3.4.4 2020-09-03 [1] CRAN (R 4.0.2) #> ps 1.3.4 2020-08-11 [1] CRAN (R 4.0.2) #> R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.2) #> rbbt 0.0.0.9000 2020-09-07 [1] Github (paleolimbot/rbbt@dd2ac57) #> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2) #> reprex 0.3.0 2019-05-16 [1] CRAN (R 4.0.2) #> rlang 0.4.7 2020-07-09 [1] CRAN (R 4.0.2) #> rmarkdown 2.3 2020-06-18 [1] CRAN (R 4.0.2) #> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.2) #> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.2) #> stringi 1.4.6 2020-02-17 [1] CRAN (R 4.0.2) #> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.2) #> testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.2) #> usethis 1.6.1 2020-04-29 [1] CRAN (R 4.0.2) #> withr 2.2.0 2020-04-20 [1] CRAN (R 4.0.2) #> xfun 0.16 2020-07-24 [1] CRAN (R 4.0.2) #> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.2) #> #> [1] /home/floris/lib/R/library #> [2] /usr/local/lib/R/site-library #> [3] /usr/lib/R/site-library #> [4] /usr/lib/R/library ```

I'll try whether re-upgrading openssl (from 1.1.1 (Bionic) to 1.1.1f (in Focal repo)) has anything to do with it. curl makes use of openssl, I think.

paleolimbot commented 3 years ago

That seems unlikely to have an effect (although might!). You could also try upgrading BetterBibTex?

florisvdh commented 3 years ago

Re-upgrading openssl didn't help indeed. Since Monday (before, I could use rbbt_write_bib()), I have been installing krb5-user msodbcsql17 mssql-tools unixodbc plus downgrading openssl (fwiw, the customized instructions).

I'm at latest version of BBT (5.2.66), also downgrading to e.g. 5.2.63 didn't help. Zotero is at 5.0.89.

I've taken a small adventure by resetting my system to Monday morning (using a weekly Timeshift snapshot). This does not include /home though. And the result is still the same:

$ curl http://localhost:23119/better-bibtex/json-rpc -X POST -H "Content-Type: application/json" -H "Accept: application/json" --data-binary '{"jsonrpc": "2.0", "method": "item.export", "params": ["franklin_mapping_2009", "biblatex"] }'
{"jsonrpc":"2.0","error":{"code":-32603,"message":"TypeError: citekeys.filter is not a function"},"id":null}

It makes me think something may be wrong in my own settings (/home: Zotero, BBT ...?), but AFAIK I didn't really change things (at least not manually). So weird.

Anyway, I could proceed discussing the above problem at BBT repo (the output of "item.export"); it should anyway not happen and maybe we'll find the reason then. Before I have this solved, I cannot reasonably try to reproduce the (original) issue here.

Regarding this (original) issue, it might be that there's also some effect of the RStudio version. At the time of posting this issue I had been down- and upgrading RStudio back and forth between 1.3.1073 and 1.4.779. I don't know which one I was using then, but it's easy to test it in both, once the BBT "item.export" works again. Speculation, obviously. BTW latest RStudio builds are interesting for their Visual Markdown Editing mode, which has citation capabilities as well. At BBT repo, we've been discussing about several alternatives, amongst which rbbt, here. This link goes directly to a comparative features table - of course you're welcome to comment on it.

retorquere commented 3 years ago

item.export expects an array of citekeys, not a single citekey as a string.

florisvdh commented 3 years ago

@paleolimbot Finally, good news. I found reasons for the above findings, basically it's because doing silly experiments can cause unexpected behaviour :smile: . Here is a short overview to complete the above story, but I've made a separate issue to better define what happens in specific circumstances.

But first things first:


In summary, the earlier printed output of bbt_write_bib() comes from two things: either clipboard content (= why I was having trouble in reproducing my first post :thinking: ), or the presence of a duplicate in the Zotero database. So it does get interesting.


Sources of encountered errors on my side:

  1. My 'test' Zotero database contained 1 item franklin_mapping_2009, 1 item Kish_1965 and 2 items of incollection-a3e3.
    • @retorquere the fact that I could get into a state with a duplicate bibtex key in Zotero, just by doing repeated imports of a bibtex file, may be something to think about. Anyway, curl http://localhost:23119/better-bibtex/json-rpc -X POST -H "Content-Type: application/json" -H "Accept: application/json" --data-binary '{"jsonrpc": "2.0", "method": "item.export", "params": [["franklin_mapping_2009","incollection-a3e3"], "biblatex"] }' will give an error "not found" because of the duplicate --> is this worth making an issue at BBT? Of course the responsability of avoiding duplicates lies with the user. If BBT isn't going to detect this at import stage (warning or so), I think it may still be worth throwing a more specific error message in json-rpc. These are just some ideas, I can't judge what is sensible or feasible.
    • @paleolimbot IMHO this situation does not seem like something rbbt should take care of. But ideally, we would get an error saying that citekey XXX is duplicated, rather than the above "Error: not found" message, which is rather misleading (+ it does not say what it is that is not found: two spaces surrounding nothing).
    • I feel that this topic needs some further discussion here; the title has been changed into that (but I leave ultimate decisions to you @paleolimbot ).
  2. My clipboard was not empty when running bbt_write_bib(). This is the subject of a new issue: #18
  3. The returned output of rbbt::bbt_call_json_rpc() above with an error '32603' - making me wrongly think there was an issue in the JSON-RPC call of BBT (retorquere/zotero-better-bibtex#1637) - was in fact due to me wrongly specifying arguments in R: the R function has no arguments 'translator' and 'library_id', these were just variables in the source code of bbt_bib(). Where was I.
retorquere commented 3 years ago

the fact that I could get into a state with a duplicate bibtex key in Zotero, just by doing repeated imports of a bibtex file, may be something to think about.

You can turn off key preservation on import, but if you keep that on, I'm taking that to signify you want to keep it, so I do, because you might be referring to them in something you wrote. A quality report should flag them as duplicates.

is this worth making an issue at BBT

Yes, if you have a suggestion how to deal better with it.

If BBT isn't going to detect this at import stage (warning or so)

Import translators cannot have any user interaction. I could put up a popup message whenever a duplicate key is saved, but it wouldn't be specific to imports.

florisvdh commented 3 years ago

Thanks for the explanation about the import workflow @retorquere.

if you have a suggestion how to deal better with it.

  1. Maybe this is one - depending on whether @paleolimbot likes it: if json-rpc would return a more specific error than "not found", in the case of duplicate keys (e.g. "Error: duplicate key(s) found: xxx, yyy, zzz"), then probably rbbt::bbt_call_json_rpc() could detect that error and return a more useful error message.

  2. A further alternative (which I like even more) - but it may involve more work in both json-rpc and rbbt - could be that json-rpc gives a warning instead of an error in case of duplicate keys, and proceeds with the first record for each duplicate key (the warning could make mention of that). rbbt could then detect the warning and pass it as a warning in R.

I'm just doing suggestions here, I leave it up to you @retorquere @paleolimbot to decide what is best / feasible. If things stay as they are, then I can live with that. The thing is, "Error: not found" (in rbbt) does not give a hint to the user what is wrong.

retorquere commented 3 years ago

Returning a more specific error wouldn't be hard.

WRT the 2nd option -- no problem doing the work, but there isn't really a "first" -- conceptually, the keys form a set, not a list. What gets returned would essentially be random. That doesn't seem right.

florisvdh commented 3 years ago

To make sure there's no misunderstanding: by 'duplicate key' I don't mean that a key is duplicated in the call (by the R user / rbbt) to json-rpc, but that a sole passed key occurs twice in the Zotero database. That triggers the error.

The 'first' (for each duplicate key) could perhaps use the 'date added' field from Zotero? (in which case taking the latest date seems more appealing) Do records have a numerical ID in the Zotero database, e.g. for use in case 'date added' is missing (which, normally, isn't)? Even if the latter approach would rather boil down to 'randomness', I think if the message clearly warns the user about responsability and an unpredictable result, then that could be fine I think.

Prototype warning:

Warning: duplicate key(s) found in Zotero: xxx, yyy, zzz. Trying to take the record with the latest 'date added' in Zotero. Results may be unexpected, please check. Better still, avoid duplicate citekeys in your database.

retorquere commented 3 years ago

The warning is sufficiently descrptive, no problem there, but I really dislike handing back the first/last/whatever from any arbitrary ordering. We're talking about citations here, I can't really decide on behalf of the user that one item should be "more likely" to be the one wanted.

I could throw a more descriptive error, or just return successfully with an error state, indicating what keys were duplicate. Consumers of the API could then choose to retry the request without those duplicates. An other option is to return an array of objects for duplicate items in place of a single item, and the consumer of the API would again be free to deal with those duplicates, perhaps by prompting the user to choose.

florisvdh commented 3 years ago

I understand; thanks for the explanation and the ideas @retorquere! I'm looking forward to the feedback from @paleolimbot .

paleolimbot commented 3 years ago

I'd prefer to keep this package as a simple wrapper for the Better BibTex API, which is excellent. As @retorquere indicated, guessing the user intent when there are duplicate and/or missing keys is hard, and I'd prefer not to take on maintaining the code that makes that guess.

That said, I think it would be really useful if the item.export endpoint returned all possible problems with the input, so the user could correct this in one go. For example:

Found 3 missing keys ('key1', 'key2', and 'key3') and 1 duplicate key ('key4')

I know that's probably a bit of work on @retorquere's end, but I think there are few cases where a consumer of the API won't want to know how to make the API call succeed the next time. From a performance standpoint, it is problematic to repeatedly call the API to see if a key exists (my thesis has hundreds of citations, and calling the API once per citation takes about 10 seconds).

retorquere commented 3 years ago

Wouldn't be hard perse, but I don't know how to return it. item.export returns a block of text. There isn't really a side channel to deliver the extra data.

retorquere commented 3 years ago

One option would be to return a JSON object when extra ~headers~ parameters are passed.

paleolimbot commented 3 years ago

I was just thinking an error message, replacing Error: ... not found, so no extra data needed. Alternatively, another method (item.exists maybe?) that gives some kind of information for each key (valid, duplicated, missing)? I'm new to all of this...

retorquere commented 3 years ago

Either is simple to do. I can return a JSON-formatted error with details.

paleolimbot commented 3 years ago

My preference would be an error message with item.export that mentions all problematic keys...I probably won't use any JSON returned with the error message here but I imagine others could find this useful. But I trust your instincts!

retorquere commented 3 years ago

build 7136 will give you more specific error messages.

florisvdh commented 3 years ago

@paleolimbot thank you for the feedback!

Couldn't resist to test this already:

$ curl http://localhost:23119/better-bibtex/json-rpc -X POST -H "Content-Type: application/json" -H "Accept: application/json" --data-binary '{"jsonrpc": "2.0", "method": "item.export", "params": [["franklin_mapping_2009","Kish_1965","incollection-a3e3","book-a2e0","nonexistant_2030","nonexistant_2040"], "biblatex"] }'
{"jsonrpc":"2.0","error":{"code":-32602,"message":"not found: nonexistant_2030, nonexistant_2040\nduplicates found: incollection-a3e3, book-a2e0"},"id":null}

:tada: :smile:

paleolimbot commented 3 years ago

@retorquere: That looks awesome! Thank you!!

@florisvdh: Thanks for bringing this up!!

retorquere commented 3 years ago

Is this good to go? Then I'll merge it into a new release.

paleolimbot commented 3 years ago

I'm happy with this and trust @florisvdh's test!

retorquere commented 3 years ago

Alright, scheduled for release, should be out in 15 mins or so.

florisvdh commented 3 years ago

Wonderful, the result is indeed much more informative in R:

library(rbbt)
bbt_bib(
  keys = c("franklin_mapping_2009",
           "Kish_1965",
           "incollection-a3e3",
           "book-a2e0",
           "nonexistant_2030",
           "nonexistant_2040")
)
#> Error: not found: nonexistant_2030, nonexistant_2040
#> duplicates found: incollection-a3e3, book-a2e0

Created on 2020-09-24 by the reprex package (v0.3.0)

Session info ``` r devtools::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.0.2 (2020-06-22) #> os Linux Mint 20 #> system x86_64, linux-gnu #> ui X11 #> language nl_BE:nl #> collate nl_BE.UTF-8 #> ctype nl_BE.UTF-8 #> tz Europe/Brussels #> date 2020-09-24 #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date lib source #> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.2) #> backports 1.1.10 2020-09-15 [1] CRAN (R 4.0.2) #> callr 3.4.4 2020-09-07 [1] CRAN (R 4.0.2) #> cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.2) #> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.2) #> curl 4.3 2019-12-02 [1] CRAN (R 4.0.2) #> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.2) #> devtools 2.3.2 2020-09-18 [1] CRAN (R 4.0.2) #> digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.2) #> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.2) #> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.2) #> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.2) #> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2) #> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2) #> highr 0.8 2019-03-20 [1] CRAN (R 4.0.2) #> htmltools 0.5.0 2020-06-16 [1] CRAN (R 4.0.2) #> httr 1.4.2 2020-07-20 [1] CRAN (R 4.0.2) #> jsonlite 1.7.1 2020-09-07 [1] CRAN (R 4.0.2) #> knitr 1.30 2020-09-22 [1] CRAN (R 4.0.2) #> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.2) #> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.2) #> pkgbuild 1.1.0 2020-07-13 [1] CRAN (R 4.0.2) #> pkgload 1.1.0 2020-05-29 [1] CRAN (R 4.0.2) #> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.2) #> processx 3.4.4 2020-09-03 [1] CRAN (R 4.0.2) #> ps 1.3.4 2020-08-11 [1] CRAN (R 4.0.2) #> R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.2) #> rbbt * 0.0.0.9000 2020-09-07 [1] Github (paleolimbot/rbbt@dd2ac57) #> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2) #> reprex 0.3.0 2019-05-16 [1] CRAN (R 4.0.2) #> rlang 0.4.7 2020-07-09 [1] CRAN (R 4.0.2) #> rmarkdown 2.3 2020-06-18 [1] CRAN (R 4.0.2) #> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.2) #> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.2) #> stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.2) #> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.2) #> testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.2) #> usethis 1.6.3 2020-09-17 [1] CRAN (R 4.0.2) #> withr 2.3.0 2020-09-22 [1] CRAN (R 4.0.2) #> xfun 0.17 2020-09-09 [1] CRAN (R 4.0.2) #> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.2) #> #> [1] /home/floris/lib/R/library #> [2] /usr/local/lib/R/site-library #> [3] /usr/lib/R/site-library #> [4] /usr/lib/R/library ```

I think it could be useful if the errors would be warnings. And hence, that the result for unique keys is still returned. (Above, we have only discussed behaviour for the duplicated/missing keys)

@retorquere @paleolimbot how do you feel about the situation that BBT would still present the results for non-duplicated keys? This also interacts with ideas in #15 I think.

retorquere commented 3 years ago

It'd mean a breaking change to this endpoint. And with JSON exports I could add the errors as an extra property, but with other formats (such as bibtex) I don't have that. I'm open to do this, but would prefer to enable the new behavior with an extra parameter.

paleolimbot commented 3 years ago

I like that bbt errors here! See other thread for rbbt stuff.