egh / zotxt

zotxt: a Zotero extension for text
http://gitlab.com/egh/zotxt
GNU General Public License v3.0

HTTP requests to zotxt reply "Request not allowed" for zotxt v5.0.2 and Zotero >= v5.0.71 #11

Closed odkr closed 5 years ago

odkr commented 5 years ago

I'm using Zotero v5.0.72 and zotxt 5.0.2.

When I try to retrieve citation data for an item via a GET request to http://127.0.0.1:23119/zotxt/items?easykey=<KEY>, Zotero responds with "Request not allowed". I can reproduce this for other endpoints.

I recently upgraded Zotero (the error also occurs in v5.0.71), which may (or may not) have caused the issue. I'm using the zotero-bibliography setting, so I can’t tell when things stopped working.

Do you have any idea what's going on?

odkr commented 5 years ago

Apparently, this behaviour is to specification for Zotero >= 5.0.71, but it only affects "browsers". See https://github.com/retorquere/zotero-better-bibtex/issues/1233.

I can confirm that curl works fine:

$ curl http://127.0.0.1:23119/zotxt/items?easykey=haslanger:2012debunking
[
  {"id":"haslanger2012SocialConstructionDebunking","type":"chapter","title":"Social Construction: The ‘Debunking’ Project","container-title":"Resisting Reality: Social Construction and Social Critique","publisher":"Oxford University Press","page":"113–138","ISBN":"978-0-19-989262-4","title-short":"Social Construction","author":[{"family":"Haslanger","given":"Sally"}],"issued":{"date-parts":[[2012]]},"publisher-place":"Oxford","original-date":{"date-parts":[[2003]]}}
]

However, when I try to retrieve data via Pandoc's pandoc.mediabag.fetch, it fails. I'll investigate what's going on.

spacekitteh commented 5 years ago

It fails with the Emacs zotxt plugin, too. Curl from PowerShell returns a 403.

odkr commented 5 years ago

The change that introduced this behaviour in Zotero is, apparently, https://github.com/zotero/zotero/commit/2603373b860acb555062c01a5bb434d6c712aa3e. The code suggests that it mostly comes down to which HTTP user agent the utility making the GET request identifies as. I'm surprised that cURL fails. Try using the --user-agent option to set it to something that doesn’t start with "Mozilla/" (or set the HTTP header "X-Zotero-Allowed-Request" to 1).
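For clients that can set headers, the workaround described above might look roughly like the following Python sketch. The function names and the "zotxt-example/0.1" user agent string are made up for illustration; the endpoint, the easykey parameter, and the X-Zotero-Allowed-Request header are the ones mentioned in this thread.

```python
import urllib.parse
import urllib.request

# Endpoint as used earlier in this thread.
ZOTXT_ITEMS_URL = "http://127.0.0.1:23119/zotxt/items"


def build_items_request(easykey, user_agent="zotxt-example/0.1", allow_header=False):
    """Build a GET request whose User-Agent does not start with "Mozilla/".

    The default user agent string is a hypothetical placeholder.
    """
    query = urllib.parse.urlencode({"easykey": easykey})
    headers = {"User-Agent": user_agent}
    if allow_header:
        # Alternative workaround mentioned above: set this header to 1.
        headers["X-Zotero-Allowed-Request"] = "1"
    return urllib.request.Request(f"{ZOTXT_ITEMS_URL}?{query}", headers=headers)


def fetch_items(easykey):
    """Send the request; needs a running Zotero with zotxt installed."""
    request = build_items_request(easykey)
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.read().decode("utf-8")
```

With Zotero running, fetch_items("haslanger:2012debunking") should return the same JSON as the curl call, assuming the user agent really is what triggers the rejection.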

@egh, I’m a bit at a loss as to how to implement this in pandoc-zotxt.lua. All I have is Pandoc's fetch function, which doesn't allow me to set HTTP headers. Zotero or Pandoc would have to be changed to get the Lua script to work again. So, for the time being, it's back to the Python version.

odkr commented 5 years ago

I've opened an issue for pandoc-zotxt.lua, where I summarise what I found out so far. It appears endpoints can be whitelisted to disable the security mechanism. I'm not sure whether plugins can do that though; I'm not familiar enough with Zotero's internals.

odkr commented 5 years ago

I’ve just found that passing --request-header User-Agent:"Pandoc/2" to pandoc works around the problem and lets me continue using the Lua filter. There's no way to do this programmatically, as far as I can see.

odkr commented 5 years ago

I've filed a report with Zotero. Apparently, the problem isn't that it mistakes Pandoc for a browser, but that Pandoc (and the emacs plugin, presumably) don't set the User-Agent header, which raises an exception in Zotero. See https://forums.zotero.org/discussion/78502/http-user-agents-that-dont-identify-fail-for-zotero-v5-0-71
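To make the two failure modes concrete, here is a toy Python model of the check as it is described in this thread. It is not Zotero's code (Zotero is written in JavaScript), and the function name is made up; it only illustrates why a missing User-Agent needs to be handled separately from a browser-like one.

```python
from typing import Mapping


def is_request_allowed(headers: Mapping[str, str]) -> bool:
    """Toy model of the behaviour discussed here; NOT Zotero's implementation."""
    # HTTP header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}

    # Explicit opt-in header mentioned earlier in the thread.
    if normalised.get("x-zotero-allowed-request"):
        return True

    # Defaulting to "" means a request without a User-Agent is treated as a
    # non-browser client. Indexing normalised["user-agent"] directly would
    # raise KeyError here, which is the kind of failure reported above.
    user_agent = normalised.get("user-agent", "")
    return not user_agent.startswith("Mozilla/")
```

In this model, a client that sends no User-Agent at all is let through rather than triggering an error.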

odkr commented 5 years ago

The Zotero team have responded. They promise to solve this with v5.0.73.

egh commented 5 years ago

@odkr Thank you so much for this. I'm sorry, I've been on vacation for a few weeks so I'm just catching up.

ewa commented 4 years ago

I just ran into a similar-looking issue under Zotero 5.0.87, but it turned out to be a PEBKAC thing: if a person has installed the Emacs package (zotxt-emacs), but not zotxt, the Zotero add-on (that is, this project), they get the same error messages:

$ curl -v -v -v http://127.0.0.1:23119/zotxt/version
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 23119 (#0)
> GET /zotxt/version HTTP/1.1
> Host: 127.0.0.1:23119
> User-Agent: curl/7.64.1
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 404 Not Found
< X-Zotero-Version: 5.0.87
< X-Zotero-Connector-API-Version: 2
< Content-Type: text/plain
<
No endpoint found
* Closing connection 0

I came to this from the Emacs side and didn't realize there even was an add-on for Zotero (in addition to the package for Emacs) involved until I started trying to debug this. So, consider this documentation / Google-fodder for the next person who makes the same mistake.
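For anyone who wants to script this check, a rough Python sketch follows. The function names, the hint texts, and the "zotxt-probe/0.1" user agent are made up for illustration; the URL, the 404 "No endpoint found" response, and the 403 are the ones shown in this thread.

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple


def diagnose(status: Optional[int], body: str) -> str:
    """Map a probe result to a hint. status is None if the connection failed."""
    if status is None:
        return "Zotero does not appear to be running"
    if status == 404 and "No endpoint found" in body:
        return "Zotero is running, but the zotxt add-on does not seem to be installed"
    if status == 403:
        return "Request rejected; check the User-Agent header"
    if status == 200:
        return "zotxt is installed and responding"
    return f"Unexpected response: HTTP {status}"


def probe(url: str = "http://127.0.0.1:23119/zotxt/version",
          timeout: float = 2.0) -> Tuple[Optional[int], str]:
    """Query the version endpoint and return (status, body)."""
    # Hypothetical user agent; anything not starting with "Mozilla/" should do.
    request = urllib.request.Request(url, headers={"User-Agent": "zotxt-probe/0.1"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as error:
        # Raised for 4xx/5xx responses; the body is still readable.
        return error.code, error.read().decode("utf-8", "replace")
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, and similar.
        return None, ""
```

Running print(diagnose(*probe())) on the machine where Zotero runs gives a one-line hint about which piece is missing.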

egh commented 4 years ago

@ewa Thank you! More recent versions of zotxt-emacs are intended to display a more helpful error message, namely "Zotxt version endpoint not found; is Zotero running and zotxt installed?" Did you not see that error?