odkr / pandoc-zotxt.lua

Pandoc filter that looks up bibliographic data for citations in Zotero.
MIT License

Doesn't work well with Zotero >= v5.0.71 #1

Closed by odkr 5 years ago

odkr commented 5 years ago

Zotero v5.0.71 introduces a security mechanism that aims to block 'browsers' from accessing its API. It appears to define a 'browser' as any HTTP user agent that identifies as "Mozilla"; more precisely, any user agent whose identification string starts with "Mozilla/" (lines 439–453 in server_connector.js). So, judging from server_connector.js, HTTP requests to Zotero's API have to be made either via an HTTP user agent that doesn't identify as "Mozilla" or, alternatively, with the HTTP header Zotero-Allowed-Request set.

By default, Pandoc doesn't set any request headers. Still, it appears Zotero treats it as a 'browser'. This isn't because Pandoc identifies as Mozilla, however, but because Pandoc doesn't identify as any user agent at all. This triggers:

Error: this.headers['user-agent'] is undefined
Source File: chrome://zotero/content/xpcom/server.js
Line: 434
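
The behaviour described above can be modelled roughly as follows. This is a Python sketch of the logic as inferred in this issue, not Zotero's actual JavaScript; the function name `zotero_accepts` and the dictionary of lower-cased header names are hypothetical, chosen for illustration only.

```python
def zotero_accepts(headers):
    """Rough model of the request check described in this issue.

    `headers` maps lower-cased HTTP header names to their values.
    This is an assumption-laden sketch, not Zotero's code.
    """
    # A request without a User-Agent fails right here, mimicking
    # "this.headers['user-agent'] is undefined" (as a KeyError).
    user_agent = headers["user-agent"]

    # Anything identifying as "Mozilla/..." is treated as a browser.
    looks_like_browser = user_agent.startswith("Mozilla/")

    # Browsers are only let through if they send the override header.
    has_override = "zotero-allowed-request" in headers

    return (not looks_like_browser) or has_override
```

In this model, a client that sends no User-Agent header never reaches the browser check at all, which matches the error shown above.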

pandoc-zotxt.lua, however, uses Pandoc's pandoc.mediabag.fetch function to retrieve data via HTTP (defined in MediaBag.hs, it ultimately calls openURL in Class.hs), and neither pandoc.mediabag.fetch nor openURL allows setting HTTP headers. Apparently, Pandoc sets headers for HTTP requests globally (see CommonState and stRequestHeaders in Class.hs), but PANDOC_STATE is read-only: PANDOC_STATE.request_headers['Zotero-Allowed-Request'] = 1 has no effect (PANDOC_STATE.request_headers is still empty afterwards).

(Also, pandoc.mediabag.fetch returns ("", ""), i.e., two empty strings, rather than either an error message from Zotero or (nil, nil), as the documentation would suggest. This makes debugging harder.)

Lua's standard library is just a thin layer over ANSI C, so it offers no other library or function for connecting to a socket.

This explains https://github.com/egh/zotxt/issues/11.

Fixing this will require a change in Zotero or Pandoc.

odkr commented 5 years ago

Passing --request-header User-Agent:"Pandoc/2" to pandoc works around the problem. Passing --request-header Zotero-Allowed-Request:X and --request-header X-Zotero-Connector-API-Version:0 doesn't help, because the bug is caused by the missing User-Agent header. Passing --request-header User-Agent:"Mozilla/X" together with --request-header Zotero-Allowed-Request:X works, too.
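
For anyone calling pandoc from a script, the workaround could be wired in roughly like this. This is only a sketch: the helper name `pandoc_argv`, its parameters, and the input file name `paper.md` are hypothetical, and the filter path assumes pandoc-zotxt.lua sits in the working directory.

```python
def pandoc_argv(source, filter_path="pandoc-zotxt.lua", user_agent="Pandoc/2"):
    """Build a pandoc command line that sends an explicit User-Agent.

    Hypothetical helper for illustration; it only assembles the
    argument list and does not run pandoc itself.
    """
    return [
        "pandoc",
        "--lua-filter", filter_path,
        # The workaround from this issue: identify as something
        # that does not start with "Mozilla/".
        "--request-header", f"User-Agent:{user_agent}",
        source,
    ]


# Example: the argument list for a (hypothetical) file "paper.md".
argv = pandoc_argv("paper.md")
```

The resulting list can be handed to `subprocess.run(argv, check=True)`; building it as a list avoids any shell quoting issues around the header value.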

odkr commented 5 years ago

Reported to Zotero. See https://forums.zotero.org/discussion/78502/http-user-agents-that-dont-identify-fail-for-zotero-v5-0-71

odkr commented 5 years ago

The Zotero team promised to fix this with v5.0.73. I updated the manual page to address this (commit https://github.com/odkr/pandoc-zotxt.lua/commit/30d8edbf9bfc8abb9f9fe4561cb421a0354f04cd).