zotero / translation-server

A Node.js-based server to run Zotero translators

Internal Server Error response for JSTOR "stable" links #87

Closed: mvolz closed this issue 5 years ago

mvolz commented 5 years ago

curl -d 'https://www.jstor.org/stable/43203215' -H 'Content-Type: text/plain' http://127.0.0.1:1969/web

(3)(+0000000): Translators initialized with 541 loaded

(3)(+0000004): Listening on 0.0.0.0:1969

(3)(+0006978): HTTP GET https://www.jstor.org/stable/43203215

(1)(+0000213): Error: HTTP request to https://www.jstor.org/stable/43203215 rejected with status 403

  InternalServerError: An error occurred retrieving the document
      at Object.throw (/home/marielle/code/zotero/node_modules/koa/lib/context.js:97:11)
      at module.exports.WebSession.handleURL (/home/marielle/code/zotero/src/webSession.js:196:19)
      at process.internalTickCallback (internal/process/next_tick.js:77:7)

I suspect this may be an issue with how JSTOR redirects its stable links.

dstillman commented 5 years ago

JSTOR is just aggressive about blocking. The underlying message for a 403 is "Our systems have detected unusual traffic activity from your network. Please complete this reCAPTCHA to demonstrate that it's you making the requests and not a robot." I can get that pretty easily by making some sample curl requests, even with a browser UA. But I just tried the above URL from an active t-s installation and it worked fine, so there's not really anything to fix here.
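For anyone who wants to tell this kind of block apart from other upstream failures, here is a minimal sketch. It is not part of translation-server: `classifyUpstreamResponse` and `BLOCK_MARKER` are hypothetical names, and the marker string is simply taken from the message quoted above, so it may not match what JSTOR returns in every case.

```javascript
// Hypothetical helper, not part of translation-server.
// Assumption: the block page contains the phrase quoted in the comment above.
const BLOCK_MARKER = 'unusual traffic activity';

// Classify an upstream response given its HTTP status and body text.
function classifyUpstreamResponse(status, body) {
  const text = typeof body === 'string' ? body : '';
  // A 403 whose body carries the reCAPTCHA notice means the remote site
  // is blocking the requesting network, not that the server is broken.
  if (status === 403 && text.includes(BLOCK_MARKER)) {
    return 'blocked-by-captcha';
  }
  // Any other 4xx/5xx status is treated as a generic upstream failure.
  if (status >= 400) {
    return 'upstream-error';
  }
  return 'ok';
}
```

A caller could use the result to log or report a block separately from a generic failure, which makes it easier to see that the cause is the requesting network rather than the URL itself.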

mvolz commented 5 years ago

Hmm, it blocks me locally and it's also blocking us in prod. Thanks for your help!