zotero / translation-server

A Node.js-based server to run Zotero translators
Other
122 stars 52 forks source link

Fix gzip problem and missing package.json #52

Closed mrtcode closed 5 years ago

akosiaris commented 5 years ago

For what is worth, this change broke our installation. After some git bisecting trying to figure out the culprit when testing this (thankfully before upgrading to production) we ended up in this change.

Removing the gzip: true, line fixes the issue for us. The root cause of the issue seems to be more complicated of course as it seems to be some weird issue between zlib1g 1:1.2.8.dfsg-5 and nodejs 10.4.0~dfsg-1+wmf2 on Debian Stretch. Node 8.11.1~dfsg-2~bpo9+1 does not exhibit this behavior, neither does zlib1g 1:1.2.11.dfsg-1 (the one in Debian Buster). Our guess is that neither do nodejs versions released from upstream as they bundle zlib.

We are solving this in our setup for now by reverting this commit in order to proceed, but we are hoping for some more information as to what exact issue was fixed by this commit, as there is none in the commit message nor the PR.

mrtcode commented 5 years ago

That's strange. gzip: true flag was necessary because some websites i.e. amazon.com, don't like requests without accept-encoding: gzip and trigger an anti-bot protection. And of course reducing response size isn't a bad idea either.

akosiaris commented 5 years ago

That's strange. gzip: true flag was necessary because some websites i.e. amazon.com, don't like requests without accept-encoding: gzip and trigger an anti-bot protection. And of course reducing response size isn't a bad idea either.

True. But I guess the inverse will also be true for some websites. In fact if https://w3techs.com/technologies/details/ce-gzipcompression/all/all is to be believed roughly 25% of websites out there don't use gzip compression. That number sounds quite large, it may warrant more investigation and corrective actions taken.

Anyway as I 've said in my comment, the bug doesn't seem to be in zotero, but rather some weird interaction between specific zlib and nodejs versions.

Thanks for sharing the bit about amazon.com not liking requests without accept-encoding: gzip.

mrtcode commented 5 years ago

If a website doesn't support a gzip compression, the flag doesn't have any effect on it either. request.js is only decoding the response content if there is a Content-Encoding: gzip response header. Also all browsers are requesting gzipped content by default. So with this flag we are just mimicking what browsers are doing anyway. Strange why 'request.js' isn't doing this by default.

akosiaris commented 5 years ago

If a website doesn't support a gzip compression, the flag doesn't have any effect on it either.

Yes indeed. accept-encoding: gzip just advertises to the server that the client is capable of gzip content and then content negotiation happens. Defaulting to true has no ill-effects.

request.js is only decoding the response content if there is a Content-Encoding: gzip response header. Also all browsers are requesting gzipped content by default. So with this flag we are just mimicking what browsers are doing anyway. Strange why 'request.js' isn't doing this by default.

Reading the code I think that in fact, setting gzip: true does something slightly sinister. Per https://github.com/request/request/blob/df346d8531ac4b8c360df301f228d5767d0e374e/request.js#L386 it uses the gzip flag to advertise it is also capable of deflate compression which is older and very seldom used. Overall it's probably ok, it's just weird that setting a flag called gzip also enables a different compression algorithm. Also note that Brotli (which is newer and not yet widely deployed) is not supported in request.js.