Closed dhimmel closed 5 years ago
Other BigThink URLs also seem slow, although perhaps not as slow, when queried via translation-server (but not via other means):
curl \
--header "Content-Type: text/plain" \
--data 'https://bigthink.com/politics-current-affairs/new-estimate-deaths-from-us-nuclear-tests' \
'https://translate.manubot.org/web'
The default timeout value on Nginx is 60 seconds. We have increased it to 300 seconds and see whether the timeout error will be gone.
src/webSession.js
sets a variable SERVER_TRANSLATION_TIMEOUT
to 30, but it doesn't seem to be used anywhere.
When I run a translation-server locally, and then run:
curl \
--header "Content-Type: text/plain" \
--data 'https://bigthink.com/neurobonkers/a-pirate-bay-for-science' \
'http://127.0.0.1:1969/web'
The response is slow but not that slow... perhaps like 5 seconds which is acceptable compared to 60+ seconds we were seeing on our remote server.
Make sure you're using the latest version of translation-server.
Translating that page takes 0.6 seconds for me.
$ time ./translate_url 'https://bigthink.com/politics-current-affairs/new-estimate-deaths-from-us-nuclear-tests'
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 1969 (#0)
> POST /web HTTP/1.1
> Host: 127.0.0.1:1969
> User-Agent: curl/7.54.0
> Accept: */*
> Content-Type: text/plain
> Content-Length: 87
>
* upload completely sent off: 87 out of 87 bytes
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=utf-8
< Content-Length: 452
< Date: Fri, 14 Dec 2018 22:41:05 GMT
< Connection: keep-alive
<
* Connection #0 to host 127.0.0.1 left intact
[{"key":"56PVEUX8","version":0,"itemType":"webpage","creators":[],"tags":[],"title":"At least 340,000 Americans died from radioactive fallout between 1951 and 1973","websiteTitle":"Big Think","date":"2018-12-14T16:32:16","url":"https://bigthink.com/politics-current-affairs/new-estimate-deaths-from-us-nuclear-tests","abstractNote":"Domestic nuclear testing wreaked havoc on thousands of families.","language":"en","accessDate":"2018-12-14T22:41:05Z"}]
real 0m0.687s
user 0m0.006s
sys 0m0.006s
I updated to the latest translation-server, using the following command (& output):
translate@translation-server:~/translation-server$ git pull --ff-only --recurse-submodules
remote: Enumerating objects: 7, done.
remote: Counting objects: 100% (7/7), done.
remote: Compressing objects: 100% (1/1), done.
remote: Total 4 (delta 3), reused 3 (delta 3), pack-reused 0
Unpacking objects: 100% (4/4), done.
From https://github.com/zotero/translation-server
3c80836..cd8b177 master -> origin/master
Fetching submodule modules/translators
From https://github.com/zotero/translators
f9de522..5e4b941 master -> origin/master
Fetching submodule modules/zotero
Fetching submodule modules/zotero/chrome/content/zotero/locale/csl
Fetching submodule modules/zotero/styles
Fetching submodule modules/zotero/translators
From git://github.com/zotero/translators
f9de522..5e4b941 master -> origin/master
Updating 3c80836..cd8b177
Fast-forward
package-lock.json | 6 +++---
package.json | 2 +-
2 files changed, 4 insertions(+), 4 deletions(-)
I restarted translation-server. Then I ran:
time ./translate_url 'https://bigthink.com/politics-current-affairs/new-estimate-deaths-from-us-nuclear-tests'
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 1969 (#0)
> POST /web HTTP/1.1
> Host: 127.0.0.1:1969
> User-Agent: curl/7.58.0
> Accept: */*
> Content-Type: text/plain
> Content-Length: 87
>
* upload completely sent off: 87 out of 87 bytes
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=utf-8
< Content-Length: 452
< Date: Sat, 15 Dec 2018 00:50:19 GMT
< Connection: keep-alive
<
* Connection #0 to host 127.0.0.1 left intact
[{"key":"NMX64QGU","version":0,"itemType":"webpage","creators":[],"tags":[],"title":"At least 340,000 Americans died from radioactive fallout between 1951 and 1973","websiteTitle":"Big Think","date":"2018-12-14T16:32:16","url":"https://bigthink.com/politics-current-affairs/new-estimate-deaths-from-us-nuclear-tests","abstractNote":"Domestic nuclear testing wreaked havoc on thousands of families.","language":"en","accessDate":"2018-12-15T00:50:19Z"}]
real 0m12.638s
user 0m0.007s
sys 0m0.008s
12 seconds total, so I think the problem is still there.
Did you run npm i
?
Running npm install
fixed it. The output of that command was:
added 7 packages from 10 contributors, removed 16 packages, updated 10 packages and audited 450 packages in 9.689s
found 0 vulnerabilities
Now both BigThink URLs complete in ~1 second.
Interesting that the having outdated dependencies would cause translation-server to slow down but still work. Thanks @dstillman for the insight. I've updated our server update instructions at https://github.com/greenelab/manubot/issues/82#issuecomment-447037867 to include the npm install
command.
Post update, our CI tests showed that the following web query is no longer working:
curl --verbose \
--header "Content-Type: text/plain" \
--data 'https://nyti.ms/1NuB0WJ' \
'https://translate.manubot.org/web'
Here is the corresponding translation-server stdout log:
(3)(+0197565): HTTP GET https://nyti.ms/1NuB0WJ
(1)(+0015219): Error: ESOCKETTIMEDOUT
Error: ESOCKETTIMEDOUT
at ClientRequest.<anonymous> (/home/translate/translation-server/node_modules/request/request.js:812:19)
at Object.onceWrapper (events.js:313:30)
at emitNone (events.js:106:13)
at ClientRequest.emit (events.js:208:7)
at TLSSocket.emitTimeout (_http_client.js:711:34)
at Object.onceWrapper (events.js:313:30)
at emitNone (events.js:106:13)
at TLSSocket.emit (events.js:208:7)
at TLSSocket.Socket._onTimeout (net.js:420:8)
at ontimeout (timers.js:482:11)
Interesting that the having outdated dependencies would cause translation-server to slow down but still work.
It was using an older JSDOM that was much slower.
That URL works for me in 2.4 seconds.
The NYT URL command from https://github.com/zotero/translation-server/issues/63#issuecomment-447530844 now works for me. Perhaps it was an upstream error or we had exceeded some NYT rate limit.
We noticed that we were getting intermittent
504 Gateway Time-out
errors for some translation-server queries in https://github.com/greenelab/manubot/pull/87#issuecomment-447393765. We've configured translation-server behind an nginx reverse proxy. When the error occurs, the response from https://translate.manubot.org is:This can be triggered by running the following command many times:
Here is are the log output from the translation-server (comment indicates where the delay happens, when we watch live with
tail --follow
):If I run
curl https://bigthink.com/neurobonkers/a-pirate-bay-for-science
locally or on our translation-server's system, there is no delay getting the response. Therefore, I don't know why HTTP GET is taking so long when performed by translation-server. Perhaps something else is going on that is causing the delay? CCing @dongbohu from our team.