zotero / translation-server

A Node.js-based server to run Zotero translators

Issues in production - memory exhaustion, segfaults #68

Closed mvolz closed 5 years ago

mvolz commented 5 years ago

We've been having some issues with this in production, namely memory exhaustion and segfaults. Unfortunately, I can't provide much more information or detail than that at present. Have you been having similar issues?

We had memory exhaustion issues with the older version as well; memory just filled up more slowly.

Addressing issue #2 would probably be a good first step toward diagnosing this.

dstillman commented 5 years ago

We're actually running it in AWS Lambda, so our environment is pretty different.

Are you on the latest version? What does your environment look like (Node version, available memory)? Can you tell how long from start to OOM, and approximately how many requests it's fulfilling in that time?

We'll look into whether we can reproduce any memory leaks, but it'd be good to make sure we're looking in a similar environment.
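
For gathering the details asked about above (Node version, memory, uptime, request count), a minimal sketch using only Node's built-in `process` and `os` modules might look like the following. The names `countRequest` and `memorySnapshot` are hypothetical helpers invented for illustration; they are not part of translation-server.

```javascript
const os = require('os');

// Hypothetical counter: call countRequest() from the request handler.
let requestCount = 0;

function countRequest() {
  requestCount += 1;
}

// Return a plain object describing the process at this moment.
function memorySnapshot() {
  const { rss, heapUsed, heapTotal } = process.memoryUsage();
  const toMB = (bytes) => Math.round(bytes / 1024 / 1024);
  return {
    nodeVersion: process.version,
    uptimeSeconds: Math.round(process.uptime()),
    requests: requestCount,
    rssMB: toMB(rss),
    heapUsedMB: toMB(heapUsed),
    heapTotalMB: toMB(heapTotal),
    systemTotalMB: toMB(os.totalmem()),
  };
}

// Example: log one snapshot per minute without keeping the process alive.
// setInterval(() => console.log(JSON.stringify(memorySnapshot())), 60000).unref();
```

Logging a snapshot at intervals should make it possible to estimate the time from start to OOM and roughly how many requests were served in that window.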

dstillman commented 5 years ago

We've been able to reproduce OOM conditions by pointing translation-server at very large files (e.g., ISOs, or multiple concurrent large PDFs), so that might be what you're seeing. We have a fix in progress that should be ready shortly.

dstillman commented 5 years ago

OK, we've limited the upstream size to 50 MB (which is probably much bigger than it needs to be, so we could consider lowering that further), and we now reject (and avoid memory-intensive parsing attempts for) documents that aren't HTML or XML. Those will both now return 400.

There may be other things we can do, but this will hopefully address most of the problems you were seeing.
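
The two checks described above (a cap on upstream size and a filter on document type) could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual translation-server change: `isAllowedContentType`, `readLimited`, the exact list of accepted media types, and the `status` property on the error are all hypothetical.

```javascript
// Hypothetical names throughout; not the translation-server implementation.
const MAX_BYTES = 50 * 1024 * 1024; // the 50 MB cap mentioned above

// Accept only HTML or XML media types (assumed list, for illustration).
function isAllowedContentType(header) {
  if (!header) return false;
  // Drop parameters such as "; charset=utf-8" before comparing.
  const type = header.split(';')[0].trim().toLowerCase();
  return type === 'text/html'
    || type === 'text/xml'
    || type === 'application/xml'
    || type.endsWith('+xml');
}

// Read an (async) iterable of Buffers, failing as soon as the running
// total exceeds maxBytes so the full body is never held in memory.
async function readLimited(stream, maxBytes = MAX_BYTES) {
  const chunks = [];
  let total = 0;
  for await (const chunk of stream) {
    total += chunk.length;
    if (total > maxBytes) {
      const err = new Error('Upstream response exceeds size limit');
      err.status = 400; // assumed convention for mapping to an HTTP 400
      throw err;
    }
    chunks.push(chunk);
  }
  return Buffer.concat(chunks);
}
```

Checking the content type before reading the body, and counting bytes while streaming rather than after buffering, is what keeps a large PDF or ISO from being loaded into memory in the first place.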

mvolz commented 5 years ago

Great, we'll give it a try after the holiday :).

mvolz commented 5 years ago

Looking much better now, thank you so much! Loading those large PDFs was what was really knocking things over. I'm closing this for now :).