elifesciences / jats-validator-docker

JATS4R Schematron validation as a Docker web service
https://jats-validator.onrender.com

Container consumes huge amount of memory #3

Open NuclearRedeye opened 4 years ago

NuclearRedeye commented 4 years ago

There is a memory leak somewhere inside the container that causes memory consumption to balloon until all available memory is exhausted and the container stalls.

Steps to reproduce

It's quite straightforward to reproduce the problem: run the container, then use curl to hit it with requests until you get a response other than an HTTP 200.

  1. Start the container using the following.
    docker run --rm --memory="2.5g" --memory-swap="2.5g" -p 4000:80 jats-validator
  2. In another terminal, monitor the memory using...
    docker stats <container id>
  3. In a third terminal, use curl to hit it with requests, e.g.
    curl --write-out %{http_code} --silent --output /dev/null -F schematron=elife-final -F xml=@elife45905.xml http://localhost:4000/schematron/
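
Step 3 can be automated with a small loop. This is only a sketch, assuming bash and the same file name and port as above; `send_request` and `hammer_until_failure` are hypothetical helper names, not part of the project.

```shell
#!/usr/bin/env bash

# Hypothetical helper: sends the request from step 3 and prints only the HTTP status code.
send_request() {
  curl --write-out '%{http_code}' --silent --output /dev/null \
    -F schematron=elife-final -F xml=@elife45905.xml \
    http://localhost:4000/schematron/
}

# Hypothetical helper: repeats send_request until the status is not 200,
# or until the given maximum number of attempts (default 1000) is reached.
hammer_until_failure() {
  local max="${1:-1000}" count=0 status
  while [ "$count" -lt "$max" ]; do
    status="$(send_request)"
    count=$((count + 1))
    if [ "$status" != "200" ]; then
      echo "request $count returned HTTP $status"
      return 0
    fi
  done
  echo "no failure after $count requests"
  return 1
}

# Usage (with the container from step 1 running):
#   hammer_until_failure 500
```

Note that curl reports a status of 000 when it cannot connect at all, so the loop also stops if the container has died outright.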

In the stats terminal you will see memory usage follow a sawtooth pattern that trends upward because of the leak, so it won't take too many requests before you exhaust the memory. When that happens, the OOM killer in the container kicks in, and the behaviour is unpredictable, depending on which process(es) it decides to kill. In any case, the request fails.

NuclearRedeye commented 4 years ago

The issue is related to the JET runtime inside the container, and the way that the PHP API interacts with the SaxonHE/C API. It's not cleaning up the processes it creates, so the process count and memory consumption rise with each request. You can view the bug at https://saxonica.plan.io/issues/2055
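
One way to watch the process count and memory side by side between requests is sketched below. It assumes bash and the standard `docker top` and `docker stats` commands; `sample_container` is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints the current process count and memory usage
# for one container. Pass the container id or name as the first argument.
sample_container() {
  local container="$1" procs mem
  # docker top prints one header line followed by one line per process,
  # so skip the header before counting.
  procs="$(docker top "$container" | tail -n +2 | wc -l | tr -d ' ')"
  # --no-stream takes a single sample instead of updating continuously.
  mem="$(docker stats --no-stream --format '{{.MemUsage}}' "$container")"
  echo "processes=$procs memory=$mem"
}

# Usage:
#   sample_container <container id>
```

Running it after every few requests should show whether both figures climb together.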

NuclearRedeye commented 4 years ago

Bug 2055 was supposed to be resolved in Saxon 1.2.1, but having updated the code to use that version I still see the issue. It seems that the fix caused another issue with PHP, hence it was reverted. See https://saxonica.plan.io/issues/4371

NuclearRedeye commented 4 years ago

I have pushed the v1.2.1 work to wip/update-to-saxonhe-1.2.1. It's technically no worse than before so it 'could' be merged, but I'm holding fire for now. See https://github.com/elifesciences/jats-validator-docker/tree/wip/update-to-saxonhe-1.2.1