hbz / lobid

Linking Open Bibliographic Data
https://lobid.org/
Eclipse Public License 2.0

Memory leak in Play 2.3.4 #188

Open dr0i opened 9 years ago

dr0i commented 9 years ago

Play's Promise is leaking memory under certain circumstances. This happens in conjunction with delivering large amounts of data (> 1 GB, see #176). It's a different issue from the one described here, and I haven't found it documented anywhere yet. Further investigation is needed, with more heap analysis in MAT.

dr0i commented 8 years ago

Found that the problem is indeed that data is built faster than it is served. With #234 enabled, the served data shrinks to roughly 1/15 of its size (using compressed N-Triples), so data is no longer built faster than the client can consume it. Closing this ticket.

dr0i commented 8 years ago

Reopening. My last comment is not true. There is a memory leak in Promise, as this heap profiling done with MAT in Eclipse shows: [screenshot: memory leak in Promise]. The pattern repeats more than 4 million times and thus uses up the 6 GB of RAM.

See also this discussion:

Each call to loop adds another link to the chain, from fm(n) back to fm(n-1). The chain remains in place until the future produced by the final iteration of loop completes. Make n big enough and an OOME is the result. :(

With Promise being broken, #176 cannot work for data larger than the RAM of the Play instance.
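To make the quoted explanation concrete, here is a minimal, hypothetical Scala sketch (not lobid code) of such a recursive flatMap loop: each iteration links a new Future back to the previous one, and the whole chain stays reachable until the final iteration completes, so a large n ends in an OOME on affected Scala versions.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import ExecutionContext.Implicits.global

object PromiseChainLeak {

  // Each recursive call chains a new Future onto the previous one via flatMap,
  // so fm(n) keeps a reference back to fm(n-1); nothing in that chain can be
  // garbage-collected before the final iteration's Future completes.
  def loop(n: Long): Future[Long] =
    Future(n).flatMap { i =>
      if (i <= 0) Future.successful(i) // last link; only now can the chain be freed
      else loop(i - 1)                 // adds another link to the chain
    }

  def main(args: Array[String]): Unit = {
    // Make n big enough and, on affected Scala versions, this ends in an OOME,
    // matching the > 4 million repeated entries seen in the MAT heap dump.
    Await.result(loop(10000000L), Duration.Inf)
  }
}
```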

dr0i commented 7 years ago

It would be nice if this were fixed with the new web frontend, see https://github.com/hbz/lobid-resources-web.

fsteeg commented 7 years ago

Does this issue still appear for us in production? In the thread [1] linked from the thread that you linked [2], it says the issue was fixed in Scala 2.10.3. For lobid we currently use 2.11.2. Could we run lobid-staging with less RAM and see what happens? Or is it already configured with less RAM and still shows the problem?

[1] https://groups.google.com/forum/#!topic/play-framework/fHTe966ni5o [2] https://groups.google.com/forum/#!topic/play-framework-dev/58VZD-YXdJw

dr0i commented 7 years ago

When did you upgrade to the newer Scala version?

fsteeg commented 7 years ago

Looking at https://github.com/hbz/lobid/commits/master/project/Build.scala, the upgrade was on Sep 26, 2014 (see https://github.com/hbz/lobid/commit/ee6c24751a0ab693c3a7aeaa76308246e5bdffee), i.e. before you opened the bug, so it should still be an issue. To work on this I'd need some details on how to reproduce it and how/where you changed the RAM settings.

dr0i commented 7 years ago

Reproduce like this:

curl --header "Accept: application/ld+json" "http://lobid.org/resource?q=*&scroll=19000101"

RAM is set via monit (Java -Xmx).

fsteeg commented 7 years ago

I can't reproduce the problem, it seems to work. Here is what I did:

  1. Removed -Xmx8192m,-Xms6144m for process lobid-staging in /etc/monit/conf.d/play-instances.rc.
  2. Reloaded monit config with sudo /etc/init.d/monit reload.
  3. Restarted lobid-staging via sol@quaoar1:~/git/lobid-staging$ sh restart.sh lobid-staging.
  4. Verified it uses the default 1GB heap via ps aux | grep 7001.
  5. Followed the log via sol@quaoar1:~/git/lobid-staging$ tail -f target/universal/stage/logs/application.log.
  6. Locally called curl --header "Accept: application/ld+json" "http://test.lobid.org/resource?q=*&scroll=19000101" >> all.json.

I'm now at 2GB received and it seems to be running smoothly.

fsteeg commented 7 years ago

I aborted the download at 10GB, nothing unusual in the log. I think we can close this.

dr0i commented 7 years ago

Wait, I dimly remember that it makes a difference where you call the URL from (re: serving data more slowly than building it). I will try again tomorrow. BTW, it's not possible to stop Play from building and serving the stream even if you abort the crawl on the client side. If this is no longer an issue, that would be great (though of course I wonder why it works now), and we could get rid of the restriction of allowing only one scroll at a time.

dr0i commented 7 years ago

Doing curl --header "Accept: text/plain" "http://test.lobid.org/resource?q=*&scroll=19000101" >> all.ntriples

outside the hbz network results in:

Doc 1.558.250, sec:15.232 2017-02-07 15:31:40,869 - [WARN] - from application in pool-12-thread-1 Memory too low. Canceling request!

The Play instance on the staging system had 1 GB of RAM allocated. 1.5M docs (~10% of all docs) in 4 hours and then crashing is not really good (there are slowing routines that try to free some memory). To be tested:
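For context, here is a purely hypothetical sketch of what such a slowing/cancelling routine could look like (the names, threshold, and GC/sleep calls are assumptions, not lobid's actual code): it checks the free JVM heap, tries to slow down and reclaim memory, and signals that the running scroll request should be canceled, which would produce a warning like the one in the log above.

```scala
object MemoryGuard {

  // Hypothetical threshold, not taken from lobid's configuration.
  private val MinFreeBytes: Long = 100L * 1024 * 1024

  // Estimates the currently available heap: unallocated headroom plus
  // free space within the already allocated part.
  private def freeHeapBytes(): Long = {
    val rt = Runtime.getRuntime
    (rt.maxMemory - rt.totalMemory) + rt.freeMemory
  }

  // Returns true if the running request should be canceled because memory is
  // too low; first slows down and asks for a GC to give the client (and the
  // collector) a chance to catch up.
  def shouldCancelRequest(): Boolean =
    if (freeHeapBytes() >= MinFreeBytes) false
    else {
      System.gc()                    // best effort, may or may not help
      Thread.sleep(1000)             // slow down instead of failing immediately
      freeHeapBytes() < MinFreeBytes // still too low -> cancel and log the warning
    }
}
```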

fsteeg commented 6 years ago

If this is still a problem, it's currently not causing a concrete issue; removed the bug tag (no current priority).