Comment by Vilius@BEST.eu.org on 24 Jun 2004 08:55 UTC From: Cristian Bogdan Sent: Monday, June 21, 2004 23:51 To: Priit Potter Cc: itc-install@ Subject: Re: Karamba crashes with new makumba version
this looks very serious. may be a memory leak. or db connections which are never closed.
could probably have been discovered with stress tests...
unfortunately, in the short term, i won't be able to work much on this :(
suggestions:
i guess going back to 0.5.9 without reconverting the data to myisam (0.5.9 doesn't care about the table type) is also an option but i'd have to look more at that.
cristi
Comment by Vilius@BEST.eu.org on 24 Jun 2004 08:56 UTC From: Cristian Bogdan Sent: Tuesday, June 22, 2004 0:27 To: Priit Potter Cc: itc-install@ Subject: Re: Karamba crashes with new makumba version
i just reloaded production. heap usage went down from 155Mb to 49 Mb. as expected "number of connections open" went to 10 (from 26). makumba makes a pool of 10 connections at startup or reload.
http://private.best.eu.org/systemInfo.jsp
so reloading definitely helps. not an elegant solution, but it can be automated nicely (a cron that regularly does: wget http://itc:password@tequila.best.eu.org:17070/manager/reload?path=/).
NOTE: replace password with karamba for the above link to work. i removed the pass intentionally for ppl who click on the link accidentally.
reload works for short term, will investigate in longer term (1 month)
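The cron-plus-wget idea above could also be scripted. The following is a minimal sketch in Python, assuming the Tomcat manager accepts HTTP basic authentication at the reload URL quoted above. The function names `build_reload_request` and `reload_webapp` are made up for illustration, and "password" stays a placeholder as in the link.

```python
import base64
import urllib.parse
import urllib.request


def build_reload_request(host, port, user, password, context_path="/"):
    """Build (but do not send) a request for the manager reload URL.

    Hypothetical helper: the URL shape is copied from the wget example,
    the basic-auth header replaces the user:password@ part of that link.
    """
    query = "path=" + urllib.parse.quote(context_path, safe="/")
    url = f"http://{host}:{port}/manager/reload?{query}"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": "Basic " + token})


def reload_webapp(request, timeout=60):
    """Send the request and return the manager's reply as text.

    The timeout is an assumption, added so that a reload which never
    answers does not block the cron job forever.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

A cron job would then call `reload_webapp(build_reload_request("tequila.best.eu.org", 17070, "itc", "password"))` at whatever interval turns out to be needed.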
please watch systemInfo (esp System Information and Makumba) a bit more and that info will help finding the problem. my main suspect now is the number of connections open. see to what values it rises.
please note: it is normal for the values starting with "data definitions parsed" to grow and to stabilize at some value.
note also that there may simply be a page that does stupid stuff and eats memory...
cristi
Comment by Vilius@BEST.eu.org on 24 Jun 2004 08:57 UTC Gwen reported PA down about 10.25am today.
I have tried to reload:
Karamba$ cdk
Karamba$ ant reload
It does not do anything for 5 min. I cancel the command.
I see that as contradictory to Cristi's statement - "reload helps".
I do "ant stopTomcat", then "ant tomcat".
It does nothing good.
Then I do "ant stopTomcat" again; it produces output like that pasted below.
Then I do "killall -9 java"
And then "ant tomcat&"
And it works nicely.
Heap is 17 MB after 5 minutes after start.
Result of "ant stopTomcat":
karamba@tequila:~/sources/karamba$ ant stopTomcat
Buildfile: build.xml
stopTomcat:
Catalina.stop: java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:305)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:171)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:158)
    at java.net.Socket.connect(Socket.java:426)
    at java.net.Socket.connect(Socket.java:376)
    at java.net.Socket.
Comment by Vilius@BEST.eu.org on 24 Jun 2004 10:23 UTC karamba$ crontab -e
Statistics are saved at: http://stats.best.eu.org/karamba-systeminfo/ (thanks gwen, for link)
Comment by @cristianbogdan on 24 Jun 2004 10:42 UTC comment 3: 17 Mb after server restart is normal. It's the memory it takes for the java.lang, org.makumba and tomcat classes.
We are trying to install optimizeIt at IPLab so we can investigate the memory use of a trial tomcat with the same config
Other things that can be done right away:
Comment by @cristianbogdan on 25 Jun 2004 00:37 UTC i just checked on production and had to restart it again. i really think a more permanent solution should be put in place.
can this be updated more often? every 4h or so? http://stats.best.eu.org/karamba-systeminfo/
i looked at the numbers from the 0.5.9 age, and they seem to be very similar, except that 0.5.10 seems to use more queries. while looking today with optimizeit i realized that the memory used is non-linear because resources are allocated for each query in each connection.
the number of connections open by 0.5.10 is higher probably because connections are kept for longer time, due to the new "contract" that every page executes with one connection (before, each group of mak:list in a page was using one connection)
i think the main reason is the second...
there are 78(!) connections made here http://stats.best.eu.org/karamba-systeminfo/2004-06-24@05:03.html
since lots of resources are allocated per connection, it may be that there is no leak, there simply isn't enough memory for the large number of connections allocated.
to check the number of connections historically, use this: tail -100000 tomcat/logs/server-output.txt | grep -1 createAndCount
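The tail/grep command above can be mirrored in a script if the matches need further processing. This is a minimal sketch in Python, assuming only that the relevant log lines contain the text "createAndCount"; the exact log line format is not shown in this ticket, so nothing beyond that substring is parsed. The function name `grep_with_context` is made up for illustration.

```python
from collections import deque


def grep_with_context(lines, needle, context=1, tail=100_000):
    """Return lines containing `needle`, plus `context` lines around each match.

    Only the last `tail` lines are examined, like `tail -100000 | grep -1`.
    """
    recent = list(deque(lines, maxlen=tail))  # keep only the last `tail` lines
    keep = set()
    for index, line in enumerate(recent):
        if needle in line:
            first = max(0, index - context)
            last = min(len(recent), index + context + 1)
            keep.update(range(first, last))
    return [recent[index] for index in sorted(keep)]


def grep_log_file(path, needle="createAndCount", context=1, tail=100_000):
    """Apply grep_with_context to a log file on disk."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return grep_with_context(handle, needle, context, tail)
```

For example, `grep_log_file("tomcat/logs/server-output.txt")` would return the same lines the shell pipeline prints, minus the `--` separators grep puts between groups.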
possible solutions (bug 76):
a simpler solution:
Comment by @cristianbogdan on 10 Jul 2004 23:33 UTC i just put in production an experimental version that does not cache queries per connection, but per database. that is, no matter how many connections there are, the query cache will not grow. the cache will only grow when there are more pages accessed.
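To illustrate the difference between the two caching schemes, here is a minimal sketch in Python. It is not the makumba code: the classes `Connection` and `Database` and the method `prepare` are hypothetical stand-ins, and a plain `object()` stands for whatever resources a cached query holds.

```python
class Connection:
    """Stand-in for a connection that caches query objects itself (old scheme)."""

    def __init__(self):
        self.query_cache = {}

    def prepare(self, query_text):
        # Each connection builds its own object for every query it sees.
        if query_text not in self.query_cache:
            self.query_cache[query_text] = object()
        return self.query_cache[query_text]


class Database:
    """Stand-in for a database holding one cache shared by all connections (new scheme)."""

    def __init__(self):
        self.query_cache = {}

    def prepare(self, query_text):
        # One object per distinct query, no matter which connection asks.
        if query_text not in self.query_cache:
            self.query_cache[query_text] = object()
        return self.query_cache[query_text]


def count_cached(connections, queries):
    """Run every query on every connection under both schemes; return both counts."""
    per_connection = [Connection() for _ in range(connections)]
    shared = Database()
    for conn in per_connection:
        for query in queries:
            conn.prepare(query)    # old: cached once per connection
            shared.prepare(query)  # new: cached once per database
    old_total = sum(len(conn.query_cache) for conn in per_connection)
    return old_total, len(shared.query_cache)
```

In this toy setup, 16 connections that each run the same 100 queries end up with 1600 cached objects under the per-connection scheme and 100 under the per-database one. Real traffic would not send every query through every connection, so this is an upper bound on the difference rather than a prediction.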
the number of cached queries seems to be important for the memory allocated. for example
http://stats.best.eu.org/karamba-systeminfo/2004-06-23@19:03.html
Heap in use: 161 MB
version 0.5.10.4
Number of connections open 16
Database localhost_mysql_production
per-connection query objects 5616
OQL parsed queries 1606
here are some other samples (compare ratios!)
http://stats.best.eu.org/karamba-systeminfo/2004-07-10@18:00.html
version 0.5.10.5-fewConnections
Number of connections open 10
Database localhost_mysql_production
per-connection query objects 2120
OQL parsed queries 1354
http://stats.best.eu.org/karamba-systeminfo/2004-07-10@23:00.html
version 0.5.10.5-fewConnections
Number of connections open 12
Database localhost_mysql_production
per-connection query objects 2795
OQL parsed queries 1477
the number of per-connection query objects is thus rising a lot in comparison with the number of parsed queries
with the new version:
http://stats.best.eu.org/karamba-systeminfo/2004-07-11@00:30.html
version devel-20040710233447
Database localhost_mysql_production
query objects 241
OQL parsed queries 612
=> there are fewer cached queries compared to queries parsed: just over one third, compared to over double before.
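The ratios being compared can be recomputed from the systemInfo samples quoted in this comment. A small sketch in Python; the numbers are copied from the samples, while the labels and the function name `cached_per_parsed` are made up for illustration.

```python
# (label, cached query objects, OQL parsed queries), as quoted in the samples
SAMPLES = [
    ("0.5.10.4, 2004-06-23 19:03", 5616, 1606),
    ("0.5.10.5-fewConnections, 2004-07-10 18:00", 2120, 1354),
    ("0.5.10.5-fewConnections, 2004-07-10 23:00", 2795, 1477),
    ("devel-20040710233447, 2004-07-11 00:30", 241, 612),
]


def cached_per_parsed(query_objects, parsed_queries):
    """Cached query objects per parsed OQL query."""
    return query_objects / parsed_queries


for label, query_objects, parsed_queries in SAMPLES:
    ratio = cached_per_parsed(query_objects, parsed_queries)
    print(f"{label}: {ratio:.2f} cached objects per parsed query")
```

The three 0.5.10.x samples come out at about 3.50, 1.57 and 1.89 cached objects per parsed query, and the devel build at about 0.39.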
also interesting is that the first 600 queries are parsed in the first hour of server activity, while it takes many more hours (almost 1 day) to encounter (and parse) 800 more queries
Comment by @cristianbogdan on 21 Jul 2004 01:46 UTC closing this, as the soft-cache based code (which will be released as 0.5.11) seems to do pretty well
Reported by Vilius on 24 Jun 2004 08:54 UTC From: Priit Potter Sent: Monday, June 21, 2004 10:18 To: itc-install@best.eu.org Subject: Karamba crashes with new makumba version
Hi!
It is the second time after we put 0.5.10.4 into production that I notice karamba dying because of OutOfMemory error. I restarted it and everything works fine again.
Since I am extremely busy at work at the moment, I just send you the last 1000 lines from the server log, maybe it will help identify the problem.
priit
Migrated-From: http://trac.makumba.org/ticket/709