Open firebird-automations opened 13 years ago
Commented by: @dyemanov
I removed the over-longish firebird.log contents, please attach it as a separate file.
Commented by: @dyemanov
Which exact v2.5.1 build are you using? Also, which platform (win32 / win64)?
All the errors in the log are related to an out-of-memory condition: the server process is out of virtual memory. I suppose this is the 32-bit build and you are experiencing some kind of memory leak. Which FB version did you run before trying v2.5.1?
Commented by: Andre van Zuydam (andrevanzuydam)
Log file with the errors before a crash of Firebird
Commented by: Andre van Zuydam (andrevanzuydam)
We have now tested with Super Classic version of Firebird and the memory on the system is being released correctly! How do we debug memory leaks on Firebird Super Server ?
Commented by: @dyemanov
Have you looked into the MON$MEMORY_USAGE table?
Commented by: Andre van Zuydam (andrevanzuydam)
Ok, we've done some extensive testing and can now reproduce the problem at will.
The problem happens when two client applications connect to the Super Server concurrently. As each client connects, a small increase in memory occurs on fbserver.exe in Task Manager. (Over a week this produces a crash.)
I'm sure this will not happen in standard circumstances, but we have a service which polls the database engine every 10 seconds, and it is this service that causes the build-up of memory once another connection happens. If the service is running by itself, it is happy indefinitely. If a local client to the service or a remote client connects, Firebird starts building up memory.
Assuming these clients stay connected for a long time or reconnect to the database, we find that memory is freed up but about 100 - 200K always stays occupied. Only once all the client applications have disconnected does the Firebird server go back to its normal memory state (about 4600K on our system). As long as two of the clients remain connected the memory builds up; all clients must disconnect before Firebird memory goes back to normal.
If only one client is connected the server is stable and memory does not increase. Does this sound like a shared memory problem ? Super Classic does not have any of these drawbacks and behaves correctly.
What can I do to help debug this ?
Commented by: @dyemanov
Thanks for the information, hopefully it will help us to find this memory leak. I will report back if more input would be required from your side.
Commented by: @dyemanov
While I'm searching for the possible memory leak, could you please re-try SuperServer and monitor the OST (oldest snapshot transaction) counter with gstat -h -- whether it gets stuck or not.
Commented by: Andre van Zuydam (andrevanzuydam)
Hi Dmitry, sorry for the delay in posting, here is a sample of the gstat -h
What exactly should I be looking for here?
Database header page information:
Flags 0
Checksum 12345
Generation 59776
Page size 4096
ODS version 11.2
Oldest transaction 57559
Oldest active 57592
Oldest snapshot 57592
Next transaction 57597
Bumped transaction 1
Sequence number 0
Next attachment ID 2177
Implementation ID 16
Shadow count 0
Page buffers 0
Next header page 0
Database dialect 3
Creation date Apr 12, 2011 15:25:57
Attributes force write
Variable header data:
Sweep interval: 20000
\*END\*
I'm getting a lot of INET/inet_error: read errno = 10054 in my logs, which I do not think is network hardware related. After this happens the clients disconnect from the database and we have to restart the engine. Is this something that I can prevent, or is this a bug ?
Commented by: Andre van Zuydam (andrevanzuydam)
Another log, super server version at another site, memory is building up at a regular pace, about 100KB per transaction, only occasionally seems to free up some, 2 days later Firebird is now using 245MB of RAM, 6 clients connected permanently 24 X 7.
Database "----.FDB"
Database header page information:
Flags 0
Checksum 12345
Generation 135053
Page size 4096
ODS version 11.2
Oldest transaction 78088
Oldest active 120709
Oldest snapshot 96910
Next transaction 121275
Bumped transaction 1
Sequence number 0
Next attachment ID 34393
Implementation ID 16
Shadow count 0
Page buffers 0
Next header page 0
Database dialect 3
Creation date May 4, 2011 15:40:28
Attributes force write
Variable header data:
Sweep interval: 20000
\*END\*
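
To answer "what should I be looking for" in a header like the one above, one option is to extract the four transaction counters and see how far each "Oldest ..." value lags behind "Next transaction". The following is only a minimal sketch: the helper names (`parse_header_counters`, `gaps_to_next`) are hypothetical, and the sample string simply repeats the counters posted above. A gap that keeps growing between successive `gstat -h` runs would suggest that the counter is stuck.

```python
import re

# Counter labels exactly as they appear in the gstat -h output quoted in this thread.
COUNTERS = (
    "Oldest transaction",
    "Oldest active",
    "Oldest snapshot",
    "Next transaction",
)


def parse_header_counters(text):
    """Return {label: int} for each transaction counter found in the header text.

    Works on both one-field-per-line output and output flattened onto one line.
    """
    found = {}
    for name in COUNTERS:
        match = re.search(re.escape(name) + r"\s+(\d+)", text)
        if match:
            found[name] = int(match.group(1))
    return found


def gaps_to_next(counters):
    """Return how far each 'Oldest ...' counter lags behind 'Next transaction'."""
    next_tx = counters["Next transaction"]
    return {
        name: next_tx - value
        for name, value in counters.items()
        if name != "Next transaction"
    }


# Counters copied from the second header posted in this thread.
SAMPLE = (
    "Oldest transaction 78088 Oldest active 120709 "
    "Oldest snapshot 96910 Next transaction 121275"
)

if __name__ == "__main__":
    counters = parse_header_counters(SAMPLE)
    for name, gap in gaps_to_next(counters).items():
        print(f"{name}: {counters[name]} ({gap} behind Next transaction)")
```

Running the same check against headers captured a few hours apart on the same database shows whether "Oldest snapshot" advances together with "Next transaction" or stays behind.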
Commented by: @hvlad
Transaction management is far from perfect. Do your transactions perform many INSERT, UPDATE or DELETE operations ? Do you see the same memory consumption if you set GCPolicy = cooperative ?
Commented by: Andre van Zuydam (andrevanzuydam)
Hi Vlad
I've set the cooperative policy on, I suppose this is how Classic server runs? We do get more performance out of Super Server though and this is why we want to run this. We do perform many inserts while operating as our system is transactional in terms of how the data is stored, updates are very few, delete operations are limited to archiving of a single table to another database. We are using a stored proc to connect to the other database to send the data, could this be where the memory is leaking ?
Some other things we have tested is that a normal RAM cleaner app will bring the memory use down on the Firebird server, this is not ideal.
I am also open to poor programming on my side, how can I test if my transactions are really getting closed ? I definitely call close transaction after I do a query and statement, there is something that bothered me on Firebird 2.5, some of the transactions I opened reported a 501 error of attempting to close an already closed cursor which the same code / client did not report in 2.1, I changed my transaction closing method to use the DSQL_UNPREPARE from a DSQL_DROP or DSQL_CLOSE parameter which "seemed" to fix this problem.
These transactions which returned cursor errors were update or execute statements for stored procs which in most cases do not return results, I had similar problems with update insert statements with returning values, something definitely changed in the client after 2.1 which started this. Perhaps there is a simple explanation for these changes which will allow me to correct my code too ?
Thank you for your help so far.
Commented by: Andre van Zuydam (andrevanzuydam)
The cooperative policy on Firebird Super Server does not seem to work as the memory is still building on Super Server. Classic server works perfectly I might add and is still stable.
Commented by: @hvlad
> I've set the cooperative policy on,
Did you restart Firebird after editing firebird.conf ?
> I suppose this is how Classic server runs?
Not exactly. It disables background garbage collection and the corresponding in-memory structures. As you have a stuck OST number, these in-memory structures are not cleaned up. The suggestion to switch to the cooperative GCPolicy was given to confirm this idea.
> We are using a stored proc to connect to the other database to send the data, could this be where the memory is leaking ?
I doubt it
> Some other things we have tested is that a normal RAM cleaner app will bring the memory use down on the Firebird server, this is not ideal.
A RAM cleaner on a database server machine ? Is this a joke ?
> I am also open to poor programming on my side, how can I test if my transactions are really getting closed ?
On the program side you can ensure that the transaction handle becomes zero. Or you could inspect the MON$ tables to see unreleased transactions and/or statements. You could also try the Trace API and see all the interesting events on the server side.
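
As an illustration of the MON$ approach, here is a minimal sketch that lists transactions still marked active. It assumes a Python DB-API 2.0 cursor from whichever Firebird driver is in use (the driver itself is not shown), and the function name `list_active_transactions` is hypothetical. The query relies on the MON$TRANSACTIONS monitoring table, where MON$STATE = 1 means active.

```python
# Assumption: `cursor` is any DB-API 2.0 cursor connected to the database
# being investigated; no specific Firebird driver is implied here.
ACTIVE_TRANSACTIONS_SQL = """
SELECT MON$TRANSACTION_ID, MON$ATTACHMENT_ID, MON$TIMESTAMP
FROM MON$TRANSACTIONS
WHERE MON$STATE = 1
ORDER BY MON$TIMESTAMP
"""


def list_active_transactions(cursor):
    """Return active transactions, oldest first, as a list of dicts."""
    cursor.execute(ACTIVE_TRANSACTIONS_SQL)
    return [
        {
            "transaction_id": row[0],
            "attachment_id": row[1],
            "started": row[2],
        }
        for row in cursor.fetchall()
    ]
```

If the client believes it has committed everything, only transactions with recent start times should appear; an entry whose start time is hours or days old points at a transaction that was never committed or rolled back.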
> I definitely call close transaction after I do a query and statement, there is something that bothered me on Firebird 2.5, some of the transactions I opened reported a 501 error of attempting to close an already closed cursor which the same code / client did not report in 2.1, I changed my transaction closing method to use the DSQL_DROP which "seemed" to fix this problem.
>
> These transactions which returned cursor errors were update or execute statements for stored procs which in most cases do not return results, I had similar problems with update insert statements with returning values, something definitely changed in the client after 2.1 which started this. Perhaps there is a simple explanation for these changes which will allow me to correct my code too ?
Sure. In v2.5 it is not allowed to close a cursor when you have no cursor :) This is exactly your case: neither UPDATE nor EXECUTE PROCEDURE returns a cursor. But this shouldn't affect the transaction state, unless your code flow does not call commit (or rollback) after such an error.
BTW, DSQL_DROP is NOT a "transaction closing method". It is an option of *statement* close.
> The cooperative policy on Firebird Super Server does not seem to work as the memory is still building on Super Server.
Again, did you restart Firebird after editing firebird.conf ?
> Classic server works perfectly I might add and is still stable.
Commented by: Andre van Zuydam (andrevanzuydam)
Hi Vlad
Definitely restarting Firebird on each config change, memory still building (it is very small, 100K for each instance that gets run). After a week the server will crash or not respond.
The RAM cleaner on the database server was not a joke, only a test to see if the memory was still being accessed. Unfortunately the machines we deploy on are not dedicated servers; we may not have come across this problem on a dedicated server. I must add that I do not have this problem on a Linux server, so it must be a Windows thing ?
Thank you for your replies with regard to the coding, and again, always restarting Firebird when making conf changes. I have also tried running Super Server without Guardian just as an extra test, to no avail.
My next resort is to build a standalone exe to replicate the problem, I think this is something that will help troubleshoot this, so please wait for this as I need to simulate what is happening and then we can work from there ?
Commented by: @dyemanov
Please test the next (tomorrow's) snapshot build. It will have CORE3533 fixed and it could be related to your case.
Submitted by: Andre van Zuydam (andrevanzuydam)
Attachments: firebird.log
Votes: 1
We have a test system running on a 2.5.1 snapshot. Roughly every 7 - 8 days we have to manually restart / initialize the Firebird service, which crashes.
The logs point to a possible network problem, which we have looked at; unfortunately I get INET/inet_error: read errno = 10054 on my localhost development machine every time I work.
One possible cause is perhaps the sweep that failed ? The five clients were connected but could not query the server after such an incident which also makes the problem strange.
The logs of the past week since the last crash and today are below.