FirebirdSQL / firebird

Firebird server, client and tools
https://www.firebirdsql.org/
1.22k stars 214 forks source link

Severe Memory leak using substring in Stored Proc [CORE2596] #3006

Open firebird-automations opened 15 years ago

firebird-automations commented 15 years ago

Submitted by: Calin Pirtea (pcalin)

There seems to be a severe memmory leak in Firebird 2.1.2 and 2.1.3 RC2 related to using substring or conversion between blobs and varchar.

I'm using the latest Firebird 2.1.3.18185 and running the following script:

create or alter procedure test_crash2_FB2_1 as begin end;

create or alter procedure test_crash_FB2_1 as begin end;

recreate table tbl_test_crash_FB2_1 ( PK_Field integer not null primary key , test_memo blob (65535, 1));

create generator gen_test_crash_FB2_1;

set generator gen_test_crash_FB2_1 to 0;

commit;

set term ^;

create or alter procedure test_crash_FB2_1 as declare my_loop_count integer; begin my_loop_count = 100000; while (my_loop_count > 0) do begin insert into tbl_test_crash_FB2_1 (PK_Field, test_memo) values (gen_id(gen_test_crash_FB2_1, 1), 'asd fa dsf asdf ksdfjhgklsjhgfdklj shfdgkj hsjkgfd hklsjgfd h klsjhfdgkl sjhdfgkl jhskgfdj h');

my\_loop\_count = my\_loop\_count \- 1;

end end^

create or alter procedure test_crash2_FB2_1 as declare my_loop_count integer; declare my_test_memo blob (65535, 1); declare my_pk integer; declare my_test_varchar varchar(4096); begin execute procedure test_crash_FB2_1; my_loop_count = 100000; while (my_loop_count > 0) do begin for select PK_Field, test_memo from tbl_test_crash_FB2_1 into :my_pk, :my_test_memo do begin my_test_varchar = substring(:my_test_memo from 1 for 4096); end my_loop_count = my_loop_count - 1; end end^

commit^

set term ;^

execute procedure test_crash2_FB2_1;

commit;

I used a virtual machine with 128 MB RAM and 128 MB VirtualMemory. After a few seconds the server crashes. The production environment where this was reported has 4096MB Ram and 4096MB Virtual Memory and it takes over 8 hours to crash the server. We are in desperate need of a fix since one of our largest customer is experiencing this DoS. At this stage we have no workarounds for the problem.

Thank you in advance,

Calin Pirtea Communicare Systems Pty Ltd.

firebird-automations commented 15 years ago

Commented by: @hvlad

There is no memory leak. Each substring(blob...) creates a temporaty blob which is released only at transaction end. Each such blob occupied in memory 1 database page amount of bytes. Your code creates a lot of temporary blob's (100000). So, you need 400MB of memory just for this blobs (if your page size is 4KB).

In Firebird 2.5 temporary blobs stored on disk (in temporary files) so memory requirement are much lower.

In this case i see that engine could understand that temporary blob, created by substring, can be released immediately after implicit CAST to varchar.

As for FB 2.1 i recommend to lower amount of temporary blobs created within one transaction to reduce memory consumption.

firebird-automations commented 15 years ago

Commented by: Calin Pirtea (pcalin)

Hi Vlad,

I disagree with you because if I do not use substring the problem goes away.

IOW, my_test_varchar = :my_test_memo from 1 for 4096; works perfectly without leaking the blob. Not releasing a blob that is obviously not used anymore is called a memory leak no matter the excuse.

Cheers, Calin.

firebird-automations commented 15 years ago

Commented by: @asfernandes

You're right and wrong at same time.

You're right because if the temporary blob is not accessible anymore, it should not consume memory.

You're wrong because it's not really a leak, it's know and it goes away when you commit the transaction.

There is a way to fix that, using ref counting. Don't expect that happening soon (i.e., even in FB 3). To be done right, it would require some major changes on how descriptors are used internally.

In 2.5 things are much better using the temp space. However, AFAIR, it has not been a simple change, so I'm not sure it could be backported.

firebird-automations commented 15 years ago

Commented by: Calin Pirtea (pcalin)

Hi Adrian,

I agree with you. I was hoping it can be backported though.

Firbeird 1.5 works perfectly with this so it must be a bug introduced in 2.0 or 2.1. Currently I've got thousands of metadata lines to change to work around this problem because this is all code that worked in 1.5.

The major drawback is that I must change all this code when in fact it will work just fine in 2.5 (going on your words, I haven't tested it).

Can you recommend anything other than re-writing all my code?

Cheers, Calin.

firebird-automations commented 15 years ago

Commented by: @asfernandes

Humm, from this POV, this is a regression, and I don't see a way for you to not rearchitect your code. In 2.1 SUBSTRING returns a blob while in older versions it returns string.

I've verified the 2.5 fix (CORE1658), and it's not big code changes. But you know, there is always late fixes to complicate things.

So I'd say it's up to Dmitry decides if we backport it or no.

firebird-automations commented 15 years ago

Commented by: Calin Pirtea (pcalin)

Thanks Adriano,

It would mean a great deal if this fix could be incorporated in 2.1.3 or at least in a special build for us. Currently it would cost at least one week development effort to re-write our source code so any help would be greatly appreciated.

We are gold sponsors of Firebird for the second time and my boss keep asking me what do we get back from this... It would be great to show him real value for our sponsorship. ;)

I'll wait for Dmitry's response on this.

Cheers, Calin.

firebird-automations commented 15 years ago

Commented by: @hvlad

Calin,

> I disagree with you because if I do not use substring the problem goes away. So, don't use substring - what a problem ? :)

> Not releasing a blob that is obviously not used anymore is called a memory leak no matter the excuse. Its released - just commit transaction.

> Currently it would cost at least one week development effort to re-write our source code. Do want to know how many time it will cost to us to backport and test CORE1658 into 2.1 codebase ?

> We are gold sponsors of Firebird for the second time This is very appreciated. But you do it for yourself, please understand it

> boss keep asking me what do we get back from this Really ? Didn't you got Firebird for free for the few years already ?

I can't said right now if it will be backported into 2.1.4 or someone will create private build for you (try to contact IBP, for example). But i can say that it will not be backported into 2.1.3, too late.

firebird-automations commented 15 years ago

Commented by: @hvlad

Adriano,

I think there is also another way to address this concrete issue.

Make substring aware of destination type and re-implement it in the way to not create temp blob if destination is not a blob. Of course DSQL also must be aware of this change to generate correct descriptors and remove implicit CAST.

Not sure if we must to go this way ;)

firebird-automations commented 15 years ago

Commented by: Calin Pirtea (pcalin)

Hi Vlad,

Please be advised that this simple script causes Firebird 2.1 to crash causing denial of service. DoS.

The whole script can be re-written in a much simpler way and still cause Firebird to crash. I would have thought this is a serious error enough to diserve more attention that say "don't use substring". Any user can logon to the server on _any_ databases or even create a new one and still kill Firebird for all other users.

You don't want to call it memory leak, fine with me, however it is a serious defect.

Cheers, Calin.

firebird-automations commented 15 years ago

Commented by: @hvlad

Calin,

this issue was already addressed and fixed. Just use approproate Firebird version. You can't ? Follow advices you've got here. You don't want ? Sad for you, but what else do you want from us ?

And remember - this is not a support list. Therefore i will not comment your statements such as "Any user can logon to the server"...

firebird-automations commented 15 years ago

Commented by: Calin Pirtea (pcalin)

Hi Adriano, Vlad,

The sad news is that this bug is also present in current 2.5 beta. Just downloaded and tested. :(

So, Vlad, there is no other version of Firebird I can use.

This also means that it is slightly different than CORE1658.

All functions with similar functionality to substring (left, right...) are affected.

Cheers, Calin.

firebird-automations commented 15 years ago

Commented by: @hvlad

I was wrong. You created not 100000 temp blobs. You created 100000 * 100000 temp blobs. Is it the way your application really works ?

In v2.5 when temp blob is closed its buffer is really released. But small structure (128 bytes) stay in memory to not forget this blob completely. To complete your "real life" test Firebird need to allocate 128 * 10**10 bytes or more than 1TB.

I doubt your application behave the same way.

Few numbers. I changed your test_crash2_FB2_1 to make 1 iteration of while loop to create 100K temp blobs. FB 2.1.3 allocated 431 MB before commit and 14.4 MB after commit FB 2.5 allocated 44.4 MB before commit and 15.7 MB after commit

Do you see the difference ?

firebird-automations commented 15 years ago

Commented by: Calin Pirtea (pcalin)

Hi Vlad,

In Firebird 2.1 I can crash it with only 10 iterations. This means 1 million blobs.

There are various reporting requirements that can force us to navigate through over 10 million records and maybe manipulate more than 10 blobs each row in the run which means 100 million temp blobs. This is a reasonable small amount because all our databases are small (under 4GB).

These are real live patient electronic health records and reporting on them has to be done weekly. There is no workaround for us on something that worked perfectly in Firebird 1.5.

Currently we have no option to go back to Firebird 1.5 and apparently we have no option of going to Firebird 2.5 either.

Any manipulation of a blob, where the result is another blob, results in this memory allocation be it UDF or native Firebird function. This means in Firbeird 2 or greater we cannot manipulate blobs anymore.

The test I gave you was merely there to make it easy to reproduce the defect.

I can understand this is hard to fix in Firebird, however it is impossible to work around. :(

Cheers, Calin.

firebird-automations commented 15 years ago

Commented by: @dyemanov

I think CORE1658 should be backported into v2.1.4 anyway. But it's already obvious it won't help you much. Other solutions are perhaps possible but surely they're not going to appear immediately.

firebird-automations commented 15 years ago

Commented by: Calin Pirtea (pcalin)

Thanks Dmitry,

We are going through our SQL rewriting it to use only varchar and no blobs. It is however a big problem when using blobs in Firebird. There will be cases where varchars cannot be used...

Cheers, Calin.

firebird-automations commented 15 years ago

Commented by: Calin Pirtea (pcalin)

Hi Vlad, Adriano,

I was wrong when I said this worked in Firebird 1.5. Firebird 1.5 crashes with "internal error". We must recently had just enough extra data to make it tip over.

IOW this problem has always been around and there are no workarounds at all.

Cheers, Calin.

firebird-automations commented 14 years ago

Commented by: @asfernandes

Fix for CORE1658 is backported to 2.1.4.

I'm lowering the priority of this ticker from Blocker to Major.

firebird-automations commented 14 years ago
Modified by: @asfernandes priority: Blocker \[ 1 \] =\> Major \[ 3 \]
firebird-automations commented 14 years ago
Modified by: @dyemanov Fix Version: 3\.0 Alpha 1 \[ 10331 \]
firebird-automations commented 11 years ago
Modified by: @dyemanov Fix Version: 3\.0 Beta 1 \[ 10332 \] Fix Version: 3\.0 Alpha 1 \[ 10331 \] =\>
firebird-automations commented 10 years ago
Modified by: @dyemanov Fix Version: 3\.0 Beta 2 \[ 10586 \] Fix Version: 3\.0 Beta 1 \[ 10332 \] =\>
firebird-automations commented 9 years ago
Modified by: @dyemanov Fix Version: 3\.0 Beta 2 \[ 10586 \] =\>