FirebirdSQL / firebird

Firebird server, client and tools
https://www.firebirdsql.org/
1.24k stars 213 forks source link

Database file grows when Blob variables used in Stored Procedure [CORE5114] #5398

Open firebird-automations opened 8 years ago

firebird-automations commented 8 years ago

Submitted by: Holger Klemt (klemmo)

Attachments: db1.fbk

Votes: 3

Today, I demonstrated to some customer on our Firbeird softwaretraining held in munich some technologies they should use to improve their work and some they should avoid. I know that using blob as variables or params inside a stored procedure have some really big disadvantages, but i was not aware of some part of the result.

the facts: We used the ibexpert demo database and created 10000 records as testdata in table products. To demonstrate the bad way, we added this stored proc: using blob params and manipulate them for every found record. This is slow and needs a lot of memory, basic reasons for this problems are known from my side. Procedure execution time was around 45 seconds even not a single record should be written to disk.

The really bad thing that right after executing this stored proc, the database filesize changed from about 25MB to more than 3GB. The SP does not a single update or insert .... You can also see from statistics that the engine wrote about 190000 pages.

create or alter procedure GETPRODUCTS returns ( RES blob sub_type 1 segment size 80) as declare variable ACTOR blob sub_type 1 segment size 80; begin res=''; for select product.actor from product into :actor do begin res=res||' '||actor; end suspend; end

------------------------------------------------ Prepare : 16,00 ms Execute : 44.906,00 ms Avg fetch time: 44.906,00 ms

Memory ------------------------------------------------ Current: 96.756.864 Max : 106.806.792 Buffers: 5.000

Operations ------------------------------------------------ Read : 85 Writes : 189.558 Fetches: 799.256 Marks : 583.503

Enchanced Info: +-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+ | Table Name | Records | Indexed | Non-Indexed | Updates | Deletes | Inserts | Backouts | Purges | Expunges | | | Total | reads | reads | | | | | | | +-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+ |PRODUCT | 0 | 0 | 10000 | 0 | 0 | 0 | 0 | 0 | 0 | +-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+

Workaround for users

This is a procedure i demonstrated as a workaround, which does basically the same, but collects the data using varchar variables as log as it fits and only copy them to the result when there is no more free space in varchar. The execution takes only 350 ms and does not affect the size of the database, since it does not a single page write (as expected).

Basically the first SP is definitively a valid way to write an SP, but there must be something very strange going on inside Firebird.

The test was made with a firebird 255 on windows 64 and also with firebird 3 rc1, both had exactly the same strange behaviour extending the size of the database to 3GB without any real acceptable reason.

create or alter procedure GETPRODUCTS returns ( RES blob sub_type 1 segment size 80) as declare variable ACTOR varchar(200); declare variable Z varchar(32000); begin res=''; z=''; for select product.actor from product into :actor do begin z=z||' '||actor; if (char_length(z)>31000) then begin res=res||z; z=''; end end if (z<>'') then begin res=res||z; z=''; end suspend; end

Query Time ------------------------------------------------ Prepare : 16,00 ms Execute : 359,00 ms Avg fetch time: 359,00 ms

Memory ------------------------------------------------ Current: 87.044.576 Max : 87.230.256 Buffers: 5.000

Operations ------------------------------------------------ Read : 79 Writes : 0 Fetches: 20.751 Marks : 365

Enchanced Info: +-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+ | Table Name | Records | Indexed | Non-Indexed | Updates | Deletes | Inserts | Backouts | Purges | Expunges | | | Total | reads | reads | | | | | | | +-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+ |PRODUCT | 0 | 0 | 10000 | 0 | 0 | 0 | 0 | 0 | 0 | +-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+

firebird-automations commented 8 years ago

Commented by: Holger Klemt (klemmo)

attached the fb25 backup file i used for the test

firebird-automations commented 8 years ago
Modified by: Holger Klemt (klemmo) Attachment: db1\.fbk \[ 12905 \]
firebird-automations commented 8 years ago

Commented by: Sean Leyne (seanleyne)

Holger,

IIRC, blobs don't really have an in-memory only representation, they always have an on-disk footprint. Further, when a blob is revised, a new object is actually allocated, the existing object is never re-used. The result is that each intermediate blob which is created, currently, is written to disk. Which is why the database size explodes.

The 2nd SP does not change the database size, since no single return value ever got larger than VARCHAR(32000), so no intermediate Blob page was created. (the engine recognized that a VARCHAR converted to a blob does not need intermediate storage since it has a max size)

firebird-automations commented 8 years ago
Modified by: Sean Leyne (seanleyne) description: today i demonstrated to some customer on our Firbeird softwaretraining held in munich some technologies they should use to improve their work and some they should avoid\. I know that using blob as variables or params inside a stored procedure have some really big disadvantages, but i was not aware of some part of the result\. the facts: We used the ibexpert demo database and created 10000 records as testdata in table products\. To demonstrate the bad way, we added this stored proc: using blob params and manipulate them for every found record\. This is slow and needs a lot of memry, basic reasons for this problems are know from my side Procedure execution time was around 45 seconds even not a single record should be written to disk\. The really bad thing that right after executing this stored proc, the database filesize changed from about 25MB to more than 3GB\. The SP does not a single update or insert \.\.\.\. You can also see from statistics that firebird wrote about 190000 pages\. create or alter procedure GETPRODUCTS returns \( RES blob sub\_type 1 segment size 80\) as declare variable ACTOR blob sub\_type 1 segment size 80; begin res=''; for select product\.actor from product into :actor do begin res=res\|\|' '\|\|actor; end suspend; end \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Prepare : 16,00 ms Execute : 44\.906,00 ms Avg fetch time: 44\.906,00 ms Memory \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Current: 96\.756\.864 Max : 106\.806\.792 Buffers: 5\.000 Operations \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Read : 85 Writes : 189\.558 Fetches: 799\.256 Marks : 583\.503 Enchanced Info: \+\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+ \| Table Name \| Records \| Indexed \| Non\-Indexed \| Updates \| Deletes \| Inserts \| Backouts \| Purges \| Expunges \| \| \| Total \| reads \| reads \| \| \| \| \| \| \| \+\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+ \|PRODUCT \| 0 \| 0 \| 10000 \| 0 \| 0 \| 0 \| 0 \| 0 \| 0 \| \+\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+ Workaround for users this is a procedure i demonstrated as a workaround, which does basically the same, but collects the data using varchar variables as log as it fits and only copy them to the result when there is no more free space in varchar\. The execution takes only 350 ms and does not affect the size of the database, since it does not a single page write \(as expected\)\. basically the first stored proc is definitly a valid way to write a sp, but there must be something very strange going on inside Firebird\. The test was made with a firebird 255 on windows 64 and also with firebird 3 rc1, both had exactly the same strange behaviour extending the size of the database to 3GB without any real acceptable reason\. create or alter procedure GETPRODUCTS returns \( RES blob sub\_type 1 segment size 80\) as declare variable ACTOR varchar\(200\); declare variable Z varchar\(32000\); begin res=''; z=''; for select product\.actor from product into :actor do begin z=z\|\|' '\|\|actor; if \(char\_length\(z\)\>31000\) then begin res=res\|\|z; z=''; end end if \(z<\>''\) then begin res=res\|\|z; z=''; end suspend; end Query Time \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Prepare : 16,00 ms Execute : 359,00 ms Avg fetch time: 359,00 ms Memory \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Current: 87\.044\.576 Max : 87\.230\.256 Buffers: 5\.000 Operations \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Read : 79 Writes : 0 Fetches: 20\.751 Marks : 365 Enchanced Info: \+\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+ \| Table Name \| Records \| Indexed \| Non\-Indexed \| Updates \| Deletes \| Inserts \| Backouts \| Purges \| Expunges \| \| \| Total \| reads \| reads \| \| \| \| \| \| \| \+\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+ \|PRODUCT \| 0 \| 0 \| 10000 \| 0 \| 0 \| 0 \| 0 \| 0 \| 0 \| \+\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+ =\> Today, I demonstrated to some customer on our Firbeird softwaretraining held in munich some technologies they should use to improve their work and some they should avoid\. I know that using blob as variables or params inside a stored procedure have some really big disadvantages, but i was not aware of some part of the result\. the facts: We used the ibexpert demo database and created 10000 records as testdata in table products\. To demonstrate the bad way, we added this stored proc: using blob params and manipulate them for every found record\. This is slow and needs a lot of memory, basic reasons for this problems are known from my side\. Procedure execution time was around 45 seconds even not a single record should be written to disk\. The really bad thing that right after executing this stored proc, the database filesize changed from about 25MB to more than 3GB\. The SP does not a single update or insert \.\.\.\. You can also see from statistics that the engine wrote about 190000 pages\. create or alter procedure GETPRODUCTS returns \( RES blob sub\_type 1 segment size 80\) as declare variable ACTOR blob sub\_type 1 segment size 80; begin res=''; for select product\.actor from product into :actor do begin res=res\|\|' '\|\|actor; end suspend; end \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Prepare : 16,00 ms Execute : 44\.906,00 ms Avg fetch time: 44\.906,00 ms Memory \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Current: 96\.756\.864 Max : 106\.806\.792 Buffers: 5\.000 Operations \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Read : 85 Writes : 189\.558 Fetches: 799\.256 Marks : 583\.503 Enchanced Info: \+\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+ \| Table Name \| Records \| Indexed \| Non\-Indexed \| Updates \| Deletes \| Inserts \| Backouts \| Purges \| Expunges \| \| \| Total \| reads \| reads \| \| \| \| \| \| \| \+\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+ \|PRODUCT \| 0 \| 0 \| 10000 \| 0 \| 0 \| 0 \| 0 \| 0 \| 0 \| \+\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+ Workaround for users This is a procedure i demonstrated as a workaround, which does basically the same, but collects the data using varchar variables as log as it fits and only copy them to the result when there is no more free space in varchar\. The execution takes only 350 ms and does not affect the size of the database, since it does not a single page write \(as expected\)\. Basically the first SP is definitively a valid way to write an SP, but there must be something very strange going on inside Firebird\. The test was made with a firebird 255 on windows 64 and also with firebird 3 rc1, both had exactly the same strange behaviour extending the size of the database to 3GB without any real acceptable reason\. create or alter procedure GETPRODUCTS returns \( RES blob sub\_type 1 segment size 80\) as declare variable ACTOR varchar\(200\); declare variable Z varchar\(32000\); begin res=''; z=''; for select product\.actor from product into :actor do begin z=z\|\|' '\|\|actor; if \(char\_length\(z\)\>31000\) then begin res=res\|\|z; z=''; end end if \(z<\>''\) then begin res=res\|\|z; z=''; end suspend; end Query Time \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Prepare : 16,00 ms Execute : 359,00 ms Avg fetch time: 359,00 ms Memory \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Current: 87\.044\.576 Max : 87\.230\.256 Buffers: 5\.000 Operations \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Read : 79 Writes : 0 Fetches: 20\.751 Marks : 365 Enchanced Info: \+\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+ \| Table Name \| Records \| Indexed \| Non\-Indexed \| Updates \| Deletes \| Inserts \| Backouts \| Purges \| Expunges \| \| \| Total \| reads \| reads \| \| \| \| \| \| \| \+\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+ \|PRODUCT \| 0 \| 0 \| 10000 \| 0 \| 0 \| 0 \| 0 \| 0 \| 0 \| \+\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+\-\-\-\-\-\-\-\-\-\-\+ summary: Strange Stored Proc Behaviour with Blob Blob Variables =\> Database file grows when Blob variables used in Stored Procedure
firebird-automations commented 8 years ago

Commented by: @asfernandes

We need something like Oracle's DBMS_LOB (https://docs.oracle.com/cd/B28359_01/appdev.111/b28419/d_lob.htm) to manipulate blobs (read, write functions) by blob id.

Fortunately, it's possible to write these function with the new external routines API.

Blob garbage collection should also be implemented, but it will never be fast than a good use of the blob api.

firebird-automations commented 8 years ago

Commented by: @sim1984

Holger Klemt,

When using the LIST function DB file will not grow so fast.

create or alter procedure GETPRODUCTS returns ( RES blob sub_type 1 segment size 80) as begin select LIST(product.actor, ' ') from product into :RES; suspend; end

firebird-automations commented 8 years ago

Commented by: Holger Klemt (klemmo)

yes, i know, list would be one way for a workaround, but when you generate for example csv, json or xml stuff using a stored procedure, you need some more logic. My workaround using varchars and put only intermediate results into the blob also works fine

i would perhaps vote for a kind of new memory memory blob variable which allow sub_type text operations and could be used to copy final results to a standard blob. This could be perhaps a good solution for similar problem, when this new memory blob variable always using the current instance insted of creating new versions with every change on disk. to avoid problems it could be defined in a varchar like syntax that simply allows for example also more than 32k of data, for example

declare variable mblob(100000) to have space for 100000 characters.

This is just an idea

NilsHaase commented 9 months ago

We have also found the phenomenon described by Mr. Klemt in the current version.

We create EDI data directly in the database and need a bit of logic for this. With a pure Varchar field, we would not be able to cope with the character limits. That's why we switched directly to the blobs.

The solution is now different. If something improves here in the future, we would be very pleased.

hvlad commented 9 months ago

Consider to use new BLOB_APPEND function (since v4.0.2) https://firebirdsql.org/file/documentation/release_notes/html/en/4_0/rlsnotes40.html#rnfb40-dml-new-blob-append

and\or new BLOB_UTIL package (since v5) https://firebirdsql.org/file/documentation/release_notes/html/en/5_0/rlsnotes50.html#rnfb50-engine-blob-util

NilsHaase commented 9 months ago

ok thanks I will try it out.