Closed GoogleCodeExporter closed 9 years ago
My experience is the same.. we fill the database programatically from our
primary data storage, which is SVN.
The PageStore DB was 27.9 GB big, while the MVStore uses more than 300 GB.
Moreover, when shutting down the system and doing full compact of the database,
the DB finally grew to more than 470 GB which resulted in occupying all the
available disk space.
We used H2 initially our own compiled version from sources of 2014-06-12 and
this week we switched to 1.4.179. Tests were done on CentOS 6.4 x64.
I also did local tests with another repository on my Windows 7 x64 machine and
this time the Page store was 414 MB (stayed 414 MB after compacting), while
MVStore used about 1GB after filling the DB and grew to 1.9 GB after compacting.
Original comment by radek.kr...@gmail.com
on 24 Jun 2014 at 8:57
How did you compact, did you use "shutdown compact" or "shutdown defrag"?
Currently, with version 1.4.179, the statement "shutdown defrag" should reduce
the database file size. If it doesn't possibly the problem is LOB data
(repeated binaries). How large is the file if you compress the database (using
zip)? Could you send me the (compressed) database file?
Original comment by thomas.t...@gmail.com
on 24 Jun 2014 at 9:05
We use "shutdown compact" command to reduce the DB size. I'm gonna do some
further tests using "shutdown defrag" later this week.
The compressed 300 GB file was about 41 GB (which is still much moree than the
PageStore).
With the other repository I used locally, the compacted MVStore was 352 MB (1.9
GB originally). Just out of curiosity, I compacted also the PageStore resulting
in 34MB archive (414MB originally). Both repositories are unfortunately
production ones, so I cannot share those. But I'll try to reproduce the
problems on some test data and provide some test DB later.
Original comment by radek.kr...@gmail.com
on 24 Jun 2014 at 9:22
[deleted comment]
[deleted comment]
[Sorry, I originally wrote "shutdown compact" below, now fixed; and another
typo fixed now]
OK. The file size issue is a high priority for me. Only "shutdown defrag" will
*really* reduce the file size currently (as of 1.4.179), and "shutdown compact"
will not help much (only about 16 MB at a time). Could you tell me:
- how large is the database file before running "shutdown defrag",
both real size and compressed size (with zip compression)
- how long did it take to run "shutdown defrag"
- how large is the database file after running "shutdown defrag",
both real size and compressed size (with zip compression)
This data alone will help quite a lot to set priorities.
Original comment by thomas.t...@gmail.com
on 24 Jun 2014 at 9:32
Another issue I need to work on is to not let the database file get that big of
course. But performance and effectiveness of closing a database, "shutdown
compact" and "shutdown defrag" are also important.
Original comment by thomas.t...@gmail.com
on 24 Jun 2014 at 9:34
OK. Did some more tests and measurements on my local machine(Windows 7 x64,
Core i7, 8GB RAM, SDD) with the smaller repository. It seems that the defrag
works fine, just that the space is not reused properly during runtime.
MVStore before defrag is 1.11 GB (199 MB zipped).
MVStore after defrag is 339 MB (49 MB zipped).
Shutdown of our whole platform took about 15s including the DB defrag, so my
guess would be that the defrag did not take more than 10 seconds.
Original comment by radek.kr...@gmail.com
on 24 Jun 2014 at 1:27
> It seems that the defrag works fine, just that the space is not reused
properly during runtime.
Yes. For version 1.4.179, I mainly worked on "shutdown defrag". I think it's
now relatively fast (I hope to make it twice as fast, but that's not urgent I
guess).
More important right now is that disk space is re-used earlier while running
the process. I do like to keep the 45 seconds delay to re-used disk space,
(even thought, that might need to be changed at some point as well), to ensure
a database can't get corrupt after a power failure. But I think the problem is
that chunks with usage of less than about 10% should be re-used a lot earlier.
There seem to be far too many chunks with 1-5% usage during normal operation.
> MVStore after defrag is 339 MB (49 MB zipped).
That might mean binaries are stored multiple times. Do you store a lot of
(similar) binaries?
Original comment by thomas.t...@gmail.com
on 24 Jun 2014 at 1:41
My guess is that the high compression ratio is caused by the number of
redundant varchars in our DB, e.g. our biggest table in production repository
has some 13 mio records and contains 8 columns that store URIs of objects which
are typically quite similar and 70-80 characters long.
btw. These numbers "339 MB (49 MB zipped)" are similar to PageStore on the same
repository.
At the moment I'm trying to defrag the production database on a test system, so
will let you know these results later.
Original comment by radek.kr...@gmail.com
on 25 Jun 2014 at 7:51
Here are the measurements from my test system with production data (CentOS 6.4
x64, Core i7 920, 12GB RAM, SW RAID). These are even more interesting.
MVStore before defrag is 302.4 GB (39.7 GB zipped).
MVStore after defrag is 20.6 GB (2.3 GB zipped).
DB defrag took 257 minutes. The HDDs on the server are quite slow and there's
not much RAM left for OS caches, so this defrag performance can be expected.
Original comment by radek.kr...@gmail.com
on 25 Jun 2014 at 12:12
New measurements with version 1.4.180.
MVStore before defrag is 128.24 GB (16.47 GB zipped).
MVStore after defrag is 17.43 GB (1.44 GB zipped).
DB defrag of this database took 37 minutes.
Original comment by radek.kr...@gmail.com
on 23 Jul 2014 at 1:25
I think this is fixed now (as of today). Could you test with the very latest
trunk version of H2? This is http://www.h2database.com/automated/h2-latest.jar
- or build H2 yourself from the latest source code.
Original comment by thomas.t...@gmail.com
on 23 Jul 2014 at 6:59
Please note the latest (snapshot) version is not yet stable. I'm working on
that.
Original comment by thomas.t...@gmail.com
on 24 Jul 2014 at 10:18
It seems the issue is not yet fixed: I tried with today's "h2-latest.jar",
running a 60MB database create script:
java -cp .;h2-latest.jar org.h2.tools.RunScript -url
jdbc:h2:./oldFormat;MV_STORE=FALSE;MVCC=FALSE -script setup-database.sql
... resulted in a 41MB file oldFormat.h2.db
java -cp .;h2-latest.jar org.h2.tools.RunScript -url jdbc:h2:./newFormat
-script setup-database.sql
... resulted in a 317MB file newFormat.mv.db
Should I upload the setup-database.sql file (6MB 7zipped)?
Original comment by u.g.diet...@gmail.com
on 21 Sep 2014 at 6:00
> Should I upload the setup-database.sql file (6MB 7zipped)?
Yes, that would be nice. If uploading fails, send it to me via email.
Original comment by thomas.t...@gmail.com
on 21 Sep 2014 at 7:17
Attached the database setup file for the example above, run on Windows 7 / Java
1.7.0_67.
Original comment by u.g.diet...@gmail.com
on 21 Sep 2014 at 10:24
Attachments:
Any news here? We run into the same problem.
Inserting 1 mio entries resulted in a db file with a size of 2.6GB.
Exporting that table into a CSV (using CSVWriter) resulted in a file with about
40MB.
Any ideas how to solve that issue? Is this related?
Original comment by js.cedar...@gmail.com
on 2 Dec 2014 at 3:16
I'm still working on this issue. One problem is indexes. If you create an index
before adding data, then more disk space is used. Could you check if creating
the indexes later on saves space? Of course it shouldn't matter much when
indexes are created, and I hope in the near future I will be able to solve this
problem. But right now, this is a possible workaround.
Original comment by thomas.t...@gmail.com
on 2 Dec 2014 at 3:27
We played around. One thing that works for us: We disabled the MVStore...
Original comment by js.cedar...@gmail.com
on 3 Dec 2014 at 11:06
As spatial index has been solved. We are waiting the next release of H2 to give
new feedback about MVStore.
Original comment by nico.de...@gmail.com
on 3 Dec 2014 at 4:08
I consider this problem fixed now with version 1.4.184, even thought there is
still room for improvement.
Original comment by thomas.t...@gmail.com
on 14 Jan 2015 at 6:51
Re-checked with h2-1.4.185 and the database setup file from above:
* old format (with ;MV_STORE=FALSE;MVCC=FALSE) size is 40MB (as before)
* new format size is 86MB (much better, yet still not quite satisfactory)
* new format, create indexes *after* inserting the data, size is 45MB (that is
good)
* new format, create indexes, then insert data, then drop indexes and re-create
them, size is 115MB (that's ugly)
* new format, create indexes, then insert data, then drop indexes and re-create
them, the call SHUTDOWN COMPACT: size is 141MB (that's worse)
The bottom line is, I'll stick to the old format.
Original comment by u.g.diet...@gmail.com
on 9 Feb 2015 at 7:47
Hi
The problem might be the high retention time. How fast is your test case? The
default retention time is 45 seconds, that means space is only re-used if it
was unused for 45 seconds. The retention time can be changed, see
http://h2database.com/html/grammar.html#set_retention_time
If this is not the problem, could you post a test case please?
Original comment by thomas.t...@gmail.com
on 12 Feb 2015 at 6:52
Original issue reported on code.google.com by
nico.de...@gmail.com
on 22 Jan 2014 at 4:40