commandos59 / h2database

Automatically exported from code.google.com/p/h2database

File id mismatch problem (maybe Corruption) #39

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
I have about 700 installations of my program using the H2 engine (on remote
kiosk PCs) and regularly (but not too often, about 2-7 times a month)
get corruption of unknown cause, or possibly a cryptography failure (with XTEA).
Versions tried: 1.0.77, 1.0.79, 1.1.101 (the last one only for opening the database).

Errors:
1.
org.h2.jdbc.JdbcSQLException: General error: java.lang.RuntimeException:
File ID mismatch got=-1373246757 expected=17 pos=3904 true
org.h2.store.DiskFile:/easysoft/.private/31100.data.db
blockCount:1926853611; SQL statement:
CREATE PRIMARY KEY ON PUBLIC.BILL(ID) [50000-74]

2.
java.lang.RuntimeException: Unexpected code path [50000-101]
at org.h2.store.DiskFile.init(DiskFile.java:415)

Other errors are like above.

I've found that (on 1.0.79 and up for sure) the database gets repaired
when the connection is closed and reopened. But this procedure is sometimes
problematic when connections are managed by a third-party connection
manager (for example a JPA implementation or just some pool).

So I hope that either this bug (if it is a bug) gets fixed, or the database
can be repaired when opening the first connection (not only by reopening).
One such database is attached to this issue.

The attached database uses the connection parameter ';CIPHER=XTEA', username 'sa'
and password '31100 31100'.

Original issue reported on code.google.com by e.lucash@gmail.com on 20 Oct 2008 at 5:41
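For readers unfamiliar with H2 file encryption: when CIPHER is set, H2 reads the file password and the user password from the same password field, separated by a single space, which is why the password above has two parts. The following is a minimal sketch of opening such a database over JDBC. The class name `EncryptedOpen` and the `basePath` argument are illustrative assumptions, not taken from the report, and the H2 driver is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public final class EncryptedOpen {

    /** Builds the "<file password> <user password>" string that H2 expects when CIPHER is set. */
    public static String combinedPassword(String filePassword, String userPassword) {
        // The space is the separator, so the file password itself cannot contain one.
        if (filePassword.indexOf(' ') >= 0) {
            throw new IllegalArgumentException("file password must not contain spaces");
        }
        return filePassword + " " + userPassword;
    }

    /** Opens an XTEA-encrypted database; basePath is a placeholder such as "/somepath/00000". */
    public static Connection open(String basePath, String user,
                                  String filePassword, String userPassword) throws SQLException {
        String url = "jdbc:h2:" + basePath + ";CIPHER=XTEA";
        return DriverManager.getConnection(url, user, combinedPassword(filePassword, userPassword));
    }
}
```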

GoogleCodeExporter commented 8 years ago
Hi,

The database file contains what looks like random data at one position.

I am sorry to say that, but it looks like a corruption problem. I am very
interested in analyzing and solving this problem. Corruption problems have
top priority for me.
I have a few questions:

- Did you use multiple connections?
- Do you use any settings or special features (for example, the setting LOG=0, 
  or two phase commit, linked tables, cache settings)?
- Is the application multi-threaded?
- What operating system, file system, and virtual machine 
  (java -version) do you use?
- Is the database usually closed normally, or is the process terminated forcefully 
  or the computer switched off?
- Is it possible to reproduce this problem using a fresh database 
  (sometimes, or always)?

Regards,
Thomas

Original comment by thomas.t...@gmail.com on 21 Oct 2008 at 7:38

GoogleCodeExporter commented 8 years ago
I'll try to describe my environment in as much detail as I can.
Linux 2.6.19-gentoo-r5
JRE 1.6.0_05-b13

The only connection string parameter is CIPHER=XTEA:
jdbc:h2:/somepath/00000;CIPHER=XTEA

The database was accessed using OpenJPA 1.0.2. Connections were managed by this
JPA implementation, but with the following hint:
openjpa.ConnectionRetainMode=always
which I assume gives single-connection mode. Transactions were set to
resource-local. There is no two-phase commit and no specific cache setting.
I'm not sure what linked tables are (or are they just relational tables that
can be joined on keys?); an example of the schema can be seen in the attached
sample database.

I've tried to route all data access through a single-thread executor. But at
least at initialization, the connection is created in thread 'main'. And the
connection gets closed in a shutdown hook (where the EntityManager and
EntityManagerFactory are closed).

The kiosk PCs run 24/7 (with a JVM restart once a night) and are equipped with
a UPS, so a power-off is unlikely, but possible.

The database is created once on installation, and OpenJPA updates or creates
(when it does not exist) the schema on initialization.

Such a database condition (or corruption, to be clear) has happened at least
11-13 times during the last 3 months. The kiosk PCs use a slow GPRS connection,
so we cannot upload such databases. The attached database was the first one that
technical support staff (from a different city) sent me on my request. Program
errors were observed via central monitoring, but I just couldn't get the
corrupted files until now. Neither I nor the QA team was ever able to reproduce
the corruption during test runs or on systems installed nearby.

I am preparing a new version of the application in which I rewrote data access
using plain JDBC, the org.h2.jdbcx pool, and fully synchronized DAOs with
resource-local transactions, all using serializable isolation. And as a
workaround I open a connection and close it immediately at initialization to
trigger the repair. So I expect more insight into the problem during the next
month, when the new version gets deployed on those ~700 machines.

Original comment by e.lucash@gmail.com on 22 Oct 2008 at 8:28

GoogleCodeExporter commented 8 years ago
Sorry, I don't know yet what the problem could be...

You are using version 1.0.74. The following problem has been solved in version
1.0.75: "Running out of memory could result in incomplete transactions or
corrupted databases." Do you know if your application ran out of memory? How
much memory does your application use, and how much is available?

Do you sometimes use large transactions (many thousand rows)?

> Connection got closed in shutdown hook 

In your test system, could you try shutdown / startup in a loop (in a batch
file or so)? Maybe the problem happens then, and you couldn't reproduce it
because this is not tested (enough).

I am wondering if the problem could be related to encryption. Once you can
reproduce the problem, could you check whether you can reproduce it without
using file encryption?

Regards,
Thomas

Original comment by thomas.t...@gmail.com on 23 Oct 2008 at 12:28
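The open/close loop suggested above could be sketched roughly as follows. This is only an in-process approximation: it closes and reopens the connection inside one JVM, whereas a full shutdown / startup test would wrap the whole process in a batch file. The names `RestartLoop` and `ConnectionSource` are hypothetical; in a real run the source would wrap `DriverManager.getConnection(url, user, password)`.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public final class RestartLoop {

    /** Supplies a fresh connection for every cycle (hypothetical abstraction for this sketch). */
    @FunctionalInterface
    public interface ConnectionSource {
        Connection open() throws SQLException;
    }

    /** Opens, touches and closes the database `cycles` times; returns one message per failed cycle. */
    public static List<String> run(ConnectionSource source, int cycles) {
        List<String> failures = new ArrayList<>();
        for (int i = 0; i < cycles; i++) {
            try (Connection conn = source.open();
                 Statement stat = conn.createStatement()) {
                // A trivial statement; a real test would insert and update rows here.
                stat.execute("SELECT 1");
            } catch (SQLException | RuntimeException e) {
                // Corruption surfaced as RuntimeException in this thread, so both are recorded.
                failures.add("cycle " + i + ": " + e);
            }
        }
        return failures;
    }
}
```

Running the same loop once with and once without ;CIPHER=XTEA in the URL would also cover the encryption question raised above.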

GoogleCodeExporter commented 8 years ago
Hi,

There has been no activity for quite some time. I am closing the bug as
'Invalid' until there is more information available. Please re-open the bug if
you know more!

Regards,
Thomas

Original comment by thomas.t...@gmail.com on 16 Dec 2008 at 8:40

GoogleCodeExporter commented 8 years ago
I'm in the process of long-term monitoring; sorry that there was no activity on
this open issue (sure, no one wants irrelevant issues to hang open).
After getting rid of the JPA implementation and using plain JDBC with the
org.h2.jdbcx pool (all related to version 1.0.79):

1. It is surely not related to encryption. Corruption happened with unencrypted
storage.

SQL state [HY000]; error code [50000]; General error: RuntimeException: File ID
mismatch got=0 expected=22 pos=1344 false
org.h2.store.DiskFile:/easysoft/.private/easysoft.persistence/provisioning-1.2/default.index.db
blockCount:0 [50000-79]; nested exception is org.h2.jdbc.JdbcSQLException:
General error: RuntimeException: File ID mismatch got=0 expected=22 pos=1344 false

mismatch got=74160014 expected=22 pos=1728 false
org.h2.store.DiskFile:/easysoft/.private/easysoft.persistence/business-1.2/23016.index.db
blockCount:-1756040116 [50000-79]; nested exception is org.h2.jdbc.JdbcSQLException:
General error: RuntimeException: File ID mismatch got=74160014 expected=22 pos=1728 false
org.h2.store.DiskFile:/easysoft/.private/easysoft.persistence/business-1.2/23016.index.db
blockCount:-1756040116 [50000-79]

2. Maybe it is related to filesystem problems. I very often get 'Read-only
file system':
org.h2.jdbc.JdbcSQLException: IO Exception: Read-only file system;
/easysoft/.private/easysoft.persistence/business-1.2/14079.25467.temp.db [90031-79]
Caused by: IOException: Read-only file system
 RandomAccessFile.setLength()
 org.h2.util.FileUtils.setLength()
 org.h2.store.fs.FileObjectDisk.setFileLength()

*** I'll reopen the issue if concrete problems are found.

Caused by: org.h2.jdbc.JdbcSQLException: General error: NullPointerException [50000-79]
 org.h2.message.Message.getSQLException()
 org.h2.message.Message.convert()
 org.h2.table.TableData.addRow()
 org.h2.command.dml.Insert.update()
 org.h2.command.CommandContainer.update()
 org.h2.command.Command.executeUpdate()
 org.h2.jdbc.JdbcPreparedStatement.executeUpdateInternal()
 org.h2.jdbc.JdbcPreparedStatement.executeUpdate()
 spring.jdbc.core.JdbcTemplate.2.doInPreparedStatement()
 ... 27 more
Caused by: NullPointerException
 org.h2.store.DataPage.getValueLen()
 org.h2.index.BtreePage.getRowSize()
 org.h2.index.BtreeLeaf.getRealByteCount()
 org.h2.index.BtreePage.getSplitPoint()
 org.h2.index.BtreeLeaf.add()
 org.h2.index.BtreeNode.add()
 org.h2.index.BtreeIndex.add()

Original comment by e.lucash@gmail.com on 18 Dec 2008 at 4:15

GoogleCodeExporter commented 8 years ago
'Read-only file system'

What operating system, file system, and Java version do you use? There are
problems with RandomAccessFile.setLength() on some Linux / flash file systems,
but I thought they were fixed (with a workaround in H2).

If there is a problem with the index file, it can be deleted (indexes are
automatically re-created).

Original comment by thomas.t...@gmail.com on 24 Dec 2008 at 10:34

GoogleCodeExporter commented 8 years ago
As reported earlier, I'm using:
Linux 2.6.19-gentoo-r5
JRE 1.6.0_05-b13
Ext3 filesystem on normal hard drives
with H2 version 1.0.79

Can you elaborate on the workaround for 'Read-only file system'?

The problem with indexes is that a RuntimeException on index corruption (it can
be data corruption as well) causes the current database operation to fail, and
since a kiosk PC has no human operator, the software must analyze the cause of
the failure and, perhaps, delete the indexes and retry the operation.
Can you advise on the best strategy to do such maintenance automatically?
Can such maintenance be abstracted from the application logic entirely (by
wrapping JDBC objects, for example)? Any known pitfalls?

Original comment by e.lucash@gmail.com on 24 Dec 2008 at 7:46
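The first step of the automatic maintenance asked about above, analyzing the cause of a failure, might be sketched as below. This is a heuristic built only on the error text and code seen in this thread (error code 50000, "File ID mismatch", a path ending in .index.db); it is not an H2 API, and the class and method names are hypothetical. What the application does after a positive match (close the database, delete the index file, retry) is left to the caller.

```java
import java.sql.SQLException;

public final class CorruptionCheck {

    // 50000 is the "General error" code that appears in the traces in this thread.
    private static final int GENERAL_ERROR = 50000;
    // Guard against cyclic cause chains.
    private static final int MAX_DEPTH = 20;

    /** True if any exception in the cause chain looks like the "File ID mismatch" error. */
    public static boolean looksLikeFileIdMismatch(Throwable t) {
        return chainMatches(t, "File ID mismatch");
    }

    /** True if a matching exception names an index file, which can be deleted and re-created. */
    public static boolean mentionsIndexFile(Throwable t) {
        return chainMatches(t, ".index.db");
    }

    private static boolean chainMatches(Throwable t, String needle) {
        int depth = 0;
        for (Throwable cur = t; cur != null && depth < MAX_DEPTH; cur = cur.getCause(), depth++) {
            String msg = cur.getMessage();
            if (msg == null || !msg.contains(needle)) {
                continue;
            }
            // For SQLExceptions, additionally require the general error code.
            if (!(cur instanceof SQLException)
                    || ((SQLException) cur).getErrorCode() == GENERAL_ERROR) {
                return true;
            }
        }
        return false;
    }
}
```

Matching on message text is fragile across H2 versions, so a wrapper built on this should log every match rather than silently repair.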

GoogleCodeExporter commented 8 years ago
There may be a way to solve the problem. Could you append ;LOG=2 to the
database URL?
See also: http://www.h2database.com/html/grammar.html#setlog
This should make sure the index file does not get corrupted. I'm not sure if it
will completely solve the problem, however, because of the 'Read-only file
system' exception.

> I'm using as reported earlier
I'm sorry... 

> Can you elaborate on workaround for 'Read-only file system'?

RandomAccessFile.setLength() throws an exception on some Linux systems when
trying to expand the file size. The workaround (already implemented) to expand
a file is to write empty blocks at the end of the file.

However, in your case it looks like the exception happens when trying to shrink
a file (sorry, I didn't see that before). The exception message you get
('Read-only file system') is a bit strange. A possible workaround to shrink a
file is to retry a few times. But I'm not sure whether that would hide the real
problem.

Original comment by thomas.t...@gmail.com on 27 Dec 2008 at 9:38
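The two file-size workarounds described above can be illustrated with plain java.io. This is a sketch of the idea only, not H2's actual implementation: the class name `FileResize`, the 4096-byte block size and the retry count are assumptions made for the example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

public final class FileResize {

    // Arbitrary block size chosen for this sketch.
    private static final int BLOCK_SIZE = 4096;

    /** Grows the file to newLength by appending zero-filled blocks instead of calling setLength(). */
    public static void expandByWriting(RandomAccessFile file, long newLength) throws IOException {
        long pos = file.length();
        if (newLength <= pos) {
            return; // nothing to expand
        }
        byte[] empty = new byte[BLOCK_SIZE];
        file.seek(pos);
        while (pos < newLength) {
            int n = (int) Math.min(BLOCK_SIZE, newLength - pos);
            file.write(empty, 0, n);
            pos += n;
        }
    }

    /** Shrinks the file with setLength(), retrying up to `attempts` times before rethrowing. */
    public static void shrinkWithRetry(RandomAccessFile file, long newLength, int attempts)
            throws IOException {
        if (attempts < 1) {
            throw new IllegalArgumentException("attempts must be at least 1");
        }
        IOException last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                file.setLength(newLength);
                return;
            } catch (IOException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }
}
```

As noted above, retrying may hide the real problem, so the last exception is rethrown rather than swallowed.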

GoogleCodeExporter commented 8 years ago
Thanks for the explanation. I'll try the suggested solutions and report back on
the results after a while.

Original comment by e.lucash@gmail.com on 28 Dec 2008 at 3:04

GoogleCodeExporter commented 8 years ago
If you have time, please review this report (I do not want to open a new issue
as I'm not sure of anything).
I now have a problem on several terminal machines: the database cannot be opened.

Unique index or primary key violation: SYS_ID  ON PUBLIC.SYS(ID) [23001-79]
23001/23001 (Help)
org.h2.jdbc.JdbcSQLException: Unique index or primary key violation: SYS_ID  ON
PUBLIC.SYS(ID) [23001-79]
    at org.h2.message.Message.getSQLException(Message.java:103)
    at org.h2.message.Message.getSQLException(Message.java:114)
    at org.h2.message.Message.getSQLException(Message.java:77)
    at org.h2.index.BaseIndex.getDuplicateKeyException(BaseIndex.java:160)
    at org.h2.index.TreeIndex.add(TreeIndex.java:58)
    at org.h2.table.TableData.addRowsToIndex(TableData.java:278)
    at org.h2.table.TableData.addIndex(TableData.java:210)
    at org.h2.engine.Database.open(Database.java:553)
    at org.h2.engine.Database.<init>(Database.java:210)
    at org.h2.engine.Engine.openSession(Engine.java:57)
    at org.h2.engine.Engine.openSession(Engine.java:126)
    at org.h2.engine.Engine.getSession(Engine.java:109) 

Linux 2.6.19-gentoo-r5
JRE 1.6.0_05-b13
Ext3 filesystem on normal hard drives
with H2 version 1.0.79

I tried opening the database with 1.1.107 and got the same error.
Is this filesystem corruption, or corruption caused by something misbehaving?

Attached is the erroneous database (non-encrypted binary mode, user 'sa',
password is the empty string '').

Original comment by e.lucash@gmail.com on 18 Feb 2009 at 1:49

GoogleCodeExporter commented 8 years ago
It's better to open a new issue. Anyway, I have a few questions:

- Could you send the full stack trace of the exception including message text?
- What is your database URL?
- Do you use Tomcat or another web server? Do you unload or reload the web application?
- You can find out if the database is corrupted by running
    SCRIPT TO 'test.sql'
- What version of H2 are you using?
- Did you use multiple connections?
- The first workaround is: append ;RECOVER=1 to the database URL.
    Does it work when you do this?
- The second workaround is: delete the index.db file
    (it is re-created automatically) and try again. Does it work when you do this?
- The third workaround is: use the tool org.h2.tools.Recover to create
    the SQL script file, and then re-create the database using this script.
    Does it work when you do this?
- With which version of H2 was this database created?
    You can find it out using:
    select * from information_schema.settings where name='CREATE_BUILD'
- Do you use any settings or special features (for example, the setting LOG=0,
    or two phase commit, linked tables, cache settings)?
- Is the application multi-threaded?
- What operating system, file system, and virtual machine
    (java -version) do you use?
- Is it (or was it at some point) a networked file system?
- How big is the database (file sizes)?
- Is the database usually closed normally, or is the process terminated forcefully
    or the computer switched off?
- Is it possible to reproduce this problem using a fresh database
    (sometimes, or always)?
- Are there any other exceptions (maybe in the .trace.db file)?
    Could you send them please?
- Do you still have any .trace.db files, and if yes could you send them?
- Could you send the .data.db file where this exception occurs?

Original comment by thomas.t...@gmail.com on 18 Feb 2009 at 8:07
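The first two workarounds in the list above lend themselves to small helpers. The sketch below appends a setting such as RECOVER=1 (or the earlier LOG=2) to a database URL and removes the `<name>.index.db` file next to the data file. The class and method names are hypothetical, the file naming follows the paths shown in this thread, and deleting the index file only while the database is closed is an assumption of this example.

```java
import java.io.File;

public final class RecoveryHelpers {

    /** Appends ";NAME=value" to an H2 database URL, for example RECOVER=1 or LOG=2. */
    public static String withSetting(String url, String name, String value) {
        return url + ";" + name + "=" + value;
    }

    /** Returns the index file for a database base path such as "/dir/23016". */
    public static File indexFileFor(String basePath) {
        return new File(basePath + ".index.db");
    }

    /**
     * Deletes the index file so that it is re-created on the next open.
     * Assumed to be called only while the database is closed.
     * Returns true if a file was actually removed.
     */
    public static boolean deleteIndexFile(String basePath) {
        File index = indexFileFor(basePath);
        return index.isFile() && index.delete();
    }
}
```

For example, `withSetting("jdbc:h2:/somepath/00000", "RECOVER", "1")` yields `jdbc:h2:/somepath/00000;RECOVER=1`.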

GoogleCodeExporter commented 8 years ago
I am trying to use the backup feature in H2 for the first time, but I can't solve this problem:

12-04 17:54:11 database: opening zip:C:/Program Files/ANET/WCMPYShield/data/data1.zip!/store (build 107)
12-04 17:54:11 index: open existing SYS rows: 135
12-04 17:54:11 index: open existing HIBERNATE_UNIQUE_KEY rows: 0
12-04 17:54:11 index: open existing SITE_TYPE rows: 4
12-04 17:54:11 index: open existing LOOK_UP rows: 11
12-04 17:54:11 index: open existing F_USERS rows: 1
12-04 17:54:11 index: open existing ALWAYSWHITE rows: 0
12-04 17:54:11 index: open existing F_SPECIAL_KEYWORD_FILE rows: 165
12-04 17:54:11 index: open existing F_SPECIAL_WORDS rows: 2291
12-04 17:54:11 index: open existing MAIL rows: 1
12-04 17:54:11 index: open existing F_DAILY_CLOCK rows: 7
12-04 17:54:11 index: open existing F_GROUPS rows: 74
12-04 17:54:11 index: open existing C_IPINTERVAL rows: 0
12-04 17:54:11 index: open existing COMPUTER_GROUP rows: 0
12-04 17:54:11 index: open existing C_TEMPLATE rows: 0
12-04 17:54:11 index: open existing F_USERS_FORBIT rows: 74
12-04 17:54:11 index: open existing MAIL_SETTINGS rows: 21
12-04 17:54:11 index: open existing F_TIMER rows: 168
12-04 17:54:11 index: open existing COMPUTERS rows: 1
12-04 17:54:11 index: open existing F_WORDS rows: 0
12-04 17:54:11 index: open existing F_SITES rows: 0
12-04 17:54:11 index: open existing COM_TEMPLATE rows: 0
12-04 17:54:11 index: open existing F_SETTINGS rows: 6
12-04 17:54:11 index: open existing DEVICE_TYPE rows: 0
12-04 17:54:11 index: open existing PROTOCOL_TABLE rows: 1478
12-04 17:54:11 index: open existing GATEWAY rows: 0
12-04 17:54:11 index: open existing IP_MAC_LOG rows: 1
12-04 17:54:11 index: open existing IP_PORT_PROTOCOL rows: 0
12-04 17:54:11 index: open existing F_LOGS rows: 0
12-04 17:54:11 index: open existing CUSTOMER rows: 0
12-04 17:54:11 index: open existing LISANS_ID rows: 0
12-04 17:54:11 index: open existing LICENSE_CONTROL rows: 0
12-04 17:54:11 index: open existing FULL_METIN rows: 4
12-04 17:54:11 index: open existing DEC_FILE_TABLE rows: 0
12-04 17:54:11 lock: 1 exclusive write lock added for SITE_TYPE
12-04 17:54:11 database: CREATE PRIMARY KEY ON PUBLIC.SITE_TYPE(ID)
12-04 17:54:11 database: opening zip:C:/Program Files/ANET/WCMPYShield/data/data1.zip!/store
org.h2.jdbc.JdbcSQLException: The database is read only; SQL statement:
CREATE PRIMARY KEY ON PUBLIC.SITE_TYPE(ID) [90097-107]
    at org.h2.message.Message.getSQLException(Message.java:103)
    at org.h2.message.Message.getSQLException(Message.java:114)
    at org.h2.message.Message.getSQLException(Message.java:77)
    at org.h2.message.Message.getSQLException(Message.java:149)
    at org.h2.engine.Database.checkWritingAllowed(Database.java:1674)
    at org.h2.store.FileStore.checkWritingAllowed(FileStore.java:166)
    at org.h2.store.FileStore.write(FileStore.java:338)
    at org.h2.store.DiskFile.writeDirectDeleted(DiskFile.java:942)
    at org.h2.store.DiskFile.setPageOwner(DiskFile.java:811)
    at org.h2.store.DiskFile.freePage(DiskFile.java:753)
    at org.h2.store.DiskFile.setUnused(DiskFile.java:698)
    at org.h2.store.DiskFile.truncateStorage(DiskFile.java:1110)
    at org.h2.store.Storage.truncate(Storage.java:353)
    at org.h2.index.BtreeIndex.truncate(BtreeIndex.java:356)
    at org.h2.index.BtreeIndex.<init>(BtreeIndex.java:91)
    at org.h2.table.TableData.addIndex(TableData.java:179)
    at org.h2.command.ddl.CreateIndex.update(CreateIndex.java:90)
    at org.h2.engine.MetaRecord.execute(MetaRecord.java:87)
    at org.h2.engine.Database.open(Database.java:573)
    at org.h2.engine.Database.<init>(Database.java:210)
    at org.h2.engine.Engine.openSession(Engine.java:57)
    at org.h2.engine.Engine.openSession(Engine.java:126)
    at org.h2.engine.Engine.getSession(Engine.java:109)
    at org.h2.engine.SessionFactoryEmbedded.createSession(SessionFactoryEmbedded.java:17)
    at org.h2.engine.SessionRemote.connectEmbeddedOrServer(SessionRemote.java:251)
    at org.h2.engine.SessionRemote.createSession(SessionRemote.java:229)
    at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:111)
    at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:95)
    at org.h2.Driver.connect(Driver.java:58)
    at java.sql.DriverManager.getConnection(Unknown Source)
    at java.sql.DriverManager.getConnection(Unknown Source)
    at h2bacukp.H2DatabaseConnection.getConnection(H2DatabaseConnection.java:13)
    at h2bacukp.Main.main(Main.java:10)
Exception in thread "main" java.lang.NullPointerException
    at h2bacukp.Main.main(Main.java:12)

Original comment by adagdelen25@gmail.com on 4 Dec 2009 at 3:55

GoogleCodeExporter commented 8 years ago
Hi

I don't think that your problem is related to this issue.
Please re-post your problem on the Google Group: 
http://groups.google.com/group/h2-database

Regards,
Thomas

Original comment by thomas.t...@gmail.com on 4 Dec 2009 at 4:01