sqlcipher / android-database-sqlcipher

Android SQLite API based on SQLCipher
https://www.zetetic.net/sqlcipher/sqlcipher-for-android/
Other
2.73k stars 564 forks source link

net.sqlcipher.database.SQLiteException: not an error: BEGIN EXCLUSIVE; #602

Open linchengchan opened 1 year ago

linchengchan commented 1 year ago

Expected Behavior

Want to known which scenes will occurrenced. So i can do something.

Actual Behavior

My project is an multipule processes application. Used android-database-sqlcipher(4.5.1) and androidx.room(2.2.0). Each process open an database on singleInstance. This crash was occurrenced by stability monkey test, and on firebase also has few this crash.I can't reproduce this crash by normal operation. Our JournalMode set JournalMode.TRUNCATE forced. On our code,we retry beginTransaction when accept SQLiteDatabaseException(message is startwith database is locked).

boolean retry; int tryCount = 0; do { retry = false; try { if (withListener) { mDelegate.beginTransactionWithListener(transactionListener); } else { mDelegate.beginTransaction(); } if (DEBUG && tryCount > 0) { Log.d(TAG, "beginTransactionSafely()-> tryCount = " + tryCount); } } catch (SQLiteException exception) { String message = exception.getMessage(); if (message != null && message.startsWith("database is locked")) { if (DEBUG) { Log.d(TAG, "beginTransactionSafely()-> " + message); } long sleepTime = Math.min(160, (tryCount + 1) * 20); SystemClock.sleep(sleepTime); tryCount++; retry = true; } else { throw exception; } } } while (retry);

Steps to Reproduce

I query sqlite doc how problem occur.But nothing get. Also i read sqlite source code, but i can't understander all code. Only i can known is sqlite3_step() return SQLite_OK but not SQLite_Done. I don't known why return SQLite_OK and which scenes occurrence.

SQLCipher version (can be identified by executing PRAGMA cipher_version;): 4.5.1 community SQLCipher for Android version: 4.5.1

Crash Stack

Build Info: 'unknown:alps-mp-q0.mp1-V3_reallytek.q0mp1.k61v1.64.bsp_P93:mt6761:S01,TECNO/H624/TECNO-KB8:10/QP1A.190711.020/BNPQ-201230V852:user/release-keys'
MSSI Info: 'TECNO/H624/TECNO-KB8:10/QP1A.190711.020/BNPQ-201230V852:user/release-keys'
Exception Log Time:[Tue Oct 11 13:43:41 WAT 2022] [7593.371736]

Exception Class: Java (JE)
Exception Type: system_app_crash

Current Executing Process: 
net.bat.store
net.bat.store v2911 (V5.1.39.11)

Trigger time:[2022-10-11 13:43:41.357436] pid:11587

Backtrace: 
Process: net.bat.store
PID: 11587
UID: 10064
Flags: 0x38d83ec5
Package: net.bat.store v2911 (V5.1.39.11)
Foreground: Yes
Build: TECNO/H624/TECNO-KB8:10/QP1A.190711.020/BNPQ-201230V852:user/release-keys

net.sqlcipher.database.SQLiteException: not an error: BEGIN EXCLUSIVE;
    at net.sqlcipher.database.SQLiteDatabase.native_execSQL(Native Method)
    at net.sqlcipher.database.SQLiteDatabase.execSQL(SourceFile:4)
    at net.sqlcipher.database.SQLiteDatabase.beginTransactionWithListenerInternal(SourceFile:8)
    at net.sqlcipher.database.SQLiteDatabase.beginTransactionWithListener(SourceFile:1)
    at net.sqlcipher.database.SQLiteDatabase.beginTransaction(SourceFile:1)
    at net.bat.store.datamanager.db.AhaSupportSQLiteDatabase.beginTransactionSafely(SourceFile:2)
    at net.bat.store.datamanager.db.AhaSupportSQLiteDatabase.beginTransaction(SourceFile:1)
    at androidx.room.RoomDatabase.beginTransaction(SourceFile:4)
    at net.bat.store.datamanager.dao.GameDao_Impl.insertOrIgnoreGame(SourceFile:2)
    at net.bat.store.datamanager.server.GameDaoImpl.insertOrIgnoreGame(SourceFile:1)
    at net.bat.store.datamanager.client.proxy.GameDaoProxy.insertOrIgnoreGame(SourceFile:1)
    at net.bat.store.runtime.util.OpenGame$1.run(SourceFile:1)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
    at net.bat.store.thread.BaseThreadPoolExecutor$BaseThreadFactory.lambda$newThread$0(SourceFile:2)
    at net.bat.store.thread.-$$Lambda$BaseThreadPoolExecutor$BaseThreadFactory$lEX5z4jVWLqetPCsXEvyDjhH5Vg.run(Unknown Source:2)
    at java.lang.Thread.run(Thread.java:919)

MyCode

GameDao_Impl.java
public void insertOrIgnoreGame(final GameTable gameTable) {
    __db.assertNotSuspendingTransaction();
    __db.beginTransaction();
    try {
      __insertionAdapterOfGameTable_1.insert(gameTable);
      __db.setTransactionSuccessful();
    } finally {
      __db.endTransaction();
    }
  }
AhaSupportSQLiteDatabase.java
    public void beginTransaction() {
        beginTransactionSafely(false, null);
    }

    private void beginTransactionSafely(boolean withListener, SQLiteTransactionListener transactionListener) {
        boolean retry;
        int tryCount = 0;
        do {
            retry = false;
            try {
                if (withListener) {
                    mDelegate.beginTransactionWithListener(transactionListener);
                } else {
                    mDelegate.beginTransaction();
                }
                if (DEBUG && tryCount > 0) {
                    Log.d(TAG, "beginTransactionSafely()-> tryCount = " + tryCount);
                }
            } catch (SQLiteException exception) {
                String message = exception.getMessage();
                if (message != null && message.startsWith("database is locked")) {
                    if (DEBUG) {
                        Log.d(TAG, "beginTransactionSafely()-> " + message);
                    }
                    long sleepTime = Math.min(160, (tryCount + 1) * 20);
                    SystemClock.sleep(sleepTime);
                    tryCount++;
                    retry = true;
                } else {
                    throw exception;
                }
            }
        } while (retry);
    }

I can provider all stability log if you need!

developernotes commented 1 year ago

Hi @linchengchan

Thank you for reaching out and sharing some details of the situation, that is helpful. Can you confirm whether you are using the legacy edition of SQLCipher for Android ^1, or the new version of SQLCipher for Android ^2? Our initial suggestion would be to set a busy_timeout ^3 to some value such as 5000 ms to see if that alleviates some of the issue, this will allow your application to handle both locked and busy states. In doing this, you could remove retry logic from the Java source.

linchengchan commented 1 year ago

Thank you for your reply @developernotes

  1. we used legcy edition of SQLCipher for Android. New version it's my first time i known on your reply.
  2. I will try use busy_timeout and take an stability monkey test. After has monkey test result i will response on this issue. Thank you for your suggestion again.

    By the way

    Another question bothers me. When endTransaction may happy COMMIT fail, I don't know if it is necessary to deal with when catch SQLiteException, will sqlite auto rollback when commit fail.

    crash stack

    Fatal Exception: net.sqlcipher.database.SQLiteDiskIOException: disk I/O error: COMMIT;
       at net.sqlcipher.database.SQLiteDatabase.native_execSQL(SourceFile)
       at net.sqlcipher.database.SQLiteDatabase.execSQL(SourceFile:4)
       at net.sqlcipher.database.SQLiteDatabase.endTransaction(SourceFile:15)
       at net.bat.store.datamanager.db.AhaSupportSQLiteDatabase.endTransaction(SourceFile:1)
       at androidx.room.RoomDatabase.endTransaction(SourceFile:1)
       at net.bat.store.datamanager.dao.GameDao_Impl.insertOrReplaceGameList(SourceFile:5)
       at net.bat.store.datamanager.server.GameDaoImpl.insertOrReplaceGameList(SourceFile:1)
       at net.bat.store.datamanager.client.proxy.GameDaoProxy.insertOrReplaceGameList(SourceFile:1)
       at net.bat.store.repo.impl.BaseRecommendGameRepo$2.run(SourceFile:12)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1162)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:636)
       at net.bat.store.thread.BaseThreadPoolExecutor$BaseThreadFactory.lambda$newThread$0(SourceFile:2)
       at net.bat.store.thread.-$$Lambda$BaseThreadPoolExecutor$BaseThreadFactory$lEX5z4jVWLqetPCsXEvyDjhH5Vg.run(:2)
       at java.lang.Thread.run(Thread.java:764)

    My Code

    GameDao_Impl.java
    public void insertOrIgnoreGameList(final List<GameTable> list) {
    __db.assertNotSuspendingTransaction();
    __db.beginTransaction();
    try {
      __insertionAdapterOfGameTable_1.insert(list);
      __db.setTransactionSuccessful();
    } finally {
      __db.endTransaction();
    }
    }

    In my opinion, this happy nothing to do with lib cipher and sqlite. But about user phone state and system running state. From an application point of view, we also no wish crash for user, so we want to deal with some compatibility handle. Did i need rollback manual?Or there is a better solution?

developernotes commented 1 year ago

Hi @linchengchan

Using the following approach is suggested:

db.beginTransaction();
try {
  // various SQL operations
  db.setTransactionSuccessful();
}
finally {
  db.endTransaction();
}

The finally block will always occur, performing a rollback of the transaction. You may optionally include a catch clause.

linchengchan commented 1 year ago

Thank you for your reply @developernotes I don't quite understand what you mean.

  1. db.beginTransaction();
    try {
    // various SQL operations
    db.setTransactionSuccessful();
    }
    finally {
    try{
    db.endTransaction();
    }catch(SQLiteException ignore){
    // nothing need to do
    }
    }
  2. db.beginTransaction();
    try {
    // various SQL operations
    db.setTransactionSuccessful();
    }
    finally {
    try{
    db.endTransaction();
    }catch(SQLiteException exception){
    db.execSQL("ROLLBACK;");
    }
    }

    Which one did you say?

developernotes commented 1 year ago

Hi @linchengchan

Are you concerned with the call to endTransaction() throwing an exception? endTransaction() performs the COMMIT; or ROLLBACK; operation.

linchengchan commented 1 year ago

Hi @developernotes Yes, actually we meet this crash.The quote is real happened. The question I ask is for this kind of crash.

Thank you for your reply @developernotes

  1. we used legcy edition of SQLCipher for Android. New version it's my first time i known on your reply.
  2. I will try use busy_timeout and take an stability monkey test. After has monkey test result i will response on this issue. Thank you for your suggestion again.

By the way

Another question bothers me. When endTransaction may happy COMMIT fail, I don't know if it is necessary to deal with when catch SQLiteException, will sqlite auto rollback when commit fail.

crash stack

Fatal Exception: net.sqlcipher.database.SQLiteDiskIOException: disk I/O error: COMMIT;
       at net.sqlcipher.database.SQLiteDatabase.native_execSQL(SourceFile)
       at net.sqlcipher.database.SQLiteDatabase.execSQL(SourceFile:4)
       at net.sqlcipher.database.SQLiteDatabase.endTransaction(SourceFile:15)
       at net.bat.store.datamanager.db.AhaSupportSQLiteDatabase.endTransaction(SourceFile:1)
       at androidx.room.RoomDatabase.endTransaction(SourceFile:1)
       at net.bat.store.datamanager.dao.GameDao_Impl.insertOrReplaceGameList(SourceFile:5)
       at net.bat.store.datamanager.server.GameDaoImpl.insertOrReplaceGameList(SourceFile:1)
       at net.bat.store.datamanager.client.proxy.GameDaoProxy.insertOrReplaceGameList(SourceFile:1)
       at net.bat.store.repo.impl.BaseRecommendGameRepo$2.run(SourceFile:12)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1162)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:636)
       at net.bat.store.thread.BaseThreadPoolExecutor$BaseThreadFactory.lambda$newThread$0(SourceFile:2)
       at net.bat.store.thread.-$$Lambda$BaseThreadPoolExecutor$BaseThreadFactory$lEX5z4jVWLqetPCsXEvyDjhH5Vg.run(:2)
       at java.lang.Thread.run(Thread.java:764)

My Code

GameDao_Impl.java
  public void insertOrIgnoreGameList(final List<GameTable> list) {
    __db.assertNotSuspendingTransaction();
    __db.beginTransaction();
    try {
      __insertionAdapterOfGameTable_1.insert(list);
      __db.setTransactionSuccessful();
    } finally {
      __db.endTransaction();
    }
  }

In my opinion, this happy nothing to do with lib cipher and sqlite. But about user phone state and system running state. From an application point of view, we also no wish crash for user, so we want to deal with some compatibility handle. Did i need rollback manual?Or there is a better solution?

developernotes commented 1 year ago

Hi @linchengchan

I'm sorry, I misunderstood your follow-up. The disk I/O error on commit is certainly an unexpected behavior. While you might consider the WAL journal mode, because you were unable to commit the transaction due to a disk error, issuing a ROLLBACK would be unnecessary, as the commit was never successful. In addition, if the journal file remains present on disk following the disk error, when SQLCipher first accesses the database it will determine a journal file is present and attempt to write the contents of the journal into the database for you, deleting the journal (or truncating it in your case).

You may wish to report an error to the user in the event a database transaction cannot complete due to a disk error.

linchengchan commented 1 year ago

Thanks for your patient reply! @developernotes I will try you suggestion and test. Thank you again!

linchengchan commented 1 year ago

Hi @developernotes As your suggestion. We set busy_timeout to 10000ms on SQLiteDatabaseHook.postKey. On my stability test, we also meet net.sqlcipher.database.SQLiteException: not an error: BEGIN EXCLUSIVE; I can't understand why message is not an error? On sqlite3.c source code, "not an error" mean SQLITE_OK.

Keep used android-database-sqlcipher(4.5.1) and androidx.room(2.2.0).

Set busy_timeout code

    public void postKey(SQLiteDatabase database) {
        Log.d(TAG, "postKey()-> ");
        boolean usable;
        try (Cursor cursor = database.rawQuery("select count(*) from sqlite_master;", new String[]{})){
            if (cursor != null) {
                cursor.moveToFirst();
                cursor.getInt(0);
            }
            usable=cursor!=null;
        } catch (RuntimeException e) {
            if (DEBUG) {
                Log.d(TAG, "postKey()-> ", e);
            }
            usable=false;
        }
        if (usable){
            settings(database);
        }

    }

    static void settings(SQLiteDatabase database) {
        queryPragmaSettings(database, "SELECT sqlite_version();");
        if (DEBUG) {
            queryPragmaSettings(database, "PRAGMA busy_timeout;");
        }
        if (DEBUG) {
            Log.d(TAG, "postKey()-> set timeout to 10s");
        }
        queryPragmaSettings(database, "PRAGMA busy_timeout = 10000;");
    }

    public static void queryPragmaSettings(SQLiteDatabase database, String sql) {
        Cursor cursor = database.rawQuery(sql,new String[0]);
        int columnCount = cursor.getColumnCount();
        if (columnCount > 0) {
            while (cursor.moveToNext()) {
                for (int i = 0; i < columnCount; i++) {
                    String key = cursor.getColumnName(i);
                    String value = cursor.getString(i);
                    Log.d(TAG, key + " = " + value);
                }
            }
        } else {
            Log.d(TAG, "not found column by " + sql);
        }
        cursor.close();
    }

Exception Stack Code

net.sqlcipher.database.SQLiteException: not an error: BEGIN EXCLUSIVE;
    at net.sqlcipher.database.SQLiteDatabase.native_execSQL(Native Method)
    at net.sqlcipher.database.SQLiteDatabase.execSQL(SourceFile:4)
    at net.sqlcipher.database.SQLiteDatabase.beginTransactionWithListenerInternal(SourceFile:8)
    at net.sqlcipher.database.SQLiteDatabase.beginTransactionWithListener(SourceFile:1)
    at net.sqlcipher.database.SQLiteDatabase.beginTransaction(SourceFile:1)
    at net.bat.store.datamanager.db.AhaSupportSQLiteDatabase.beginTransaction(SourceFile:1)
    at androidx.room.RoomDatabase.beginTransaction(SourceFile:4)
    at net.bat.store.datamanager.dao.DownloadDao_Impl.clearRecords(SourceFile:7)
    at net.bat.store.download.LocalRecorderImpl.enqueued(SourceFile:1)
    at net.bat.store.download.CommonCallback.onWait(SourceFile:7)
    at net.bat.store.download.ProxyCallback.onWait(SourceFile:3)
    at net.bat.store.download.ProxyEnqueueCallback.enqueued(SourceFile:2)
    at net.bat.store.download.DownloadDispatcher.enqueued(SourceFile:3)
    at net.bat.store.download.DownloadDispatcher.enqueue(SourceFile:7)
    at net.bat.store.download.DownloadCall.result(SourceFile:10)
    at net.bat.store.download.PreHandler.response(SourceFile:4)
    at net.bat.store.download.PreHandler.run(SourceFile:30)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
    at net.bat.store.thread.BaseThreadPoolExecutor$BaseThreadFactory.lambda$newThread$0(SourceFile:2)
    at net.bat.store.thread.-$$Lambda$BaseThreadPoolExecutor$BaseThreadFactory$lEX5z4jVWLqetPCsXEvyDjhH5Vg.run(Unknown Source:2)
    at java.lang.Thread.run(Thread.java:923)
    at net.bat.store.thread.BaseThreadPoolExecutor$BaseThread.run(SourceFile:2)

Db operation

  public void clearRecords(final String packageName, final String excludeId) {
    __db.assertNotSuspendingTransaction();
    final SupportSQLiteStatement _stmt = __preparedStmtOfClearRecords.acquire();
    int _argIndex = 1;
    if (packageName == null) {
      _stmt.bindNull(_argIndex);
    } else {
      _stmt.bindString(_argIndex, packageName);
    }
    _argIndex = 2;
    if (excludeId == null) {
      _stmt.bindNull(_argIndex);
    } else {
      _stmt.bindString(_argIndex, excludeId);
    }
    __db.beginTransaction();
    try {
      _stmt.executeUpdateDelete();
      __db.setTransactionSuccessful();
    } finally {
      __db.endTransaction();
      __preparedStmtOfClearRecords.release(_stmt);
    }
  }

Hi @linchengchan

Thank you for reaching out and sharing some details of the situation, that is helpful. Can you confirm whether you are using the legacy edition of SQLCipher for Android 1, or the new version of SQLCipher for Android 2? Our initial suggestion would be to set a busy_timeout 3 to some value such as 5000 ms to see if that alleviates some of the issue, this will allow your application to handle both locked and busy states. In doing this, you could remove retry logic from the Java source.

Footnotes

  1. https://github.com/sqlcipher/android-database-sqlcipher/
  2. https://github.com/sqlcipher/sqlcipher-android/
  3. https://www.sqlite.org/pragma.html#pragma_busy_timeout