OCFL / ocfl-java

A Java OCFL implementation
MIT License
18 stars 12 forks source link

Add support for MariaDB databases #53

Closed MormonJesus69420 closed 2 years ago

MormonJesus69420 commented 2 years ago

This PR adds support for MariaDB databases to ocfl-java library. It is based on pre-existing classes and should not introduce any breaking changes. Database code was tested on our end, both with unit tests and manual testing on a live database, no errors were found during the testing phase, lock and out of sync errors functioned as expected.
Edits to pre-existing code are minor, as they only pertain to making SQL queries protected instead of private as MariaDB uses different queries for updating tables. The other queries were made protected for consistency. For loop in DbTypes was changed to a stream operation, and comments were added to JavaDoc and Readme to reflect the changes.

pwinckles commented 2 years ago

Thank you for the PR!

Are you planning on using the db lock as well?

MormonJesus69420 commented 2 years ago

Thank you for the PR!

Are you planning on using the db lock as well?

Oh right, that's something I forgot about, we should probably implement that too, even though as far as I am aware we will not use it. However you never know when you might need it, and someone else might have a use for it.

I will implement it on Monday, as I am about to end my work for today.

MormonJesus69420 commented 2 years ago

Hey there Peter! I have added support for object locking using database, so the PR should be ready for merging

pwinckles commented 2 years ago

Thanks!

There are a few tests that are setup to run against an external database, rather than H2. I just attempted to run them, and I'm getting some lock related failures.

Here's an example of how to run the tests on the CLI:

mvn -Ddb.url="jdbc:mariadb://localhost:3306/ocfl" -Ddb.user="ocfl-user" -Ddb.password="ocfl-pw" clean install

The test class in question are DbObjectLockTest and ObjectDetailsDatabaseTest.

MormonJesus69420 commented 2 years ago

Ah, so that's how you run those tests on a db. 😅 Yes I can see that ObjectDetailsDatabaseTest.shouldFailWhenConcurrentAddAndDifferentDigest fails. I am unsure if the problem is related to the fact that test is set up with H2 and Postgres databases in mind where lock timeout is set in milliseconds, wheras in MariaDB it's set in seconds. I will look into it more tomorrow.

pwinckles commented 2 years ago

the fact that test is set up with H2 and Postgres databases in mind where lock timeout is set in milliseconds, wheras in MariaDB it's set in seconds. I will look into it more tomorrow.

Ah! That could definitely be it

MormonJesus69420 commented 2 years ago

Just to make sure, those are the errors you are getting on your end too, right?

[ERROR] Failures: 
[ERROR]   ObjectDetailsDatabaseTest.shouldFailWhenConcurrentAddAndDifferentDigest:569 Unexpected exception type thrown ==> expected: <edu.wisc.library.ocfl.api.exception.ObjectOutOfSyncException> but was: <edu.wisc.library.ocfl.api.exception.OcflDbException>
[ERROR]   DbObjectLockTest.onConcurrentAcquireOnlyOneProcessShouldGetLockWhenLockExpired:242 expected: <false> but was: <true>
[ERROR] Errors: 
[ERROR]   ObjectDetailsDatabaseTest.shouldSucceedWhenConcurrentAddAndSameDigest:531 » Lock
[ERROR]   DbObjectLockTest.shouldAcquireLockWhenAlreadyExistsButNotHeld:66 » Lock Failed...
pwinckles commented 2 years ago

Yep, I'm seeing the same

pwinckles commented 2 years ago

Also, if you all aren't going to use the db lock, then you don't necessarily have to add support for it. In that case, we'd just want to throw an exception if MariaDB was attempted to be used for the lock.

MormonJesus69420 commented 2 years ago

The 3f3def0 commit should fix errors picked up by unit tests in DbObjectLock. I didn't realize that in MariaDB TIMESTAMP fields don't include milliseconds by default. This is now fixed.

MormonJesus69420 commented 2 years ago

The 53000bb commit fixes the errors in ObjectDetailsDatabase test, i think at least. It makes the test wait a few milliseconds between sending requests as sometimes the order got mixed and test ended up in a deadlock. I am not sure if it's a proper fix or no, but in manual testing it worked well in such cases, so I am inclined to believe the code is in order.

However with both of these issues fixed I am getting following error in the S3 integration tests which seem to go in a loop:

14:14:14.927 [C3P0PooledConnectionPoolManager[identityToken->z8kfsxak1gara8xoj7p8x|1013871e]-HelperThread-#1] WARN  c.m.v.resourcepool.BasicResourcePool - com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@3412968e -- Acquisition Attempt Failed!!! Clearing pending acquires. While trying to acquire a needed new resource, we failed to succeed more than the maximum number of allowed acquisition attempts (30). Last acquisition attempt exception:
java.sql.SQLException: No suitable driver
at java.sql/java.sql.DriverManager.getDriver(DriverManager.java:298)
at com.mchange.v2.c3p0.DriverManagerDataSource.driver(DriverManagerDataSource.java:285)
at com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:175)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:220)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:206)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:203)
at com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1176)
at com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1163)
at com.mchange.v2.resourcepool.BasicResourcePool.access$700(BasicResourcePool.java:44)
at com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1908)
at com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696)

I am guessing that it's because it doesn't support MariaDB, but I would like to say that it's strange that the test is stuck in a loop for the past 30 minutes without exiting.

pwinckles commented 2 years ago

For that, you need to add the mariadb driver as a test dependency of the itest module

MormonJesus69420 commented 2 years ago

Yes, it looks like it worked, I thought there might be something more since the test went in loop and wouldn't terminate, but adding the driver seems to have fixed the issue.

pwinckles commented 2 years ago

The changes to the tests to avoid deadlocking aren't good because they subvert the purpose of the tests. That code is testing concurrent writes, so we can't insert a delay between them.

I haven't identified the root cause of the problem yet.

pwinckles commented 2 years ago

Okay, what I think is happening is that MariaDB just exhibits different behavior when there's a key conflict. When you have 2 transactions that concurrently add a row with the same key, Postgres and H2 throw an exception that indicates that there's a duplicate key. This exception is expected, and is handled here.

However, MariaDB appears to throw a deadlock exception rather than a duplicate key exception. The following fixes the tests:

if (duplicateKeyCode.equals(e.getSQLState()) || e.getErrorCode() == 1213) {
    throw outOfSyncException(inventory.getId());
}

Perhaps the refactor should be to push the duplicateKeyCode field down, and add the following abstract method to BaseObjectDetailsDatabase:

protected abstract boolean isConcurrentWriteException(SQLException e);

And then the code could be updated to be:

if (isConcurrentWriteException(e)) {
    throw outOfSyncException(inventory.getId());
}
MormonJesus69420 commented 2 years ago

The changes to the tests to avoid deadlocking aren't good because they subvert the purpose of the tests. That code is testing concurrent writes, so we can't insert a delay between them.

Yes, I knew it was a hack solution.

I just can't figure out why a deadlock occurs when we send two concurrent requests, is it the fault of the driver that is trying to prevent a deadlock or something else?

MormonJesus69420 commented 2 years ago
if (isConcurrentWriteException(e)) {
    throw outOfSyncException(inventory.getId());
}

This seems like a good solution might also consider renaming the duplicateKeyCode to something like concurrentInsertErrorCode and then in MariaDb implementation it would be set to 40001 as this is the code returned by e.getSQLState() unless it's preferable to use e.getErrorCode() in case of MariaDB's driver?
In this case there would either be no need for adding the protected abstract boolean isConcurrentWriteException(SQLException e) method and we could have old error handling method. Alternatively it could have a concrete implementation in BaseObjectDetailsDatabase.

pwinckles commented 2 years ago

This seems like a good solution might also consider renaming the duplicateKeyCode to something like concurrentInsertErrorCode and then in MariaDb implementation it would be set to 40001 as this is the code returned by e.getSQLState()

Yep, I'm fine with this solution

MormonJesus69420 commented 2 years ago

Great, I will try to get it done after getting home from work as it's a quick fix.

pwinckles commented 2 years ago

Hmm... actually, I'm a little curious if MariaDB could still throw the duplicate key exception. I think you're getting the deadlock because there are two transactions concurrently attempting to write the same key. However, what would happen if the threads are slightly offset? While one is inserting, the other does the check that determines if it should do an insert or an update, and follows the insert branch. Then, the first thread commits its transaction, and then the second thread does its insert. In that case, I would be a little surprised if there was a deadlock because the first tx has been committed.

All that to say, I think that MaraiDB may need to look for both exceptions.

MormonJesus69420 commented 2 years ago

Hmm, you might be right I will look into it tomorrow if I have the time

pwinckles commented 2 years ago

Yeah, I'm pretty sure that's what would happen. If you look at the DbObjectLock class, it does an insert when it creates a new lock. However, its insert is not in a transaction and it throws a duplicate key exception on concurrent inserts. So, the code in the details db class needs to look for both exceptions.

MormonJesus69420 commented 2 years ago

Hey Peter!
I have now added the more proper checking for concurrency issues using the suggested method, along with refactoring.
I have added two tests locally, which check for almost concurrent issues, in which case MariaDB driver throws a "duplicate key" error instead of a "deadlock". I have tested that these tests pass on H2 and Postgres databases too, but I didn't add them in the commit, if you want to I can include them too, but I don't think it's necessary.
Hopefully things will work as they should now.

Here are the two tests I have added locally, they are almost the same as the normal tests you have, with the exception of adding a sleep call for 10 milliseconds

   @Test
    public void shouldSucceedWhenAlmostConcurrentAddAndSameDigest() throws InterruptedException, ExecutionException {
        database = new ObjectDetailsDatabaseBuilder()
                .waitTime(1000, TimeUnit.MILLISECONDS)
                .dataSource(dataSource)
                .tableName(tableName)
                .build();

        var inventory = basicInventory();
        var invBytes = inventoryBytes(inventory);
        var digest = DigestUtil.computeDigestHex(inventory.getDigestAlgorithm(), invBytes);

        var phaser = new Phaser(2);

        var future = executor.submit(() -> {
            phaser.arriveAndAwaitAdvance();
            database.addObjectDetails(inventory, digest, invBytes);
        });

        phaser.arriveAndAwaitAdvance();
        TimeUnit.MILLISECONDS.sleep(10); //// SLEEP
        database.addObjectDetails(inventory, digest, invBytes);

        future.get();

        var details = database.retrieveObjectDetails(inventory.getId());

        assertObjectDetails(inventory, digest, invBytes, details);
    }

    @Test
    public void shouldFailWhenAlmostConcurrentAddAndDifferentDigest() {
        database = new ObjectDetailsDatabaseBuilder()
                .waitTime(1, TimeUnit.SECONDS)
                .dataSource(dataSource)
                .tableName(tableName)
                .build();

        var inventory = basicInventory();
        var invBytes = inventoryBytes(inventory);
        var digest = DigestUtil.computeDigestHex(inventory.getDigestAlgorithm(), invBytes);

        var inv2 = inventory.buildFrom().build();
        var invBytes2 = inventoryBytes(inv2);
        var digest2 = "bogus";

        var phaser = new Phaser(3);

        var future = executor.submit(() -> {
            phaser.arriveAndAwaitAdvance();
            database.addObjectDetails(inv2, digest2, invBytes2);
        });
        var future2 = executor.submit(() -> {
            phaser.arriveAndAwaitAdvance();
            try {
                TimeUnit.MILLISECONDS.sleep(10); //// SLEEP
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            database.addObjectDetails(inventory, digest, invBytes);
        });

        phaser.arriveAndAwaitAdvance();

        assertThrows(ObjectOutOfSyncException.class, () -> {
            try {
                future.get();
                future2.get();
            } catch (ExecutionException e) {
                throw e.getCause();
            }
        });
    }