Closed MormonJesus69420 closed 2 years ago
Thank you for the PR!
Are you planning on using the db lock as well?
Thank you for the PR!
Are you planning on using the db lock as well?
Oh right, that's something I forgot about, we should probably implement that too, even though as far as I am aware we will not use it. However you never know when you might need it, and someone else might have a use for it.
I will implement it on Monday, as I am about to end my work for today.
Hey there Peter! I have added support for object locking using database, so the PR should be ready for merging
Thanks!
There are a few tests that are setup to run against an external database, rather than H2. I just attempted to run them, and I'm getting some lock related failures.
Here's an example of how to run the tests on the CLI:
mvn -Ddb.url="jdbc:mariadb://localhost:3306/ocfl" -Ddb.user="ocfl-user" -Ddb.password="ocfl-pw" clean install
The test class in question are DbObjectLockTest and ObjectDetailsDatabaseTest.
Ah, so that's how you run those tests on a db. 😅
Yes I can see that ObjectDetailsDatabaseTest.shouldFailWhenConcurrentAddAndDifferentDigest
fails. I am unsure if the problem is related to the fact that test is set up with H2 and Postgres databases in mind where lock timeout is set in milliseconds, wheras in MariaDB it's set in seconds. I will look into it more tomorrow.
the fact that test is set up with H2 and Postgres databases in mind where lock timeout is set in milliseconds, wheras in MariaDB it's set in seconds. I will look into it more tomorrow.
Ah! That could definitely be it
Just to make sure, those are the errors you are getting on your end too, right?
[ERROR] Failures:
[ERROR] ObjectDetailsDatabaseTest.shouldFailWhenConcurrentAddAndDifferentDigest:569 Unexpected exception type thrown ==> expected: <edu.wisc.library.ocfl.api.exception.ObjectOutOfSyncException> but was: <edu.wisc.library.ocfl.api.exception.OcflDbException>
[ERROR] DbObjectLockTest.onConcurrentAcquireOnlyOneProcessShouldGetLockWhenLockExpired:242 expected: <false> but was: <true>
[ERROR] Errors:
[ERROR] ObjectDetailsDatabaseTest.shouldSucceedWhenConcurrentAddAndSameDigest:531 » Lock
[ERROR] DbObjectLockTest.shouldAcquireLockWhenAlreadyExistsButNotHeld:66 » Lock Failed...
Yep, I'm seeing the same
Also, if you all aren't going to use the db lock, then you don't necessarily have to add support for it. In that case, we'd just want to throw an exception if MariaDB was attempted to be used for the lock.
The 3f3def0 commit should fix errors picked up by unit tests in DbObjectLock. I didn't realize that in MariaDB TIMESTAMP
fields don't include milliseconds by default. This is now fixed.
The 53000bb commit fixes the errors in ObjectDetailsDatabase test, i think at least. It makes the test wait a few milliseconds between sending requests as sometimes the order got mixed and test ended up in a deadlock. I am not sure if it's a proper fix or no, but in manual testing it worked well in such cases, so I am inclined to believe the code is in order.
However with both of these issues fixed I am getting following error in the S3 integration tests which seem to go in a loop:
14:14:14.927 [C3P0PooledConnectionPoolManager[identityToken->z8kfsxak1gara8xoj7p8x|1013871e]-HelperThread-#1] WARN c.m.v.resourcepool.BasicResourcePool - com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@3412968e -- Acquisition Attempt Failed!!! Clearing pending acquires. While trying to acquire a needed new resource, we failed to succeed more than the maximum number of allowed acquisition attempts (30). Last acquisition attempt exception:
java.sql.SQLException: No suitable driver
at java.sql/java.sql.DriverManager.getDriver(DriverManager.java:298)
at com.mchange.v2.c3p0.DriverManagerDataSource.driver(DriverManagerDataSource.java:285)
at com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:175)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:220)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:206)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:203)
at com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1176)
at com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1163)
at com.mchange.v2.resourcepool.BasicResourcePool.access$700(BasicResourcePool.java:44)
at com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1908)
at com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696)
I am guessing that it's because it doesn't support MariaDB, but I would like to say that it's strange that the test is stuck in a loop for the past 30 minutes without exiting.
For that, you need to add the mariadb driver as a test dependency of the itest module
Yes, it looks like it worked, I thought there might be something more since the test went in loop and wouldn't terminate, but adding the driver seems to have fixed the issue.
The changes to the tests to avoid deadlocking aren't good because they subvert the purpose of the tests. That code is testing concurrent writes, so we can't insert a delay between them.
I haven't identified the root cause of the problem yet.
Okay, what I think is happening is that MariaDB just exhibits different behavior when there's a key conflict. When you have 2 transactions that concurrently add a row with the same key, Postgres and H2 throw an exception that indicates that there's a duplicate key. This exception is expected, and is handled here.
However, MariaDB appears to throw a deadlock exception rather than a duplicate key exception. The following fixes the tests:
if (duplicateKeyCode.equals(e.getSQLState()) || e.getErrorCode() == 1213) {
throw outOfSyncException(inventory.getId());
}
Perhaps the refactor should be to push the duplicateKeyCode
field down, and add the following abstract method to BaseObjectDetailsDatabase
:
protected abstract boolean isConcurrentWriteException(SQLException e);
And then the code could be updated to be:
if (isConcurrentWriteException(e)) {
throw outOfSyncException(inventory.getId());
}
The changes to the tests to avoid deadlocking aren't good because they subvert the purpose of the tests. That code is testing concurrent writes, so we can't insert a delay between them.
Yes, I knew it was a hack solution.
I just can't figure out why a deadlock occurs when we send two concurrent requests, is it the fault of the driver that is trying to prevent a deadlock or something else?
if (isConcurrentWriteException(e)) { throw outOfSyncException(inventory.getId()); }
This seems like a good solution might also consider renaming the duplicateKeyCode
to something like concurrentInsertErrorCode
and then in MariaDb implementation it would be set to 40001
as this is the code returned by e.getSQLState()
unless it's preferable to use e.getErrorCode()
in case of MariaDB's driver?
In this case there would either be no need for adding the protected abstract boolean isConcurrentWriteException(SQLException e)
method and we could have old error handling method. Alternatively it could have a concrete implementation in BaseObjectDetailsDatabase
.
This seems like a good solution might also consider renaming the
duplicateKeyCode
to something likeconcurrentInsertErrorCode
and then in MariaDb implementation it would be set to40001
as this is the code returned bye.getSQLState()
Yep, I'm fine with this solution
Great, I will try to get it done after getting home from work as it's a quick fix.
Hmm... actually, I'm a little curious if MariaDB could still throw the duplicate key exception. I think you're getting the deadlock because there are two transactions concurrently attempting to write the same key. However, what would happen if the threads are slightly offset? While one is inserting, the other does the check that determines if it should do an insert or an update, and follows the insert branch. Then, the first thread commits its transaction, and then the second thread does its insert. In that case, I would be a little surprised if there was a deadlock because the first tx has been committed.
All that to say, I think that MaraiDB may need to look for both exceptions.
Hmm, you might be right I will look into it tomorrow if I have the time
Yeah, I'm pretty sure that's what would happen. If you look at the DbObjectLock
class, it does an insert when it creates a new lock. However, its insert is not in a transaction and it throws a duplicate key exception on concurrent inserts. So, the code in the details db class needs to look for both exceptions.
Hey Peter!
I have now added the more proper checking for concurrency issues using the suggested method, along with refactoring.
I have added two tests locally, which check for almost concurrent issues, in which case MariaDB driver throws a "duplicate key" error instead of a "deadlock". I have tested that these tests pass on H2 and Postgres databases too, but I didn't add them in the commit, if you want to I can include them too, but I don't think it's necessary.
Hopefully things will work as they should now.
Here are the two tests I have added locally, they are almost the same as the normal tests you have, with the exception of adding a sleep call for 10 milliseconds
@Test
public void shouldSucceedWhenAlmostConcurrentAddAndSameDigest() throws InterruptedException, ExecutionException {
database = new ObjectDetailsDatabaseBuilder()
.waitTime(1000, TimeUnit.MILLISECONDS)
.dataSource(dataSource)
.tableName(tableName)
.build();
var inventory = basicInventory();
var invBytes = inventoryBytes(inventory);
var digest = DigestUtil.computeDigestHex(inventory.getDigestAlgorithm(), invBytes);
var phaser = new Phaser(2);
var future = executor.submit(() -> {
phaser.arriveAndAwaitAdvance();
database.addObjectDetails(inventory, digest, invBytes);
});
phaser.arriveAndAwaitAdvance();
TimeUnit.MILLISECONDS.sleep(10); //// SLEEP
database.addObjectDetails(inventory, digest, invBytes);
future.get();
var details = database.retrieveObjectDetails(inventory.getId());
assertObjectDetails(inventory, digest, invBytes, details);
}
@Test
public void shouldFailWhenAlmostConcurrentAddAndDifferentDigest() {
database = new ObjectDetailsDatabaseBuilder()
.waitTime(1, TimeUnit.SECONDS)
.dataSource(dataSource)
.tableName(tableName)
.build();
var inventory = basicInventory();
var invBytes = inventoryBytes(inventory);
var digest = DigestUtil.computeDigestHex(inventory.getDigestAlgorithm(), invBytes);
var inv2 = inventory.buildFrom().build();
var invBytes2 = inventoryBytes(inv2);
var digest2 = "bogus";
var phaser = new Phaser(3);
var future = executor.submit(() -> {
phaser.arriveAndAwaitAdvance();
database.addObjectDetails(inv2, digest2, invBytes2);
});
var future2 = executor.submit(() -> {
phaser.arriveAndAwaitAdvance();
try {
TimeUnit.MILLISECONDS.sleep(10); //// SLEEP
} catch (InterruptedException e) {
e.printStackTrace();
}
database.addObjectDetails(inventory, digest, invBytes);
});
phaser.arriveAndAwaitAdvance();
assertThrows(ObjectOutOfSyncException.class, () -> {
try {
future.get();
future2.get();
} catch (ExecutionException e) {
throw e.getCause();
}
});
}
This PR adds support for MariaDB databases to ocfl-java library. It is based on pre-existing classes and should not introduce any breaking changes. Database code was tested on our end, both with unit tests and manual testing on a live database, no errors were found during the testing phase, lock and out of sync errors functioned as expected.
Edits to pre-existing code are minor, as they only pertain to making SQL queries
protected
instead ofprivate
as MariaDB uses different queries for updating tables. The other queries were madeprotected
for consistency. For loop inDbTypes
was changed to a stream operation, and comments were added to JavaDoc and Readme to reflect the changes.