snowdrop / narayana-spring-boot

Narayana Spring Boot autoconfiguration and starter
Apache License 2.0

Is this expected behavior for 2 XA databases? #159

Open rinaldodev opened 3 weeks ago

rinaldodev commented 3 weeks ago

I have set up a reproducer that uses 2 PostgreSQL databases enlisted in an XA transaction. It is a modified version of narayana-spring-boot-sample-2pc. One of the databases has a UNIQUE column while the other doesn't. The reproducer is available here: https://github.com/rinaldodev/narayana-spring-boot/tree/2ds-2pc-test.

I see a behavior that I'm not sure is expected:

  1. PREPARE is called for db1 (non-unique), works fine (TwoPhaseOutcome.PREPARE_OK)
  2. PREPARE is called for db2 (unique), db returns constraint violation as expected (TwoPhaseOutcome.PREPARE_NOTOK)
  3. ROLLBACK PREPARED is called for db1, works fine (TwoPhaseOutcome.FINISH_OK)
  4. ROLLBACK PREPARED is called for db2, but fails because the prepared transaction doesn't exist (TwoPhaseOutcome.HEURISTIC_HAZARD)

Is step 4 expected to occur?
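
The four steps above can be reproduced in miniature with a toy coordinator. This is a minimal sketch, not Narayana's implementation: `TwoPhaseSketch`, `FakeBranch`, and `run` are hypothetical names, the outcome strings only echo the `TwoPhaseOutcome` labels seen in the log, and the error code thrown from `prepare` is an assumption (the log only shows `XAER_RMERR` for the failed rollback). Only `javax.transaction.xa.XAException` comes from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAException;

class TwoPhaseSketch {

    /** Hypothetical stand-in for one enlisted database branch. */
    static final class FakeBranch {
        final String name;
        final boolean failOnPrepare;
        boolean prepared;

        FakeBranch(String name, boolean failOnPrepare) {
            this.name = name;
            this.failOnPrepare = failOnPrepare;
        }

        void prepare() throws XAException {
            if (failOnPrepare) {
                // Assumed error code: the thread does not show what prepare returned.
                throw new XAException(XAException.XAER_RMERR);
            }
            prepared = true;
        }

        void rollback() throws XAException {
            if (!prepared) {
                // Mirrors "prepared transaction ... does not exist" in the log.
                throw new XAException(XAException.XAER_RMERR);
            }
            prepared = false;
        }
    }

    /** Prepares each branch in order; on the first failure, rolls back every branch. */
    static List<String> run(List<FakeBranch> branches) {
        List<String> outcomes = new ArrayList<>();
        boolean prepareFailed = false;
        for (FakeBranch b : branches) {
            try {
                b.prepare();
                outcomes.add(b.name + ": PREPARE_OK");
            } catch (XAException e) {
                outcomes.add(b.name + ": PREPARE_NOTOK");
                prepareFailed = true;
                break;
            }
        }
        if (prepareFailed) {
            // Rollback is sent to all branches, including the one whose prepare failed.
            for (FakeBranch b : branches) {
                try {
                    b.rollback();
                    outcomes.add(b.name + ": FINISH_OK");
                } catch (XAException e) {
                    outcomes.add(b.name + ": HEURISTIC_HAZARD");
                }
            }
        }
        return outcomes;
    }

    public static void main(String[] args) {
        List<FakeBranch> branches =
                List.of(new FakeBranch("db1", false), new FakeBranch("db2", true));
        run(branches).forEach(System.out::println);
    }
}
```

In this sketch, the fourth outcome arises only because the rollback is sent to a branch that never reached the prepared state; whether a real coordinator should do that is exactly the question asked here.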

Steps to run the example:

  1. podman run --rm --name db1 -e POSTGRES_PASSWORD=password -p 5432:5432 docker.io/library/postgres:latest -c max_prepared_transactions=10
  2. podman run --rm --name db2 -e POSTGRES_PASSWORD=password -p 5433:5432 docker.io/library/postgres:latest -c max_prepared_transactions=10
  3. cd narayana-spring-boot-samples/narayana-spring-boot-sample-2pc
  4. mvn spring-boot:run -Ddisable.checks > out.log
  5. Leave it running for a while (around 30 seconds was usually enough in my tests)

Trace logs are enabled, so I suggest redirecting the output to a file.

Here are some relevant parts from the log:

Executor ---> Attempting to create a user named Richard
XAResource 6fc88644: preparing transaction xid = < formatId=131077 ...
PREPARE TRANSACTION '131077_AAAAAAAAAAAAAP//CgV+NAAArAlmziVPAAAAUDE=_AAAAAAAAAAAAAP//CgV+NAAArAlmziVPAAAAUQAAAAAAAAAA'
BasicAction::doPrepare() result for action-id (0:ffff0a057e34:ac09:66ce254f:50) on record id: (0:ffff0a057e34:ac09:66ce254f:52) is (TwoPhaseOutcome.PREPARE_OK) node id: (1)
XAResource 7defacb6: preparing transaction xid = < formatId=131077 ...
PREPARE TRANSACTION '131077_AAAAAAAAAAAAAP//CgV+NAAArAlmziVPAAAAUDE=_AAAAAAAAAAAAAP//CgV+NAAArAlmziVPAAAAVAAAAAAAAAAA'
BasicAction::doPrepare() result for action-id (0:ffff0a057e34:ac09:66ce254f:50) on record id: (0:ffff0a057e34:ac09:66ce254f:55) is (TwoPhaseOutcome.PREPARE_NOTOK) node id: (1)
ARJUNA012073: BasicAction.End() - prepare phase of action-id 0:ffff0a057e34:ac09:66ce254f:50 failed.
XAResource 6fc88644: rolling back xid = < formatId=131077 ...
ROLLBACK PREPARED '131077_AAAAAAAAAAAAAP//CgV+NAAArAlmziVPAAAAUDE=_AAAAAAAAAAAAAP//CgV+NAAArAlmziVPAAAAUQAAAAAAAAAA'
BasicAction::doAbort() result for action-id (0:ffff0a057e34:ac09:66ce254f:50) on record id: (0:ffff0a057e34:ac09:66ce254f:52) is (TwoPhaseOutcome.FINISH_OK) node id: (1)
ROLLBACK PREPARED '131077_AAAAAAAAAAAAAP//CgV+NAAArAlmziVPAAAAUDE=_AAAAAAAAAAAAAP//CgV+NAAArAlmziVPAAAAVAAAAAAAAAAA'
ERROR: prepared transaction with identifier "131077_AAAAAAAAAAAAAP//CgV+NAAArAlmziVPAAAAUDE=_AAAAAAAAAAAAAP//CgV+NAAArAlmziVPAAAAVAAAAAAAAAAA" does not exist
ARJUNA016045: attempted rollback of < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff0a057e34:ac09:66ce254f:50, node_name=1, branch_uid=0:ffff0a057e34:ac09:66ce254f:54, subordinatenodename=null, eis_name=unknown eis name > (org.postgresql.xa.PGXAConnection@7defacb6) failed with exception code XAException.XAER_RMERR
ARJUNA012089: Top-level abort of action 0:ffff0a057e34:ac09:66ce254f:50 received heuristic decision: TwoPhaseOutcome.HEURISTIC_HAZARD
BasicAction::End() result for action-id (0:ffff0a057e34:ac09:66ce254f:50) is (TwoPhaseOutcome.HEURISTIC_HAZARD) node id: (1)
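
The identifiers in the `PREPARE TRANSACTION` and `ROLLBACK PREPARED` lines appear to follow the layout `formatId_base64(gtrid)_base64(bqual)`, which is consistent with the `gtrid_length=29` and `bqual_length=36` reported in the ARJUNA016045 line. The sketch below decodes such an identifier so that log lines can be correlated; `PreparedXidDecoder`, `DecodedXid`, `decode`, and `hex` are hypothetical helper names, and the layout is inferred from this log rather than taken from pgjdbc documentation.

```java
import java.util.Base64;

class PreparedXidDecoder {

    /** Holds the three parts of a prepared-transaction identifier. */
    static final class DecodedXid {
        final int formatId;
        final byte[] gtrid;
        final byte[] bqual;

        DecodedXid(int formatId, byte[] gtrid, byte[] bqual) {
            this.formatId = formatId;
            this.gtrid = gtrid;
            this.bqual = bqual;
        }
    }

    /** Splits "formatId_gtrid_bqual" and Base64-decodes the two byte fields. */
    static DecodedXid decode(String gid) {
        // The standard Base64 alphabet has no '_', so splitting on it is safe here.
        String[] parts = gid.split("_");
        if (parts.length != 3) {
            throw new IllegalArgumentException("unexpected identifier layout: " + gid);
        }
        Base64.Decoder decoder = Base64.getDecoder();
        return new DecodedXid(
                Integer.parseInt(parts[0]),
                decoder.decode(parts[1]),
                decoder.decode(parts[2]));
    }

    /** Renders bytes as lowercase hex for comparison with the uids in the log. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Identifier of the db2 branch, copied from the log above.
        String gid = "131077_AAAAAAAAAAAAAP//CgV+NAAArAlmziVPAAAAUDE="
                + "_AAAAAAAAAAAAAP//CgV+NAAArAlmziVPAAAAVAAAAAAAAAAA";
        DecodedXid xid = decode(gid);
        System.out.println("formatId     = " + xid.formatId);
        System.out.println("gtrid_length = " + xid.gtrid.length);
        System.out.println("bqual_length = " + xid.bqual.length);
        System.out.println("gtrid (hex)  = " + hex(xid.gtrid));
        System.out.println("bqual (hex)  = " + hex(xid.bqual));
    }
}
```

The hex dump of the gtrid contains the same `ffff0a057e34`, `ac09`, and `66ce254f` components as the `tx_uid` in the log, which helps match a database-side identifier to a Narayana action-id.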

rinaldodev commented 3 weeks ago

@marcosgopen

marcosgopen commented 3 weeks ago

Thanks @rinaldodev for your reproducer. As expected, db2 throws the exception 'ERROR: duplicate key value violates unique constraint "users_name_key"', so the second prepare fails and the transaction is rolled back. I can also see that the rollback throws "XAException.XAER_RMERR" because the prepared transaction does not exist. I need to check whether this is the expected behaviour. cc @graben

graben commented 3 weeks ago

@rinaldodev: Could you please check what happens if you use the Agroal DataSource instead of the unpooled Narayana one?

rinaldodev commented 3 weeks ago

> @rinaldodev: Could you please check what happens if you use the Agroal DataSource instead of the unpooled Narayana one?

My original tests were with Agroal; that's when I saw it happening. I removed it from this reproducer just to reduce the scope. My original test is here: https://github.com/apache/camel-spring-boot-examples/pull/141

I can augment this one with Agroal if you believe it's necessary.

jmfinelli commented 3 weeks ago

Hi @rinaldodev, I think you bumped into JBTM-3843. There is an upstream PR (on hold) to fix this issue and a PR to fix/discuss how pgjdbc handles rollback invocations when there has been a constraint violation during the prepare phase. We'll keep you posted.

rinaldodev commented 3 weeks ago

> Hi @rinaldodev, I think you bumped into JBTM-3843. There is an upstream PR (on hold) to fix this issue and a PR to fix/discuss how pgjdbc handles rollback invocations when there has been a constraint violation during the prepare phase. We'll keep you posted.

Thanks a lot. I'll follow this issue.

graben commented 3 weeks ago

> > @rinaldodev: Could you please check what happens if you use the Agroal DataSource instead of the unpooled Narayana one?
>
> My original tests were with Agroal; that's when I saw it happening. I removed it from this reproducer just to reduce the scope. My original test is here: https://github.com/apache/camel-spring-boot-examples/pull/141
>
> I can augment this one with Agroal if you believe it's necessary.

I don't think it's necessary at the moment. Let's wait for the Narayana patch to be committed.