snowdrop / narayana-spring-boot

Narayana Spring Boot autoconfiguration and starter
Apache License 2.0
46 stars 43 forks source link

Redesign DataSourceXAResourceRecoveryHelper to be compatible with JBTM-3879 (based on org.jboss.narayana.jta.jms.JmsXAResourceRecoveryHelper) #149

Closed graben closed 4 months ago

graben commented 4 months ago

Works with actual Narayana 7.0.1 release and Narayana 7.0.2 (including JBTM-3879) #jbosstm/narayana/pull/2268

graben commented 4 months ago

Hi @geoand, this adjusted version of DataSourceXAResourceRecoveryHelper is needed to be compatible with JBTM-3879 changes in Narayana.

FYI @marcosgopen

geoand commented 4 months ago

So do we want to merge this?

graben commented 4 months ago

Well, I think the implementation is better than the current one, and it is needed if upgrading to newer Narayana versions should be possible. ;)

geoand commented 4 months ago

cc @mmusgrov

marcosgopen commented 4 months ago

Thanks @graben for your PR, it is greatly appreciated! Could you double check the PooledTransactionalIT test? It keeps failing with 7.0.2.Final: ''[ERROR] Errors: [ERROR] PooledRecoveryIT>GenericRecoveryIT.testCrashBeforeCommit:122 » ConditionTimeout Condition with alias 'Wait for the recovery to happen' didn't complete within 1 minutes because assertion condition with alias Wait for the recovery to happen [Test entry should exist after transaction was committed] Expected size: 1 but was: 0 in: []." I am going to have a look at that.

[UPDATE] the PooledRecoveryIT test fails with Narayana 7.0.2.Final. That is because of another issue (see https://issues.redhat.com/browse/JBTM-3838). In fact now Narayana is not treating XAER_RMFAIL as a transient failure and would consider it a heuristic aligning to xa_rollback specification (https://publications.opengroup.org/c193)

From the test log you can read: "WARN 170836 --- [nsaction Reaper] com.arjuna.ats.arjuna : ARJUNA012117: TransactionReaper::check processing TX 0:ffffc0a82b22:a64d:66840d2a:1e in state RUN"

This is a issue that might have already been solved by Artemis team (maybe https://github.com/apache/activemq-artemis/commit/269be13c5a). FYI @graben @tomjenkinson

mmusgrov commented 4 months ago

I got a failure after git cloning https://github.com/snowdrop/narayana-spring-boot and typing mvn clean install verify :

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-jar-plugin:3.3.0:jar (default-jar) on project narayana-spring-boot-starter-it: You have to use a classifier to attach supplemental artifacts to the project instead of replacing them. -> [Help 1]

and my env is

[mmusgrov@dev2 narayana-spring-boot] ((HEAD detached at pr/149)) $ ./mvnw -version Apache Maven 3.9.3 (21122926829f1ead511c958d89bd2f672198ae9f) Maven home: /home/mmusgrov/.m2/wrapper/dists/apache-maven-3.9.3-bin/326f10f4/apache-maven-3.9.3 Java version: 17.0.11, vendor: Red Hat, Inc., runtime: /usr/lib/jvm/java-17-openjdk-17.0.11.0.9-1.fc39.x86_64 Default locale: en_GB, platform encoding: UTF-8 OS name: "linux", version: "6.8.11-200.fc39.x86_64", arch: "amd64", family: "unix"

and it similarly fails running just the relevant test case:

[mmusgrov@dev2 narayana-spring-boot-starter-it] ((HEAD detached at pr/149)) $ pwd /home/mmusgrov/src/forks/spring/narayana-spring-boot/narayana-spring-boot-starter-it [mmusgrov@dev2 narayana-spring-boot-starter-it] ((HEAD detached at pr/149)) $ ../mvnw clean install verify -Dit.test=GenericRecoveryIT#testCrashBeforeCommit

although the actual test passes:

[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 32.77 s -- in dev.snowdrop.boot.narayana.generic.GenericRecoveryIT

tomjenkinson commented 4 months ago

In fact now Narayana is not treating XAER_RMFAIL as a transient failure and would consider it a heuristic aligning to xa_rollback specification (https://publications.opengroup.org/c193)

From the test log you can read: "WARN 170836 --- [nsaction Reaper] com.arjuna.ats.arjuna : ARJUNA012117: TransactionReaper::check processing TX 0:ffffc0a82b22:a64d:66840d2a:1e in state RUN"

This is a issue that might have already been solved by Artemis team (maybe apache/activemq-artemis@269be13c5a). FYI @graben @tomjenkinson

Thank you for providing the summary @marcosgopen

tomjenkinson commented 4 months ago

@mmusgrov FYI the CI steps for snowdrop/narayana-spring-boot (which might give a hint as to how it's meant to run at least in one permutation) are in https://github.com/snowdrop/narayana-spring-boot/blob/main/.github/workflows/test.yml#L23

tomjenkinson commented 4 months ago

@marcosgopen the two CI runs at this time seem to be successful - are you saying you consider it should be considered failing due to the WARN?

graben commented 4 months ago

Thanks @graben for your PR, it is greatly appreciated! Could you double check the PooledTransactionalIT test? It keeps failing with 7.0.2.Final: ''[ERROR] Errors: [ERROR] PooledRecoveryIT>GenericRecoveryIT.testCrashBeforeCommit:122 » ConditionTimeout Condition with alias 'Wait for the recovery to happen' didn't complete within 1 minutes because assertion condition with alias Wait for the recovery to happen [Test entry should exist after transaction was committed] Expected size: 1 but was: 0 in: []." I am going to have a look at that.

[UPDATE] the PooledRecoveryIT test fails with Narayana 7.0.2.Final. That is because of another issue (see https://issues.redhat.com/browse/JBTM-3838). In fact now Narayana is not treating XAER_RMFAIL as a transient failure and would consider it a heuristic aligning to xa_rollback specification (https://publications.opengroup.org/c193)

From the test log you can read: "WARN 170836 --- [nsaction Reaper] com.arjuna.ats.arjuna : ARJUNA012117: TransactionReaper::check processing TX 0:ffffc0a82b22:a64d:66840d2a:1e in state RUN"

This is a issue that might have already been solved by Artemis team (maybe https://github.com/apache/activemq-artemis/commit/269be13c5a). FYI @graben @tomjenkinson

Hi @marcosgopen,

the pooled test fails because of an issue in H2 database. I'm actually waiting for an enhancement to Agroal agroal/agroal#109 actually under review by @barreiro to do a proper workaround for this. The pooled test is not affected by this change. The helper is only used by non-pooled connections.

tomjenkinson commented 4 months ago

(sorry @marcosgopen now I see this is not upgrading to 7.0.2.Final :) )

marcosgopen commented 4 months ago

Thanks @graben for your PR, it is greatly appreciated! Could you double check the PooledTransactionalIT test? It keeps failing with 7.0.2.Final: ''[ERROR] Errors: [ERROR] PooledRecoveryIT>GenericRecoveryIT.testCrashBeforeCommit:122 » ConditionTimeout Condition with alias 'Wait for the recovery to happen' didn't complete within 1 minutes because assertion condition with alias Wait for the recovery to happen [Test entry should exist after transaction was committed] Expected size: 1 but was: 0 in: []." I am going to have a look at that. [UPDATE] the PooledRecoveryIT test fails with Narayana 7.0.2.Final. That is because of another issue (see https://issues.redhat.com/browse/JBTM-3838). In fact now Narayana is not treating XAER_RMFAIL as a transient failure and would consider it a heuristic aligning to xa_rollback specification (https://publications.opengroup.org/c193) From the test log you can read: "WARN 170836 --- [nsaction Reaper] com.arjuna.ats.arjuna : ARJUNA012117: TransactionReaper::check processing TX 0:ffffc0a82b22:a64d:66840d2a:1e in state RUN" This is a issue that might have already been solved by Artemis team (maybe apache/activemq-artemis@269be13c5a). FYI @graben @tomjenkinson

Hi @marcosgopen,

the pooled test fails because of an issue in H2 database. I'm actually waiting for an enhancement to Agroal agroal/agroal#109 actually under review by @barreiro to do a proper workaround for this. The pooled test is not affected by this change. The helper is only used by non-pooled connections.

You are right @graben . The pooled test failure is not related to this PR and not related to JBTM-3879 (XA recovery scan cursor handling). Having written my comment here is confusing so I will keep discussing about it in the Narayana update PR (https://github.com/snowdrop/narayana-spring-boot/pull/148)

graben commented 4 months ago

Yes and no. The change in Narayana by JBTM-3879 is causing that the first workaround for H2 is not working any more. It's a bit complicated and confusing that changes in several projects are causing "strange" behaviours. It's keeping me busy. :-D

graben commented 4 months ago

I got a failure after git cloning https://github.com/snowdrop/narayana-spring-boot and typing mvn clean install verify :

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-jar-plugin:3.3.0:jar (default-jar) on project narayana-spring-boot-starter-it: You have to use a classifier to attach supplemental artifacts to the project instead of replacing them. -> [Help 1]

and my env is

[mmusgrov@dev2 narayana-spring-boot] ((HEAD detached at pr/149)) $ ./mvnw -version Apache Maven 3.9.3 (21122926829f1ead511c958d89bd2f672198ae9f) Maven home: /home/mmusgrov/.m2/wrapper/dists/apache-maven-3.9.3-bin/326f10f4/apache-maven-3.9.3 Java version: 17.0.11, vendor: Red Hat, Inc., runtime: /usr/lib/jvm/java-17-openjdk-17.0.11.0.9-1.fc39.x86_64 Default locale: en_GB, platform encoding: UTF-8 OS name: "linux", version: "6.8.11-200.fc39.x86_64", arch: "amd64", family: "unix"

and it similarly fails running just the relevant test case:

[mmusgrov@dev2 narayana-spring-boot-starter-it] ((HEAD detached at pr/149)) $ pwd /home/mmusgrov/src/forks/spring/narayana-spring-boot/narayana-spring-boot-starter-it [mmusgrov@dev2 narayana-spring-boot-starter-it] ((HEAD detached at pr/149)) $ ../mvnw clean install verify -Dit.test=GenericRecoveryIT#testCrashBeforeCommit

although the actual test passes:

[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 32.77 s -- in dev.snowdrop.boot.narayana.generic.GenericRecoveryIT

Why calling install and verify at once? verify is called by install goal.

graben commented 4 months ago

@marcosgopen : Can you please make a review of this change set from a Narayana perspective. I think the guys able to merge will need expert advice. Thx!

graben commented 4 months ago

@geoand: Hope to get positive feedback by @marcosgopen soon, to get this merged.

tomjenkinson commented 4 months ago

Would anyone be able to share an easy way to run PooledRecoveryIT in a debugger, please?

marcosgopen commented 4 months ago

Would anyone be able to share an easy way to run PooledRecoveryIT in a debugger, please?

I would run ' mvn -Dmaven.failsafe.debug="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=localhost:8000" install -Dit.test=PooledRecoveryIT' from the narayana-spring-boot-starter-it folder

marcosgopen commented 4 months ago

@marcosgopen : Can you please make a review of this change set from a Narayana perspective. I think the guys able to merge will need expert advice. Thx!

I am checking the PR. Do you @graben have any updates about the Agroal/H2 workaround you mentioned above?

tomjenkinson commented 4 months ago

Thank you @marcosgopen ! A few other things that might be relevant to share are to consider adding -Dmaven.surefire.skip=true -f narayana-spring-boot-starter-it/pom.xml and also I am using ./mvnw

Thank you again - debugging is working.

graben commented 4 months ago

@marcosgopen : Can you please make a review of this change set from a Narayana perspective. I think the guys able to merge will need expert advice. Thx!

I am checking the PR. Do you @graben have any updates about the Agroal/H2 workaround you mentioned above?

Has been merged right now by Luis.

graben commented 4 months ago

Hi @geoand, may you be able to merge this?

geoand commented 4 months ago

Let us know when you need a new release - cc @jacobdotcosta

graben commented 4 months ago

I'll do, but I think there will come an Agroal and Narayana release update before that. ;)

mmusgrov commented 4 months ago

@tomjenkinson and @graben thanks for your suggestions related to resolving my build issue.