embulk / embulk-input-jdbc

MySQL, PostgreSQL, Redshift and generic JDBC input plugins for Embulk
Other
102 stars 74 forks source link

jdbc: Explicit rollback before closing connections #258

Closed ackintosh closed 6 months ago

ackintosh commented 6 months ago

Removed the hardcoded autoCommit setting so that users can set it via YAML configuration like this:

Added an explicit rollback before closing connections to avoid transaction errors as mentioned here.

ackintosh commented 6 months ago

@dmikurube I'm tryling the JDBC input plugin with IBM DB2, but I ended up with the following error message:

The transaction remains active, and the connection cannot be closed. ERRORCODE=-4471, SQLSTATE=null

2024-05-22 02:34:53.995 +0000 [INFO] (main): Started Embulk v0.11.1
2024-05-22 02:34:55.224 +0000 [INFO] (0001:transaction): Gem's home and path are set by system configs "gem_home": "/root/.embulk/lib/gems", "gem_path": ""
2024-05-22 02:34:55.637 +0000 [INFO] (0001:transaction): Loaded JRuby runtime 9.4.5.0
2024-05-22 02:34:55.662 +0000 [INFO] (0001:transaction): Loaded plugin embulk/input/jdbc from a load path
2024-05-22 02:34:55.734 +0000 [INFO] (0001:transaction): Loaded plugin embulk-output-file
2024-05-22 02:34:56.031 +0000 [INFO] (0001:transaction): Using JDBC Driver 4.23.42
2024-05-22 02:34:57.097 +0000 [INFO] (0001:cleanup): Gem's home and path are set by system configs "gem_home": "/root/.embulk/lib/gems", "gem_path": ""
2024-05-22 02:34:57.459 +0000 [INFO] (0001:cleanup): Loaded JRuby runtime 9.4.5.0
2024-05-22 02:34:57.481 +0000 [INFO] (0001:cleanup): Loaded plugin embulk/input/jdbc from a load path
2024-05-22 02:34:57.538 +0000 [INFO] (0001:cleanup): Loaded plugin embulk-output-file
org.embulk.exec.PartialExecutionException: java.lang.RuntimeException: com.ibm.db2.jcc.am.SqlException: [jcc][t4][10251][10308][4.23.42] java.sql.Connection.close() requested while a transaction is in progress on the connection.
The transaction remains active, and the connection cannot be closed. ERRORCODE=-4471, SQLSTATE=null
        at org.embulk.exec.BulkLoader$LoaderState.buildPartialExecuteException(BulkLoader.java:340)
        at org.embulk.exec.BulkLoader.doRun(BulkLoader.java:580)
        at org.embulk.exec.BulkLoader.access$000(BulkLoader.java:36)
        at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:353)
        at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:350)
        at org.embulk.spi.ExecInternal.doWith(ExecInternal.java:26)
        at org.embulk.exec.BulkLoader.run(BulkLoader.java:350)
        at org.embulk.EmbulkEmbed.run(EmbulkEmbed.java:278)
        at org.embulk.EmbulkRunner.runInternal(EmbulkRunner.java:288)
        at org.embulk.EmbulkRunner.run(EmbulkRunner.java:153)
        at org.embulk.cli.EmbulkRun.runInternal(EmbulkRun.java:108)
        at org.embulk.cli.EmbulkRun.run(EmbulkRun.java:24)
        at org.embulk.cli.Main.main(Main.java:53)
        Suppressed: java.lang.NullPointerException
                at org.embulk.exec.BulkLoader.doCleanup(BulkLoader.java:477)
                at org.embulk.exec.BulkLoader$3.run(BulkLoader.java:411)
                at org.embulk.exec.BulkLoader$3.run(BulkLoader.java:408)
                at org.embulk.spi.ExecInternal.doWith(ExecInternal.java:26)
                at org.embulk.exec.BulkLoader.cleanup(BulkLoader.java:408)
                at org.embulk.EmbulkEmbed.run(EmbulkEmbed.java:283)
                ... 5 more
Caused by: java.lang.RuntimeException: com.ibm.db2.jcc.am.SqlException: [jcc][t4][10251][10308][4.23.42] java.sql.Connection.close() requested while a transaction is in progress on the connection.
The transaction remains active, and the connection cannot be closed. ERRORCODE=-4471, SQLSTATE=null
        at org.embulk.input.jdbc.AbstractJdbcInputPlugin.transaction(AbstractJdbcInputPlugin.java:228)
        at org.embulk.exec.BulkLoader.doRun(BulkLoader.java:521)
        ... 11 more
Caused by: com.ibm.db2.jcc.am.SqlException: [jcc][t4][10251][10308][4.23.42] java.sql.Connection.close() requested while a transaction is in progress on the connection.
The transaction remains active, and the connection cannot be closed. ERRORCODE=-4471, SQLSTATE=null
        at com.ibm.db2.jcc.am.ld.a(ld.java:794)
        at com.ibm.db2.jcc.am.ld.a(ld.java:66)
        at com.ibm.db2.jcc.am.ld.a(ld.java:133)
        at com.ibm.db2.jcc.am.Connection.checkForTransactionInProgress(Connection.java:1480)
        at com.ibm.db2.jcc.t4.b.checkForTransactionInProgress(b.java:7405)
        at com.ibm.db2.jcc.am.Connection.closeResourcesX(Connection.java:1503)
        at com.ibm.db2.jcc.am.Connection.closeX(Connection.java:1489)
        at com.ibm.db2.jcc.am.Connection.close(Connection.java:1466)
        at org.embulk.input.jdbc.JdbcInputConnection.close(JdbcInputConnection.java:210)
        at org.embulk.input.jdbc.AbstractJdbcInputPlugin.transaction(AbstractJdbcInputPlugin.java:224)
        ... 12 more

Error: java.lang.RuntimeException: com.ibm.db2.jcc.am.SqlException: [jcc][t4][10251][10308][4.23.42] java.sql.Connection.close() requested while a transaction is in progress on the connection.
The transaction remains active, and the connection cannot be closed. ERRORCODE=-4471, SQLSTATE=null

In DB2, we need to commit or rollback the transaction before closing the connection even for a single SELECT query. As far as I have investigated, this plugin doesn't handle it. This is the problem I'm facing.

One way to resolve the error is to enable auto-commit so that we don't need to commit or rollback transactions manually. However, currently connection.setAutoCommit(false); is hardcoded in this plugin. This is why I filed this PR.


I just realized that changing this plugin to do commit transactions might be ideal from the viewpoint of resolving the problem directly and better compatibility, but I'm not sure if it's possible since I'm not familiar with the implementation. 🤔

dmikurube commented 6 months ago

Thanks for your response. Got it. In fact, embulk-input-db2 was maintained here in the past, but it was removed from this at #223 because we were unable to keep maintaining it (especially, testing it) at that point.

You may want to check the removed code to see how it was working.

Cc: @hito4t

hiroyuki-sato commented 6 months ago

JFYI: The transferred repository is https://github.com/hito4t/embulk-input-jdbc-external

Maybe you can try embulk-input-db2 with embulk 0.9.23 And additional info. (Japanese) https://x.com/hito4_t/status/1562697661048049664

ackintosh commented 6 months ago

@dmikurube Thanks for the pointer, it helped me a lot!

I've updated this PR and checked that it works fine locally. I would appreciate it if you could have a look at it.

ackintosh commented 6 months ago

Also updated the title of this PR.

hiroyuki-sato commented 6 months ago

In my opinion, d005393 is another hardcode. Someone probably wants to use explicit commit instead of rollback. It is better to keep current behavior by default, and other behavior implements as optional.

rough ideas.

in:
  type: jdbc
  commit_type: none # auto, none, commit, rollback (default none)
in:
  type: jdbc
  auto_comit: false 
  explicit_commit: # none, commit, rollback
ackintosh commented 6 months ago

I have decided not to use this plugin for some reasons. I am closing this, but it's okay if anyone wants to reopen or merge it if needed.

@dmikurube @hiroyuki-sato Thanks for your prompt response, which was so helpful. 🙏