apache / hop

Hop Orchestration Platform
https://hop.apache.org/
Apache License 2.0
984 stars 352 forks source link

[Bug]: Table output MYSQL intermittent errors #3166

Open Givo29 opened 1 year ago

Givo29 commented 1 year ago

Apache Hop version?

2.3.0 - 2.5.0

Java version?

Openjdk version 11.0.17

Operating system

Linux

What happened?

Mysql database version: 8.0.21 Mysql jar version: mysql-connector-java-8.0.21.jar

Intermittently, our pipelines will fail with mysql errors when loading data into mysql with the following error message, this is the case for multiple versions of apache hop and multiple versions of the mysql connector. The pipeline will work correctly 8/10 times, but 2/10 it will fail and stop our data from syncing correctly.

This bug is also preventing us from moving more data jobs to Apache Hop.

2023/08/22 10:06:45 - copy_ad_mstr.0 - Connected to database [dbw_warehouse] (commit=1000)
2023/08/22 10:07:00 - copy_ad_mstr.0 - ERROR: Because of an error, this transform can't continue:
2023/08/22 10:07:00 - copy_ad_mstr.0 - ERROR: org.apache.hop.core.exception.HopException: 
2023/08/22 10:07:00 - copy_ad_mstr.0 - Error batch inserting rows into table [copy_ad_mstr].
2023/08/22 10:07:00 - copy_ad_mstr.0 - Errors encountered (first 10):
2023/08/22 10:07:00 - copy_ad_mstr.0 - 
2023/08/22 10:07:00 - copy_ad_mstr.0 - 
2023/08/22 10:07:00 - copy_ad_mstr.0 - Error updating batch
2023/08/22 10:07:00 - copy_ad_mstr.0 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''EXT' at line 1
2023/08/22 10:07:00 - copy_ad_mstr.0 - 
2023/08/22 10:07:00 - copy_ad_mstr.0 - 
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at org.apache.hop.pipeline.transforms.tableoutput.TableOutput.writeToTable(TableOutput.java:340)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at org.apache.hop.pipeline.transforms.tableoutput.TableOutput.processRow(TableOutput.java:117)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at org.apache.hop.pipeline.transform.RunThread.run(RunThread.java:55)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at java.base/java.lang.Thread.run(Thread.java:829)
2023/08/22 10:07:00 - copy_ad_mstr.0 - Caused by: org.apache.hop.core.exception.HopDatabaseBatchException: 
2023/08/22 10:07:00 - copy_ad_mstr.0 - Error updating batch
2023/08/22 10:07:00 - copy_ad_mstr.0 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''EXT' at line 1
2023/08/22 10:07:00 - copy_ad_mstr.0 - 
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at org.apache.hop.core.database.Database.createHopDatabaseBatchException(Database.java:1296)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at org.apache.hop.pipeline.transforms.tableoutput.TableOutput.writeToTable(TableOutput.java:285)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  ... 3 more
2023/08/22 10:07:00 - copy_ad_mstr.0 - Caused by: java.sql.BatchUpdateException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''EXT' at line 1
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at com.mysql.cj.util.Util.handleNewInstance(Util.java:192)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at com.mysql.cj.util.Util.getInstance(Util.java:167)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at com.mysql.cj.util.Util.getInstance(Util.java:174)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at com.mysql.cj.jdbc.exceptions.SQLError.createBatchUpdateException(SQLError.java:224)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at com.mysql.cj.jdbc.ClientPreparedStatement.executeBatchSerially(ClientPreparedStatement.java:853)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at com.mysql.cj.jdbc.ClientPreparedStatement.executeBatchInternal(ClientPreparedStatement.java:435)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at com.mysql.cj.jdbc.StatementImpl.executeBatch(StatementImpl.java:796)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at org.apache.hop.pipeline.transforms.tableoutput.TableOutput.writeToTable(TableOutput.java:281)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  ... 3 more
2023/08/22 10:07:00 - copy_ad_mstr.0 - Caused by: java.sql.SQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''EXT' at line 1
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:120)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:953)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1092)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  at com.mysql.cj.jdbc.ClientPreparedStatement.executeBatchSerially(ClientPreparedStatement.java:832)
2023/08/22 10:07:00 - copy_ad_mstr.0 -  ... 6 more

Issue Priority

Priority: 2

Issue Component

Component: Hop Gui, Component: Hop Run

hansva commented 1 year ago

can you isolate an example/reproduction path for us to look at? It seems that in some way you are breaking outside of the SQL command, there must be something in the data causing this.

Givo29 commented 1 year ago

Thanks for the reply :)

I'm afraid I don't quite understand what you're looking for though, what do you mean by reproduction path?

hansva commented 1 year ago

A pipeline that I can run that generates this error. Simplest form would be a data grid and table output. With the data grid containing the data that causes that error

Givo29 commented 1 year ago

Hey @hansva,

I've been trying to reproduce this issue for you, even so far as to dump the whole table into a csv and using a csv input step for you to replicate it, but I can only get it to fail using the table input step, which is strange.

Any ideas what this could be?

hansva commented 1 year ago

Hi @Givo29 ,

What you can try to do but it will make your pipeline a bit slower, is set the commit size on the table output to 1. And attach error handling to the table output and write this data to a file. Then you will know the exact line(s) that are causing the issue.

and then you might be able to generate insert into statements based on that data (or a dump). If the issue is only present when starting from a mysql table.

Hope this helps, Hans

Givo29 commented 1 year ago

I'm thinking it's possibly some sort of race condition. I've now put a blocking step between the table input and output and it seems to have stopped the intermittent issue.

Would a race condition of sorts be possible here?

image