apache / hop

Hop Orchestration Platform
https://hop.apache.org/
Apache License 2.0
972 stars 349 forks source link

[Bug]: Trying to use generic db with duckdb driver 0.9.1 #3354

Open JonathanLoscalzo opened 1 year ago

JonathanLoscalzo commented 1 year ago

Apache Hop version?

2.6.0

Java version?

Openjdk version "11.0.20.1" 2023-08-24

Operating system

Linux

What happened?

Hi!

I want to use hop with duckdb. The hop-duckdb integration has duckdb's v0.7, but I'd like to use the 0.9.1 (latest).

Trying to configure the Generic Database driver and I am not able to read the driver from the folders /hop/lib/ or HOP_SHARED_JDBC_FOLDER/.

It always shows me the following view: Captura desde 2023-11-04 13-53-27

and the error is:

Error connecting to database [dwh] :org.apache.hop.core.exception.HopDatabaseException: 
Error occurred while trying to connect to the database

Driver class 'org.duckdb.DuckDBDriver' could not be found, make sure the 'Generic database' driver (jar file) is installed.
Unable to load class 'org.duckdb.DuckDBDriver' in this classloader or in the parent

org.apache.hop.core.exception.HopDatabaseException: 
Error occurred while trying to connect to the database

Driver class 'org.duckdb.DuckDBDriver' could not be found, make sure the 'Generic database' driver (jar file) is installed.
Unable to load class 'org.duckdb.DuckDBDriver' in this classloader or in the parent

As a final note, putting the same driver in pdi at /pdi/lib ~works correctly.~ sometimes shows the information and sometimes no. I think that is a lock issue in pdi. BTW, I want to use Hop.

Captura desde 2023-11-04 14-02-45

It could be some "colliding names"?

Issue Priority

Priority: 3

Issue Component

Component: Database

hansva commented 1 year ago

you can just drop in the 9.X version under plugins/databases/duckdb/lib. It will be upgraded in the next release #3317

JonathanLoscalzo commented 1 year ago

@hansva I have just removed the v0.7, then paste the 0.9 and it works.

But using the "Dimension Lookup/update" step it does not work with duckdb:

ERROR: Because of an error this transform can't continue: offending row : [payment_id Integer(9)], [date_from Date], [date_to Date]

Error setting value #2 [Date] on prepared statement setTimestamp ERROR: org.apache.hop.core.exception.HopDatabaseException: offending row : [payment_id Integer(9)], [date_from Date], [date_to Date]

Error setting value #2 [Date] on prepared statement setTimestamp at org.apache.hop.core.database.Database.setValues(Database.java:922) at org.apache.hop.pipeline.transforms.dimensionlookup.DimensionLookup.lookupValues(DimensionLookup.java:437) at org.apache.hop.pipeline.transforms.dimensionlookup.DimensionLookup.processRow(DimensionLookup.java:204) at org.apache.hop.pipeline.transform.RunThread.run(RunThread.java:55) at java.base/java.lang.Thread.run(Thread.java:829) Caused by: org.apache.hop.core.exception.HopDatabaseException: Error setting value #2 [Date] on prepared statement setTimestamp at org.apache.hop.core.row.value.ValueMetaBase.setPreparedStatementValue(ValueMetaBase.java:5693) at org.apache.hop.core.database.Database.setValue(Database.java:905) at org.apache.hop.core.database.Database.setValues(Database.java:920) ... 4 more Caused by: java.sql.SQLFeatureNotSupportedException: setTimestamp at org.duckdb.DuckDBPreparedStatement.setTimestamp(DuckDBPreparedStatement.java:770) at org.apache.hop.core.row.value.ValueMetaBase.setPreparedStatementValue(ValueMetaBase.java:5644) ... 6 more


- if a field is a timestamp, the error is the same. 

The source data has the following ddl (postgres)

CREATE TABLE public.payment ( payment_id serial4 NOT NULL, customer_id int2 NOT NULL, staff_id int2 NOT NULL, rental_id int4 NOT NULL, amount numeric(5, 2) NOT NULL, payment_date timestamp NOT NULL, CONSTRAINT payment_pkey PRIMARY KEY (payment_id) );


and the steps in the pipeline are ([hpl here](https://github.com/apache/hop/files/13257575/example.zip)): 
![Captura desde 2023-11-04 16-37-06](https://github.com/apache/hop/assets/6944598/084facdb-fd18-4ba7-951b-c801efe1a53d)
hansva commented 1 year ago

the generate sql is just a suggestion I honestly never use it. we can take a look at it

JonathanLoscalzo commented 1 year ago

the generate sql is just a suggestion I honestly never use it. we can take a look at it

No problem with the generated sql.

Looking more in depth into logging, I think that the issue is how the query in "Lookup/update" is being prepared:

lookup row : [123158], [2023/11/04 18:17:19.143], [2023/11/04 18:17:19.143]
ERROR: Because of an error this transform can't continue: 
offending row : [payment_id Integer(9)], [date_from Date], [date_to Date]

I guess that duckdb (I don't know other db's) needs the format "YYYY-MM-DD HH:mm:ss.nnn". I hope it helps

hansva commented 1 year ago

Interesting, thanks for the feedback. I'll try and take a look at it asap so we can include the fixes in our upcoming release. Should be for the very near future.

hansva commented 1 year ago

.take-issue