treasure-data / digdag

Workload Automation System
https://www.digdag.io/
Apache License 2.0

connecting to redshift using pg operator fails #423

Closed · cosmok closed this 7 years ago

cosmok commented 7 years ago

I get 'Failed to connect to the database' while running the workflow below. I can connect to Redshift via psql, BTW.

timezone: Australia/Sydney

_export:
  pg:
    host: host
    port: 5439
    database: dbname
    user: username
    password: password
    ssl: false

+create_temp_interval_table:
  pg>: queries/create_temp_interval_table.sql

hiroyuki-sato commented 7 years ago

Hello @cosmok

I'm not familiar with Redshift, but you may need to write the database password in ~/.config/digdag/config (when you use local mode).

If you use server mode, you need to set up secrets instead.

For example

~/.config/digdag/config

secrets.pg.password = your_postgresql_password_here
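
In server mode, the same secret would instead be registered with the digdag secrets command (a sketch based on the digdag docs; this command arrived in later digdag versions, and pg_test is just an example project name):

digdag secrets --project pg_test --set pg.password

Then initialize a sample project: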
digdag init -t postgresql pg_test
2016-12-14 12:52:37 +0900: Digdag v0.8.22
  Creating pg_test/queries/create_src_table.sql
  Creating pg_test/queries/insert_data_to_src_table.sql
  Creating pg_test/queries/summarize_src_table.sql
  Creating pg_test/pg_test.dig
  Creating pg_test/.gitignore

Modify the connection part:

_export:
  pg:
    user: digdag
    host: 127.0.0.1
    database: digdag_test
digdag run pg_test.dig -a

2016-12-14 12:56:12 +0900: Digdag v0.8.22
2016-12-14 12:56:14 +0900 [WARN] (main): Reusing the last session time 2016-12-14T00:00:00+00:00.
2016-12-14 12:56:14 +0900 [INFO] (main): Using session /private/tmp/pg_test/.digdag/status/20161214T000000+0000.
2016-12-14 12:56:14 +0900 [INFO] (main): Starting a new session project id=1 workflow name=pg_test session_time=2016-12-14T00:00:00+00:00
2016-12-14 12:56:15 +0900 [INFO] (0017@+pg_test+caution): echo>: This workflow can work with PostgreSQL running in your local machine. If you already set up PostgreSQL, please create database and user described in the above configuration
This workflow can work with PostgreSQL running in your local machine. If you already set up PostgreSQL, please create database and user described in the above configuration
2016-12-14 12:56:16 +0900 [INFO] (0017@+pg_test+create_src_table): pg>: queries/create_src_table.sql
2016-12-14 12:56:17 +0900 [INFO] (0017@+pg_test+create_src_table): pg>: queries/create_src_table.sql
2016-12-14 12:56:18 +0900 [INFO] (0017@+pg_test+insert_data_to_src_table): pg>: queries/insert_data_to_src_table.sql
2016-12-14 12:56:18 +0900 [INFO] (0017@+pg_test+insert_data_to_src_table): pg>: queries/insert_data_to_src_table.sql
2016-12-14 12:56:19 +0900 [INFO] (0017@+pg_test+summarize_src_table): pg>: queries/summarize_src_table.sql
2016-12-14 12:56:20 +0900 [INFO] (0017@+pg_test+summarize_src_table): pg>: queries/summarize_src_table.sql
Success. Task state is saved at /private/tmp/pg_test/.digdag/status/20161214T000000+0000 directory.
  * Use --session <daily | hourly | "yyyy-MM-dd[ HH:mm:ss]"> to not reuse the last session time.
  * Use --rerun, --start +NAME, or --goal +NAME argument to rerun skipped tasks.

komamitsu commented 7 years ago

@cosmok Can I ask you to show me the error message, just in case? As @hiroyuki-sato said, your Redshift password needs to be stored in the config file as a secret. (Thanks @hiroyuki-sato)

BTW, the pg operator enables strict_transaction mode by default, which uses SELECT FOR UPDATE to prevent duplicated INSERT INTO runs just in case. But Redshift doesn't support SELECT FOR UPDATE, so I think you need to disable strict_transaction mode like this:

strict_transaction: false
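
In the workflow, this would sit alongside the other pg settings (a sketch reusing the placeholder values from the report above):

_export:
  pg:
    host: host
    port: 5439
    database: dbname
    user: username
    ssl: false
    strict_transaction: false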

Well, I should've added this option to the documentation... I'll do that.

FYI: We're now developing redshift and redshift_load operators, which will support Redshift properly.

cosmok commented 7 years ago

@hiroyuki-sato adding it to the config file worked, thanks!

But could you please tell me how the queries in the template file are executed? I have a template file in the following format:

CREATE TEMP TABLE staging_accounting_interval ON COMMIT DROP AS SELECT *..;
BEGIN TRANSACTION;
-- trying to use the temp table
END TRANSACTION;
DROP TABLE  staging_accounting_interval;

but I am getting the following error:

ERROR: relation "staging_accounting_interval" does not exist

@komamitsu exact error:

error: 
  * +usage_calculate+create_temp_interval_table:
    Failed to connect to the database

which is from: https://github.com/treasure-data/digdag/blob/18d0e1bbfeff23083948dffcec4861fd132ad03c/digdag-standards/src/main/java/io/digdag/standards/operator/jdbc/AbstractJdbcConnectionConfig.java#L56

A more specific message like 'password is not valid' would be more helpful.

Thanks for the strict_transaction tip.

I am looking forward to using the redshift loader!

hiroyuki-sato commented 7 years ago

@cosmok

I changed the order of BEGIN TRANSACTION and CREATE TEMP TABLE. Also, you may not need the DROP TABLE, because the table is dropped automatically at the end of the transaction:


BEGIN TRANSACTION;
CREATE TEMP TABLE staging_accounting_interval ON COMMIT DROP AS SELECT *..;
-- trying to use the temp table
END TRANSACTION;
-- DROP TABLE staging_accounting_interval;

cosmok commented 7 years ago

@hiroyuki-sato changing the order as you suggested doesn't make a difference. I am not sure if each query gets a different session?

hiroyuki-sato commented 7 years ago

Hmm...

First of all, I checked: CREATE TEMP TABLE ... ON COMMIT DROP drops the table immediately if I don't wrap it in BEGIN TRANSACTION.

Correct case

digdag_test=# BEGIN TRANSACTION;
BEGIN
digdag_test=# CREATE TEMP TABLE staging_accounting_interval ON COMMIT DROP AS SELECT * from summarized_data;
SELECT 5

Checking the table list, I found staging_accounting_interval:

digdag_test=# \dt
                    List of relations
  Schema   |            Name             | Type  | Owner
-----------+-----------------------------+-------+--------
 pg_temp_3 | staging_accounting_interval | table | hsato
 public    | __digdag_status             | table | digdag
 public    | example_access_logs         | table | digdag
 public    | summarized_data             | table | digdag
(4 rows)

Commit the transaction:

digdag_test=# commit transaction ;
COMMIT

Checking the table list again, staging_accounting_interval has been dropped:

digdag_test=# \dt
               List of relations
 Schema |        Name         | Type  | Owner
--------+---------------------+-------+--------
 public | __digdag_status     | table | digdag
 public | example_access_logs | table | digdag
 public | summarized_data     | table | digdag
(3 rows)

Wrong case

digdag_test=# CREATE TEMP TABLE staging_accounting_interval ON COMMIT DROP AS SELECT * from summarized_data;
SELECT 5

Check the tables: the staging_accounting_interval table has already been dropped.

digdag_test=# \dt
               List of relations
 Schema |        Name         | Type  | Owner
--------+---------------------+-------+--------
 public | __digdag_status     | table | digdag
 public | example_access_logs | table | digdag
 public | summarized_data     | table | digdag
(3 rows)

hiroyuki-sato commented 7 years ago

And it worked properly in my environment:

timezone: UTC

_export:
  pg:
    user: digdag
    host: 127.0.0.1
    database: digdag_test

+caution:
  echo>: This workflow can work with PostgreSQL running in your local machine.
         If you already set up PostgreSQL, please create database and user described in the above configuration

+query_test:
  pg>: queries/test.sql

queries/test.sql

BEGIN TRANSACTION;
CREATE TEMP TABLE staging_accounting_interval ON COMMIT DROP AS SELECT * from summarized_data;
-- trying to use the temp table
END TRANSACTION;
-- DROP TABLE staging_accounting_interval;

digdag run -a pg_test
2016-12-14 14:53:00 +0900: Digdag v0.8.22
2016-12-14 14:53:02 +0900 [WARN] (main): Reusing the last session time 2016-12-14T00:00:00+00:00.
2016-12-14 14:53:02 +0900 [INFO] (main): Using session /private/tmp/pg_test/.digdag/status/20161214T000000+0000.
2016-12-14 14:53:02 +0900 [INFO] (main): Starting a new session project id=1 workflow name=pg_test session_time=2016-12-14T00:00:00+00:00
2016-12-14 14:53:03 +0900 [INFO] (0017@+pg_test+caution): echo>: This workflow can work with PostgreSQL running in your local machine. If you already set up PostgreSQL, please create database and user described in the above configuration
This workflow can work with PostgreSQL running in your local machine. If you already set up PostgreSQL, please create database and user described in the above configuration
2016-12-14 14:53:04 +0900 [INFO] (0017@+pg_test+query_test): pg>: queries/test.sql
2016-12-14 14:53:05 +0900 [INFO] (0017@+pg_test+query_test): pg>: queries/test.sql
Success. Task state is saved at /private/tmp/pg_test/.digdag/status/20161214T000000+0000 directory.
  * Use --session <daily | hourly | "yyyy-MM-dd[ HH:mm:ss]"> to not reuse the last session time.
  * Use --rerun, --start +NAME, or --goal +NAME argument to rerun skipped tasks.

hiroyuki-sato commented 7 years ago

I commented out DROP TABLE staging_accounting_interval;, by the way, and I tested on PostgreSQL 9.6.1. If you still can't solve this issue, can I ask you to show me the error message and configuration?

cosmok commented 7 years ago

@hiroyuki-sato The error hasn't gone away. BTW, the same script works when I feed it to the psql CLI, so it must be how the pg operator operates, or it could be a Redshift thing.

Config:

+create_temp_interval_table:
  pg>: ./queries/create_temp_interval_table.sql

./queries/create_temp_interval_table.sql

BEGIN TRANSACTION;
CREATE TEMP TABLE staging_accounting_interval AS
    SELECT * FROM ...;

DELETE FROM accounting_interval
    USING staging_accounting_interval
    WHERE ...;
INSERT INTO accounting_interval SELECT * FROM staging_accounting_interval;
END TRANSACTION;

Error:

2016-12-15 09:51:37 +1100: Digdag v0.8.22
2016-12-15 09:51:40 +1100 [WARN] (main): Reusing the last session time 2016-12-14T00:00:00+11:00.
2016-12-15 09:51:40 +1100 [INFO] (main): Using session /home/k7/workspace/digdag/billing/.digdag/status/20161214T000000+1100.
2016-12-15 09:51:40 +1100 [INFO] (main): Starting a new session project id=1 workflow name=usage_calculate_load_to_rds session_time=2016-12-14T00:00:00+11:00
2016-12-15 09:51:41 +1100 [INFO] (0017@+usage_calculate_load_to_rds+create_temp_interval_table): pg>: ./queries/create_temp_interval_table.sql
2016-12-15 09:51:42 +1100 [INFO] (0017@+usage_calculate_load_to_rds+create_temp_interval_table): pg>: ./queries/create_temp_interval_table.sql
2016-12-15 09:51:43 +1100 [ERROR] (0017@+usage_calculate_load_to_rds+create_temp_interval_table): Configuration error at task +usage_calculate_load_to_rds+create_temp_interval_table: Given query is invalid (config)
> ERROR: relation "staging_accounting_interval" does not exist (psql)
2016-12-15 09:51:43 +1100 [INFO] (0017@+usage_calculate_load_to_rds^failure-alert): type: notify
error: 
  * +usage_calculate_load_to_rds+create_temp_interval_table:
    Given query is invalid

frsyuki commented 7 years ago

What happens with only a single statement in the file? Did you try other queries, such as SELECT 1? Did that work? What other things did you try, what didn't work, and what messages did you get? Here is the location where the error happens: https://github.com/treasure-data/digdag/blob/9ebb1efcc426915adaba33496f4f8dbedfa045dd/digdag-standards/src/main/java/io/digdag/standards/operator/jdbc/AbstractJdbcOperator.java#L115-L118, which gets its results from https://github.com/treasure-data/digdag/blob/9ebb1efcc426915adaba33496f4f8dbedfa045dd/digdag-standards/src/main/java/io/digdag/standards/operator/jdbc/AbstractJdbcConnection.java#L58-L75

komamitsu commented 7 years ago

@frsyuki The javadoc of java.sql.Connection#prepareStatement says:

This method is optimized for handling parametric SQL statements that benefit from precompilation. If the driver supports precompilation, the method prepareStatement will send the statement to the database for precompilation.

PostgreSQL's prepared statements look like server-side objects (https://www.postgresql.org/docs/current/static/sql-prepare.html). So in the case of this issue, validation of the query fails because staging_accounting_interval doesn't exist on the server side at prepare time. https://github.com/treasure-data/digdag/blob/master/digdag-standards/src/main/java/io/digdag/standards/operator/jdbc/AbstractJdbcOperator.java#L115-L118
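
The same failure is easy to see directly in psql (a minimal illustration; the statement name chk is arbitrary):

digdag_test=# PREPARE chk AS SELECT * FROM staging_accounting_interval;
ERROR:  relation "staging_accounting_interval" does not exist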

Also, the preflight validation doesn't seem very useful. Do you think it's okay to remove it?

komamitsu commented 7 years ago

^ I reproduced this issue with PostgreSQL and confirmed that removing the preflight validation of the SQL resolves it.

cosmok commented 7 years ago

Ah, now it makes sense. To summarize: a pg template SQL file that combines multiple statements and references a TEMP table created earlier in the same file will error.

komamitsu commented 7 years ago

Replacing java.sql.Connection#prepareStatement with nativeSQL would be a better solution; that's a client-side validation. I'll examine it.
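
For reference, a minimal sketch of the difference between the two calls (the connection URL and credentials here are illustrative, not from this thread):

import java.sql.Connection;
import java.sql.DriverManager;

public class ValidationSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://127.0.0.1:5432/digdag_test", "digdag", "")) {
            String sql = "SELECT * FROM staging_accounting_interval";

            // nativeSQL only translates JDBC escape syntax on the client side;
            // it never contacts the server, so a not-yet-created relation
            // cannot make it fail.
            String translated = conn.nativeSQL(sql);
            System.out.println(translated);

            // prepareStatement may send the statement to the server for
            // precompilation; per the discussion above, this is where
            // 'relation "staging_accounting_interval" does not exist'
            // surfaced during the operator's preflight validation.
            conn.prepareStatement(sql).close();
        }
    }
}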

cosmok commented 7 years ago

Thanks everyone. I am now closing this, as all issues are resolved.

komamitsu commented 7 years ago

@cosmok Thanks for reporting this!

frsyuki commented 7 years ago

@komamitsu that's a very good catch! @cosmok thank you for the report 👍