embulk / embulk-output-jdbc

MySQL, PostgreSQL, Redshift and generic JDBC output plugins for Embulk
Other
88 stars 86 forks source link

Support UUID for embulk-output-postgresql #305

Open gainings opened 2 years ago

gainings commented 2 years ago

Plugin not support postgres UUID datatype now. UUID have become generally used, so I'd be happy if you could support them.

2022-02-12 12:40:27.749 +0900: Embulk v0.9.23
--
2022-02-12 12:40:30.771 +0900 [WARN] (main): DEPRECATION: JRuby org.jruby.embed.ScriptingContainer is directly injected.
2022-02-12 12:40:42.055 +0900 [INFO] (main): Gem's home and path are set by default: "/root/.embulk/lib/gems"
2022-02-12 12:40:53.358 +0900 [INFO] (main): Started Embulk v0.9.23
2022-02-12 12:40:53.861 +0900 [INFO] (0001:transaction): Loaded plugin embulk-input-postgresql (0.12.3)
2022-02-12 12:40:54.060 +0900 [INFO] (0001:transaction): Loaded plugin embulk-output-postgresql (0.10.2)
2022-02-12 12:40:54.345 +0900 [INFO] (0001:transaction): Loaded plugin embulk-filter-mask (0.2.1)
...

org.embulk.exec.PartialExecutionException: java.lang.UnsupportedOperationException: Unsupported type UUID (sqlType=1111, size=2147483647, scale=0)
at org.embulk.exec.BulkLoader$LoaderState.buildPartialExecuteException(BulkLoader.java:340)
at org.embulk.exec.BulkLoader.doRun(BulkLoader.java:566)
at org.embulk.exec.BulkLoader.access$000(BulkLoader.java:35)
at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:353)
at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:350)
at org.embulk.spi.Exec.doWith(Exec.java:22)
at org.embulk.exec.BulkLoader.run(BulkLoader.java:350)
at org.embulk.EmbulkEmbed.run(EmbulkEmbed.java:242)
at org.embulk.EmbulkRunner.runInternal(EmbulkRunner.java:291)
at org.embulk.EmbulkRunner.run(EmbulkRunner.java:155)
at org.embulk.cli.EmbulkRun.runSubcommand(EmbulkRun.java:431)
at org.embulk.cli.EmbulkRun.run(EmbulkRun.java:90)
at org.embulk.cli.Main.main(Main.java:64)
Suppressed: java.lang.NullPointerException
at org.embulk.exec.BulkLoader.doCleanup(BulkLoader.java:463)
at org.embulk.exec.BulkLoader$3.run(BulkLoader.java:397)
at org.embulk.exec.BulkLoader$3.run(BulkLoader.java:394)
at org.embulk.spi.Exec.doWith(Exec.java:22)
at org.embulk.exec.BulkLoader.cleanup(BulkLoader.java:394)
at org.embulk.EmbulkEmbed.run(EmbulkEmbed.java:245)
... 5 more
Caused by: java.lang.UnsupportedOperationException: Unsupported type UUID (sqlType=1111, size=2147483647, scale=0)
at org.embulk.output.jdbc.setter.ColumnSetterFactory.unsupportedOperationException(ColumnSetterFactory.java:198)
at org.embulk.output.jdbc.setter.ColumnSetterFactory.newCoalesceColumnSetter(ColumnSetterFactory.java:180)
at org.embulk.output.postgresql.setter.PostgreSQLColumnSetterFactory.newCoalesceColumnSetter(PostgreSQLColumnSetterFactory.java:26)
at org.embulk.output.jdbc.setter.ColumnSetterFactory.newColumnSetter(ColumnSetterFactory.java:42)
at org.embulk.output.jdbc.AbstractJdbcOutputPlugin.newColumnSetters(AbstractJdbcOutputPlugin.java:842)
at org.embulk.output.jdbc.AbstractJdbcOutputPlugin.doBegin(AbstractJdbcOutputPlugin.java:634)
at org.embulk.output.jdbc.AbstractJdbcOutputPlugin$2.run(AbstractJdbcOutputPlugin.java:491)
at org.embulk.output.jdbc.AbstractJdbcOutputPlugin$RetryableSQLExecution.call(AbstractJdbcOutputPlugin.java:1343)
at org.embulk.output.jdbc.AbstractJdbcOutputPlugin$RetryableSQLExecution.call(AbstractJdbcOutputPlugin.java:1331)
at org.embulk.util.retryhelper.RetryExecutor.run(RetryExecutor.java:109)
at org.embulk.util.retryhelper.RetryExecutor.runInterruptible(RetryExecutor.java:90)
at org.embulk.output.jdbc.AbstractJdbcOutputPlugin.withRetry(AbstractJdbcOutputPlugin.java:1309)
at org.embulk.output.jdbc.AbstractJdbcOutputPlugin.withRetry(AbstractJdbcOutputPlugin.java:1301)
at org.embulk.output.jdbc.AbstractJdbcOutputPlugin.begin(AbstractJdbcOutputPlugin.java:485)
at org.embulk.output.jdbc.AbstractJdbcOutputPlugin.transaction(AbstractJdbcOutputPlugin.java:461)
at org.embulk.exec.BulkLoader$4$1$1.transaction(BulkLoader.java:521)
at org.embulk.exec.LocalExecutorPlugin.transaction(LocalExecutorPlugin.java:50)
at org.embulk.exec.BulkLoader$4$1.run(BulkLoader.java:516)
at org.embulk.spi.util.Filters$RecursiveControl.transaction(Filters.java:84)
at org.embulk.spi.util.Filters$RecursiveControl$1.run(Filters.java:80)
at org.embulk.filter.mask.MaskFilterPlugin.transaction(MaskFilterPlugin.java:62)
at org.embulk.spi.util.Filters$RecursiveControl.transaction(Filters.java:76)
at org.embulk.spi.util.Filters.transaction(Filters.java:42)
at org.embulk.exec.BulkLoader$4.run(BulkLoader.java:511)
at org.embulk.input.jdbc.AbstractJdbcInputPlugin.transaction(AbstractJdbcInputPlugin.java:231)
at org.embulk.exec.BulkLoader.doRun(BulkLoader.java:507)
... 11 more
Error: java.lang.UnsupportedOperationException: Unsupported type UUID (sqlType=1111, size=2147483647, scale=0)
hiroyuki-sato commented 2 years ago

Hello, @gainings. Could you try the following configuration? If this configuration does not match your requirements, please write detail for your requirements.

uuid.txt

1,AD06F94F-829F-4CF9-A6AD-04EE3714E02E
2,E5504F33-C412-49BA-BF73-0218BF572535
3,6333DC4E-2173-4916-B48A-C970F720CCBB
4,BCBF4DC2-EE75-4365-ADC3-94F224644B6F
5,80AD6583-A2E6-4A8F-A65B-9CAEBA000084
6,2AC18E27-91F0-4419-8A7F-007D6BD5920E
7,B8013A05-01D7-4ABB-A11D-2210A59B5486
8,6B32F95F-498C-4DD0-A79C-877078F749DB
9,21F9FB93-7786-426B-BBB8-806D1E092E0B
10,770C4DA4-E96C-4097-A71B-26C33A7904D0
in:
  type: file
  path_prefix: ./uuid.txt
  parser:
    type: csv
    columns:
    - { name: id, type: long }
    - { name: uuid, type: string }
out:
  type: postgresql
  host: localhost
  user: username
  password: password
  database: database
  table: uuid_test
  mode: insert
  column_options:
    uuid: { type: uuid, value_type: string }
gainings commented 2 years ago

@hiroyuki-sato thanks to reply!

In my environment, multiple configuration files use the same output config file with the include syntax

ex

{% assign table_name = 'foo_var' %}
{% include 'commons/in' with table_name %}

filters:
  - type: mask
    columns:
      - { name: name }
{% include 'commons/out' with table_name %}

In this situation, it's not possible to set the column_options of each table , so I would like an UUID support or any option that can be specified generically.

If that seems difficult, I'll change the way how to use the configuration file.

hiroyuki-sato commented 2 years ago

Helo, @gainings

You can use this branch until introduced new feature. It worked in my environment (I checked the string data only). diff This change requires more work(add test and discuss to others) for adding this branch.

Please let me know if you want to build it.