confluentinc / kafka-connect-jdbc

Kafka Connect connector for JDBC-compatible databases
Other
21 stars 960 forks source link

db.timezone: issue: #950

Open sajita opened 4 years ago

sajita commented 4 years ago

I am using confluent version 5.5.1. It is said that by default confluent uses UTC timezone. I need to sink data in Kafka topic using timezone UTC but Postgres source connector takes server timezone and update data accordingly. I have tried by using db.timezone: UTC and timestamp.delay.interval.ms in source configuration but none of them seems working.

enesyalcin commented 3 years ago

Would you mind to post your connector configuration?

sajita commented 3 years ago

this is my connector configuration:

{ "name": "test", "config": { "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector", "timestamp.column.name": "updatedAt", "incrementing.column.name": "id", "transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey", "connection.password": "*****", "validate.non.null": "true", "catalog.pattern": "test", "table.whitelist": "test1", "mode": "timestamp+incrementing", "key.converter.schemas.enable": "false", "topic.prefix": "test", "connection.user": "abc", "value.converter.schemas.enable": "false", "connection.url": "jdbc:postgresql://localhost:5432/test", "value.converter": "org.apache.kafka.connect.json.JsonConverter", "key.converter": "org.apache.kafka.connect.converters.IntegerConverter", "transforms": "createKey,extractInt", "transforms.extractInt.type": "org.apache.kafka.connect.transforms.ExtractField$Key", "transforms.extractInt.field": "id", "transforms.createKey.fields": "id" } }

enesyalcin commented 3 years ago

How did you configure your db.timezone? Mine looks like this: "db.timezone": "Europe/Vienna".

Did I understand correctly that you would like to use the Source Connector like in your given example to fetch data from the Postgres DB into your topic. It works but the given timestamp is in the DB's timezone and you would like to have displayed your current timezone?

May I ask you to also upload an example on how the Connector updates the timezone when it fetches it from your DB/Table?

sajita commented 3 years ago

I also configured in the same way as you did: "db.timezone": "UTC"

This is my actual data in Postgres. Screenshot from 2020-11-23 15-49-36

My server timezone is in Asia/Kathmandu and it updates it accordingly like this:

Screenshot from 2020-11-23 15-56-25

PLease see the createdAt and updatedAt

enesyalcin commented 3 years ago

Okay I see your point. Would you mind updating your configuration to "db.timezone": "Asia/Kathmandu".

So my two cents to the technical implementation and why I think the updated timezone should solve your problem.

Postgres is saving the timezone in an epoch time format. This epoch time is for everyone, worldwide, the same but to our convenience, Postgres is mapping it to our Timezone. Postgres is setup with UTC as default timezone.

You can fetch your epoch time in your table by querying SELECT createdAt::timestamp, updatedAt::timestamp from table.

Now, whenever our Kafka Connector is fetching your epoch time it will map it to a string, like in your given example, 2020-01-27 08:25:32.245+00. Now with the configuration "db.timezone": "Asia/Kathmandu" we will let the connector know that we would liek the epoch to be mapped to our Asia/Kathmandu timezone string.

I hope it works :)

sajita commented 3 years ago

How did you configure your db.timezone? Mine looks like this: "db.timezone": "Europe/Vienna".

Did I understand correctly that you would like to use the Source Connector like in your given example to fetch data from the Postgres DB into your topic. It works but the given timestamp is in the DB's timezone and you would like to have displayed your current timezone?

May I ask you to also upload an example on how the Connector updates the timezone when it fetches it from your DB/Table?

Okay I see your point. Would you mind updating your configuration to "db.timezone": "Asia/Kathmandu".

So my two cents to the technical implementation and why I think the updated timezone should solve your problem.

Postgres is saving the timezone in an epoch time format. This epoch time is for everyone, worldwide, the same but to our convenience, Postgres is mapping it to our Timezone. Postgres is setup with UTC as default timezone.

You can fetch your epoch time in your table by querying SELECT createdAt::timestamp, updatedAt::timestamp from table.

Now, whenever our Kafka Connector is fetching your epoch time it will map it to a string, like in your given example, 2020-01-27 08:25:32.245+00. Now with the configuration "db.timezone": "Asia/Kathmandu" we will let the connector know that we would liek the epoch to be mapped to our Asia/Kathmandu timezone string.

I hope it works :)

thank you so much for the information but I have already tried it but it didn't work either.

sajita commented 3 years ago

if I set it to "Asia/Kathmandu" I will get the error like this:

image

enesyalcin commented 3 years ago

if I set it to "Asia/Kathmandu" I will get the error like this:

image

Did you also try Asia/Katmandu? Without an h?

sajita commented 3 years ago

yes, i did try "db.timezone: "ASia/Kathmandu"

enesyalcin commented 3 years ago

I checked my docker-compse.yaml file as well and I have seen that I also bound volumes in order to sync the timezone and localtime with my container/host.

    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
sajita commented 3 years ago

if I change my local time to UTC it works but this will change many other databases' dates too which is not the option for me. I am not understanding why UTC is not working in my case. The default value is UTC . what I am not getting is why it is updating data according to the server instead of connector?

enesyalcin commented 3 years ago

You could try some other timezone from Asia which is in the same timezone as you?

thebinary commented 3 years ago

@sajita does your error match the following on mentioning db.timezone parameter:

[2021-08-07 09:06:06,056] INFO Kafka Connect started (org.apache.kafka.connect.runtime.Connect:57)
[2021-08-07 09:06:06,083] INFO AbstractConfig values:
 (org.apache.kafka.common.config.AbstractConfig:372)
[2021-08-07 09:06:06,089] ERROR Failed to create job for ../config/radius-json/connect-pg-test-test.properties (org.apache.kafka.connect.cli.ConnectStandalone:107)
[2021-08-07 09:06:06,090] ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:117)
java.util.concurrent.ExecutionException: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector configuration is invalid and contains the following 1 error(s):
Invalid value "Asia/Kathmandu" for configuration db.timezone: Invalid time zone identifier
You can also find the above list of errors at the endpoint `/connector-plugins/{connectorType}/config/validate`
        at org.apache.kafka.connect.util.ConvertingFutureCallback.result(ConvertingFutureCallback.java:115)
        at org.apache.kafka.connect.util.ConvertingFutureCallback.get(ConvertingFutureCallback.java:99)
        at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:114)
Caused by: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector configuration is invalid and contains the following 1 error(s):

Is it this Invalid time zone identifier error?

thebinary commented 3 years ago

if yes i will post a fix PR for this, or i will have to open another issue

thebinary commented 3 years ago

i have opened PR #1101 if the error is Invalid time zone identifier