ClickHouse / clickhouse-go

Golang driver for ClickHouse
Apache License 2.0
2.91k stars 562 forks source link

DateTime handling - consistency with client specification #719

Closed gingerwizard closed 2 years ago

gingerwizard commented 2 years ago

As part of supporting the ClickHouse-go client we wish to ensure the handling of datetimes is consistent with other client libraries. The following represents the agreed behavior for all clients:

Specification

DateTime/DateTime64

Insert Time

The value is sent to ClickHouse in UNIX timestamp format. The client behavior depends on the network protocol if the user provided value as a string. The client has to parse the string to send the value as a UNIX timestamp.

The timezone value will be used if the given string contains a timezone. Otherwise, the client falls back to the local timezone of the client.

If a datetime object is provided (time.Time{} in go), the timezone will be considered if provided - otherwise, the local time zone.

Select Time

If the clients have to return DateTime represented as Date object instance with timezone set, the client has to retrieve column timezone from ClickHouse with a fallback to the server timezone if column tz is not specified.

Date types

Insert Time

The client makes an offset to ensure the client's local date is inserted. Dates don't have tz so no need to consider.

Select Time

Date is returned without timezone.

ClickHouse-go compliance

The following needs to be completed to be compliant with the above:

gingerwizard commented 2 years ago

@DGuang21 it wont let me assign this. Noted the issue that's relevant to your work above.

DGuang21 commented 2 years ago

@gingerwizard I have some questions about the compatible type scheme. Regarding the time type, how many do I need to be compatible with? Are they all (https://github.com/araddon/dateparse#extended-example), or just the ones that come with Golang?

ANSIC "Mon Jan _2 15:04:05 2006"
UnixDate "Mon Jan _2 15:04:05 MST 2006"
RubyDate "Mon Jan 02 15:04:05 -0700 2006"
RFC822 "02 Jan 06 15:04 MST"
RFC822Z "02 Jan 06 15:04 -0700"
RFC850 "Monday, 02-Jan-06 15:04:05 MST"
RFC1123 "Mon, 02 Jan 2006 15:04:05 MST"
RFC1123Z "Mon, 02 Jan 2006 15:04:05 -0700"
RFC3339 "2006-01-02T15:04:05Z07:00"
RFC3339Nano "2006-01-02T15:04:05.999999999Z07:00"
Stamp "3:04PM"
StampMilli "Jan_2 15:04:05.000"
StampMicro "Jan_2 15:04:05.000000"
StampNano "Jan_2 15:04:05.000000000"

Or Golang default 2022-08-14 17:23:12.679127 +0800 CST m=+0.007396793

gingerwizard commented 2 years ago

We will only support YYYY-MM-DD HH:MM:SS Z or YYYY-MM-DD HH:MM:SS. For DateTime64 we would also support up to nano seconds with this format e.g. YYYY-MM-DD HH:MM:SS.XXXXXXXXX Z

DGuang21 commented 2 years ago

@gingerwizard so only support 2022-08-14 17:23:12.679127 +0800 ? if yes i can finish it soon

gingerwizard commented 2 years ago

For:

Keep it simple - write in a way which means we can add patterns later however.

DGuang21 commented 2 years ago

@gingerwizard The code development in my task is basically completed, and now only test cases have not been added, and there are still some small problems to be dealt with

  1. The code in https://github.com/DGuang21/clickhouse-go/pull/2, I think you can look at the implementation first
  2. For Date/date32 data type, the database does not support writing the time zone. If the user pass parameters 2022-08-16+0800, need the time zone of time to be UTC+0000 and then write 2022-08-15 instead of 2022-08-16?
gingerwizard commented 2 years ago

(2) - yes i think this makes sense - you would insert 2022-08-15 in this case. if the user retrieves and sets the timezone they will get the original date.

(1) This is a critical code path - can we avoid regexes and just try the patterns? i.e. try the respective patterns - a simple length check should be sufficient as a heuristic re tz or no tz pattern

@DGuang21

DGuang21 commented 2 years ago

@gingerwizard I have something to do these days, I will try my best to finish the revision before the weekend.

DGuang21 commented 2 years ago

@gingerwizard But if it is changed to distinguish by length, I may not modify the general logic of this piece. Because when writing the date/date32 type, if it is a time type, it will be adjusted according to the complete time and time zone, but if it is changed to a String type, it only matches the date part. I think there will be an inconsistency here.

I don't think such a simple and crude solution is a good way

gingerwizard commented 2 years ago

k @DGuang21 lets go for a precompiled regex pattern. If people use strings they accept this will be the slow path. We'll need to document as such.

gingerwizard commented 2 years ago

If no tz is set for time.Time{} or string, use the client tz.

For time.Time{} this is not possible since nill tz means UTC. There is no way to differentiate between UTC and no timezone i.e. local. We can only do this on strings. @mshustov