StarRocks / starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
https://starrocks.io
Apache License 2.0
8.65k stars 1.75k forks source link

[function] Support millisecond unix to datetime #48634

Open wangsimo0 opened 1 month ago

wangsimo0 commented 1 month ago

when it comes to unix time, there are two ways to specify milliseconds. most system will be either using float like 159830640.123 , or append millisecond in the last like 1598306400123 to represent datetime with milliseconds.

In StarRocks, we use the standard UNIX time without any milli or micro , nano seconds. However, there may be some case that user import data from other system who use other kinds of UNIX time format that starrocks cannot recognize and need further transformation.

so here I propose a function that may help to make this process easier.

from_unixtime

Converts a number of milliseconds since the epoch (1970-01-01 00:00:00 UTC) into a timestamp.

from_unixtime_milliseconds (INT unix_timestamp_milliseconds[, VARCHAR string_format])

this function's behavior is similar to from_unixtime the only difference is that:

the from_unixtime_milliseconds need to be a UNIX timestamp with millisecond as int.

for example

select from_unixtime_milliseconds(1196440219123)

2007-12-01 12:30:19.123000

furthermore, if the data use decimal to represent milliseconds, it can be

select from_unixtime_milliseconds(1196440219.123*1000)

2007-12-01 12:30:19.123000
jiangyinzuo commented 1 month ago

Hello @wangsimo0, I am willing to handle this issue. 😄

Dshadowzh commented 1 month ago

@jiangyinzuo, by the way, would you be interested in implementing a to_unixtime_milliseconds function as a backward function to from_unixtime_milliseconds?

bigint to_unixtime_milliseconds(datetime dt)

> select to_unixtime_milliseconds("2007-12-01 12:30:19.123000")
1196440219123
wangsimo0 commented 1 month ago

yeah btw for Dshadowzh's comment, this one unix_timestamp is a good reference if you are interested.

jiangyinzuo commented 1 month ago

@jiangyinzuo, by the way, would you be interested in implementing a to_unixtime_milliseconds function as a backward function to from_unixtime_milliseconds?

OK

jaogoy commented 1 month ago

@wangsimo0 There is a function from_unixtime_ms which is used in expression partition.

johnpyp commented 1 month ago

As an alternative, why not implement a variant of DATETIME which has different precision? This makes it generally easier to work with as a native type. This is also how Clickhouse, Doris, and most other similar databases seem to do it.


EDIT: Apologies, I'm new to StarRocks - DATETIME does support fractional seconds just fine.

The docs don't specify this at all though and suggest that only second-granularity is supported. It'd be great to make clear that it's actually arbitrary (or microsecond? nanosecond?) precision which is supported - https://docs.starrocks.io/docs/sql-reference/data-types/date-types/DATETIME/