segmentio / parquet-go

Go library to read/write Parquet files
https://pkg.go.dev/github.com/segmentio/parquet-go
Apache License 2.0

Serialize time.Time as a timestamp #321

Closed lmarburger closed 1 year ago

lmarburger commented 2 years ago

Serialize time.Time values as Parquet timestamps. The default unit is NANOS and can be changed using the timestamp() struct tag.

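type timeColumn struct {
    // No tag: serialized with the default unit (NANOS).
    t1 time.Time
    // The timestamp() tag overrides the unit for this column.
    t2 time.Time `parquet:",timestamp(millisecond)"`
}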

Resolves #266

wolfeidau commented 2 years ago

Are there any plans to merge this at some point?

Can I help with contributions?

lmarburger commented 2 years ago

We're going to be merging #387 and #393 soon and then I'll come back to this PR and clean it up. #393 is especially important because it allows us to correctly interpret the Parquet timestamp units to build the appropriate time.Time.

lmarburger commented 2 years ago

I believe this change is blocked on #365. I don't think the logical type information is being written to the Parquet column information, so times can only be deserialized as int64. I'm looking into it.