Closed mattwelke closed 3 years ago
Actually, I think I found what I was looking for: https://github.com/xitongsys/parquet-go/blob/master/types/converter.go
(the answer being yes - you need to store timestamps as integers and do the conversion yourself)
I just didn't expect to need this file when I was skimming the README. But I'm glad it's documented.
You mention using converter.go to convert times. How exactly do you use it?
Do you have to define your struct as:
type user struct {
    ID        string `parquet:"name=id,type=BYTE_ARRAY,convertedtype=UTF8,encoding=PLAIN_DICTIONARY"`
    CreatedAt int64  `parquet:"name=created_at, type=TIME_MICROS, encoding=PLAIN_DICTIONARY"`
}
@wwaldner-amtelco Yup I remember writing code that used an integer data type. int64 sounds right. And then I figured that in the real world if I ever used this, I would create a Go struct just for marshalling to and from Parquet. If I wanted a Time representation of my int64 field, I'd add a method, or I'd add some mapping code to map the "to/from Parquet" struct to a domain model struct, so I'd end up with my time.Time field like I wanted.
This was just for tinkering on the side. I didn't put anything using this into production so I never got to test that idea.
I was testing out the library and wrote a struct like this:
This worked. I was able to use code to marshal and unmarshal the Parquet data. I got the code for these steps from a blog post:
My main function just makes a user struct and writes it and then reads it back:
I get the following output:
But if I add a time.Time field and choose a type from the README that made sense to me (I chose TIME_MICROS), it doesn't work. I get an error:
It does work if I switch the type of CreatedAt in my struct to an integer type, like int64. Then I'd have to write my own transforming code to convert between integers and time.Time whenever I marshal and unmarshal to and from Parquet. This caught me off guard because I usually see time.Time supported natively in Go, like with JSON marshalling, which converts it to a time string and can parse that back.
Is this manual conversion between integer types and time.Time the way to use Parquet in Go?