MarquezProject / marquez

Collect, aggregate, and visualize a data ecosystem's metadata
https://marquezproject.ai
Apache License 2.0
1.78k stars 319 forks source link

Verify and document default Postgres timezone behavior #56

Closed ashulmanWeWork closed 5 years ago

ashulmanWeWork commented 6 years ago

Create a standard for dealing with timezones in the data, specifically around whether it's required/recommended to include them in time data.

This includes how the DB schema will address timestamp info, how the Jackson mapper treats timestamps that don't include TZ info, and how Marquez treats data in the Timestamp object.

By default, it seems that Postgres does not record TZ info in its timestamp. This can cause problems down the line, so I think for now it's advisable to create types as TIMESTAMP WITH TIME ZONE instead of just TIMESTAMP.

Reference: https://www.postgresql.org/docs/9.6/static/datatype-datetime.html

sshah-wework commented 6 years ago

One question -- if we're relying on created_at and updated_at to be populated by the database, and the database has a default timezone of UTC, can't we then assume every timestamp we're handling is in the UTC timezone?

sshah-wework commented 6 years ago

We also may not expose our internal timestamps via the DAO or Resources. They're really for our own internal book keeping. I think you are correctly identifying something to be aware of in situations where we're accepting timezone-specific timestamps as API parameters and then storing/retrieving them later.

sshah-wework commented 6 years ago

We can merge this -- it will still protect us in the case that the DB TZ and the application server TZ are different.

wslulciuc commented 5 years ago

@ashulmanWeWork @sshah-wework: All timestamps will be assumed to be in UTC with TIMESTAMPTZ