dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0
11.12k stars 1.39k forks

Clarify in deployment docs that Postgres/MySQL storage must be in the UTC timezone #23173

Open axellpadilla opened 1 month ago

axellpadilla commented 1 month ago

Dagster version

1.7.14

What's the issue?

In both UIs (classic and experimental), timezone handling is buggy: job runs are missing from the overview, and the run time on the Runs page has the timezone offset applied twice (for example, my offset is -6, so it says 3 instead of 10 for a run at 16 UTC).

The time is correct on the overview (but job runs are missing).

What did you expect to happen?

Correct times

How to reproduce?

I'm not sure of the exact steps, because it just is as it is. This is a problem in the CST time zone; I don't know about others. Checking the code that handles the conversion for just that part could reveal the problem.

Deployment type

Local

Deployment details

Just a plain local deployment on a machine running the latest Ubuntu LTS version, with the timezone already configured.

Additional information

Could be related to #16499 (or that should be closed)

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization.

garethbrickman commented 1 month ago

Could you share full page screenshots of specifically where in the UI you're seeing discrepancies with the timezone in timestamps?

Please also let us know what web browser and version you're viewing Dagster in.

axellpadilla commented 1 month ago

Edge, but there is no difference with other browsers. Runs: [image]

Overview with correct time but missing runs: [image]

axellpadilla commented 1 month ago

Just adding: same problem in 1.7.15.

hellendag commented 1 month ago

Just so I'm sure I understand, are the timestamps incorrect on the Runs page (first screenshot) and correct on the timeline view (second screenshot)?

Would you mind sharing a screenshot of your User Settings dialog, including the timezone and hour cycle selections?

edwinvehmaanpera commented 1 month ago

We have the same issue.

kptres commented 1 month ago

Similar issue for BST: [image] The jobs highlighted in red ran at 8.02 but are appearing at 7.02 in the UI. "Now" is correct.

User settings: [image]

The runs page is displaying the incorrect time as well.

The Dagster code server, webserver, and daemon are set to the BST timezone, and all timestamps in the database are in BST as well.

The Postgres timezone is UTC; changing it to local time has no effect.

axellpadilla commented 1 month ago

Hi @hellendag

Do you need help finding the problem? If you can provide the files to check or have anything to help understand the code structure related to this functionality let me know!

gibsondan commented 1 month ago

@axellpadilla is there any chance you are able to test this on 1.7.12? I am wondering if https://github.com/dagster-io/dagster/pull/22818 that went out in 1.7.13 could have somehow caused this.

gibsondan commented 1 month ago

Can you also describe what type of storage you are using (SQLite vs Postgres)? From the description here it sounds like runs may be going into the DB with non-UTC timestamps which may be the source of the problem - the expectation is that they are in UTC (with no time zone) but it’s possible that’s not happening the way it is supposed to for some storage types.

gibsondan commented 1 month ago

One thing to double check is that if you're using postgres or mysql, your database is running in the UTC timezone? I can reproduce something like this if I spin up a local postgres database and set the "TZ" env var on it to "America/New_York" instead of the default of "UTC" - I create a run and the creation time is off by 4 hours. I know at least one person on this thread said that their postgres DB was in UTC though, so there may be something else going on here besides that.
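To illustrate the kind of skew described above, here is a minimal Python sketch. It is not Dagster's storage code; it only assumes that a naive timestamp meant as UTC gets interpreted in the database session's timezone instead. The helper name `interpret_naive` and the sample date are hypothetical.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def interpret_naive(naive: datetime, session_tz: str) -> datetime:
    """Attach session_tz to a naive timestamp, then normalize to UTC."""
    return naive.replace(tzinfo=ZoneInfo(session_tz)).astimezone(timezone.utc)


# Hypothetical run creation time, written without a timezone and meant as UTC.
created_naive = datetime(2024, 7, 15, 16, 0, 0)

as_utc = interpret_naive(created_naive, "UTC")
as_new_york = interpret_naive(created_naive, "America/New_York")

# In July, New York is on EDT (UTC-4), so the two readings differ by 4 hours.
skew_hours = (as_new_york - as_utc).total_seconds() / 3600
print(skew_hours)  # 4.0
```

The 4-hour difference matches the offset reported in the comment above for a database with TZ set to "America/New_York" during daylight saving time.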

kptres commented 1 month ago

Changing the timezone in postgresql.conf to UTC and running ALTER DATABASE xxx SET timezone TO 'UTC'; resolved the issue.

edwinvehmaanpera commented 1 month ago

Yes, changing the timezone seems to have solved the issue. You can also set the environment variable PGTZ=UTC on the Dagster instances if you cannot change the database timezone.
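A rough sketch of the PGTZ workaround in Python, for cases where the Dagster processes are launched from a script. PGTZ is a libpq environment variable that sets the default session timezone; the helper name `env_with_utc_session` is hypothetical, and how the processes are actually started depends on the deployment.

```python
import os


def env_with_utc_session(base=None):
    """Return a copy of the environment with PGTZ forced to UTC."""
    env = dict(os.environ if base is None else base)
    env["PGTZ"] = "UTC"  # libpq clients then default their session timezone to UTC
    return env


# The resulting mapping can be passed as env= when spawning the webserver,
# daemon, or code server processes (for example via subprocess.Popen).
child_env = env_with_utc_session({"PATH": "/usr/bin"})
print(child_env["PGTZ"])  # UTC
```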

However, there still seems to be an opportunity to improve the docs (is the UTC requirement stated anywhere explicitly?) and Dagster could warn or fail if the DB is in the wrong timezone.
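As a sketch of what such a warning could look like (this is not an existing Dagster feature), the check below runs SHOW timezone on any DB-API cursor connected to Postgres and warns when the session timezone is not UTC. The function name `check_db_timezone`, the set of accepted names, and the stand-in cursor are assumptions for illustration.

```python
import warnings

# Assumption: these names are treated as UTC; other aliases may exist.
UTC_NAMES = {"UTC", "ETC/UTC"}


def check_db_timezone(cursor) -> str:
    """Return the Postgres session timezone, warning if it is not UTC."""
    cursor.execute("SHOW timezone")
    tz = cursor.fetchone()[0]
    if tz.upper() not in UTC_NAMES:
        warnings.warn(f"Database session timezone is {tz!r}; expected UTC")
    return tz


class FakeCursor:
    """Stand-in for a real cursor so the sketch runs without a database."""

    def __init__(self, tz):
        self._tz = tz

    def execute(self, sql):
        self._sql = sql

    def fetchone(self):
        return (self._tz,)


print(check_db_timezone(FakeCursor("UTC")))  # UTC
```

With a real connection, the cursor would come from the driver (for example psycopg2), and the check could run once at startup.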

gibsondan commented 1 month ago

Absolutely - filed this as a docs improvement.

axellpadilla commented 1 month ago

Hi guys, running ALTER DATABASE xxx SET timezone TO 'UTC' alone fixed this issue. I understand the database isn't timezone-aware because of the performance improvement over large datasets, so it should definitely be documented (not "fixed").

PS: Checking the database timezone on load and showing a warning would be great.