dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

error in view schedules and scheduled-runs #19892

Closed qqletter closed 6 months ago

qqletter commented 7 months ago

Dagster version

dagster, version 1.6.5

What's the issue?

I updated to the latest version and did not make any changes to my schedules (21 schedules already existed). When I open the "overview - schedules" or "runs - scheduled" page, I get the errors below:

Unexpected GraphQL error

Operation name: ScheduledRunsListQuery

Message: Invariant failed.

Path: ["repositoriesOrError","nodes",0,"schedules",1,"futureTicks"]

Locations: [{"line":107,"column":3}]

Stack Trace:
  File "/usr/local/lib/python3.11/site-packages/graphql/execution/execute.py", line 521, in execute_field
    result = resolve_fn(source, info, **args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/dagster_graphql/schema/schedules/schedules.py", line 157, in resolve_futureTicks
    tick_times.append(next(time_iter).timestamp())
                      ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/dagster/_utils/schedules.py", line 833, in schedule_execution_time_iterator
    yield from (
  File "/usr/local/lib/python3.11/site-packages/dagster/_utils/schedules.py", line 773, in cron_string_iterator
    yield from _croniter_string_iterator(
  File "/usr/local/lib/python3.11/site-packages/dagster/_utils/schedules.py", line 802, in _croniter_string_iterator
    check.invariant(next_date.timestamp() >= start_timestamp)
  File "/usr/local/lib/python3.11/site-packages/dagster/_check/__init__.py", line 1614, in invariant
    raise CheckError("Invariant failed.")

gibsondan commented 7 months ago

Hi, we’d like to see if we can reproduce this error so that we can figure out what’s going on and fix it. Is it possible to:

qqletter commented 7 months ago
  • pip freeze

aiohttp==3.9.3
aiosignal==1.3.1
akracer==0.0.13
akshare==1.12.58
alembic==1.13.1
aniso8601==9.0.1
anyio==3.6.2
async-timeout==4.0.3
asyncpg==0.29.0
attrs==23.2.0
backoff==2.2.1
beautifulsoup4==4.12.3
bokeh==3.3.4
bs4==0.0.2
cattrs==22.2.0
certifi==2024.2.2
cfscrape==2.1.1
charset-normalizer==3.3.2
click==8.1.7
clickhouse-connect==0.7.0
clickhouse-driver==0.2.6
coloredlogs==14.0
contourpy==1.2.0
croniter==1.4.1
cycler==0.12.1
dagit==1.6.5
dagster==1.6.5
dagster-graphql==1.6.5
dagster-pipes==1.6.5
dagster-webserver==1.6.5
decorator==5.1.1
docstring-parser==0.15
duckdb==0.10.0
et-xmlfile==1.1.0
fonttools==4.49.0
frozenlist==1.4.1
fsspec==2024.2.0
gql==3.4.1
graphene==3.3
graphql-core==3.2.3
graphql-relay==3.2.0
grpcio==1.51.3
grpcio-health-checking==1.51.3
h11==0.14.0
hs-udata==0.3.8
html5lib==1.1
httptools==0.6.1
humanfriendly==10.0
idna==3.6
importlib-metadata==7.0.1
Jinja2==3.1.3
joblib==1.3.2
jsonpath==0.82.2
kiwisolver==1.4.5
loguru==0.7.2
lxml==4.9.3
lz4==4.3.3
Mako==1.3.2
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib==3.8.3
mdurl==0.1.2
megadata==0.0.77
multidict==6.0.5
numpy==1.26.4
openpyxl==3.1.2
packaging==23.2
pandas==2.2.0
pendulum==2.1.2
pillow==10.2.0
platformdirs==4.2.0
polars==0.20.9
protobuf==4.25.3
py-mini-racer==0.6.0
pyaml==23.12.0
pyarrow==15.0.0
pydantic==1.10.9
Pygments==2.17.2
pyparsing==3.1.1
pypinyin==0.50.0
python-dateutil==2.8.2
python-dotenv==1.0.1
pytomlpp==1.0.13
pytz==2024.1
pytz-deprecation-shim==0.1.0.post0
pytzdata==2020.1
PyYAML==6.0.1
requests==2.31.0
requests-cache==1.2.0
requests-toolbelt==0.10.1
rich==13.7.0
scikit-learn==1.4.1.post1
scikit-optimize==0.9.0
scipy==1.12.0
simplejson==3.19.2
six==1.16.0
sniffio==1.3.0
soupsieve==2.5
SQLAlchemy==1.4.46
stackprinter==0.2.11
starlette==0.27.0
structlog==24.1.0
TA-Lib==0.4.28
tabulate==0.9.0
threadpoolctl==3.3.0
tomli==2.0.1
toposort==1.10
tornado==6.4
tqdm==4.66.2
tushare==1.3.7
typing_extensions==4.9.0
tzdata==2024.1
tzlocal==4.3
universal-pathlib==0.0.24
url-normalize==1.4.3
urllib3==1.26.15
uvicorn==0.27.1
uvloop==0.19.0
watchdog==2.3.1
watchfiles==0.21.0
webencodings==0.5.1
websocket-client==0.57.0
websockets==10.4
xlrd==2.0.1
xyzservices==2023.10.1
yarl==1.9.4
zipp==3.17.0
zstandard==0.22.0


I've tried every schedule, and some of them consistently cause the error. Those schedules are:

realtime_1 = ScheduleDefinition(cron_schedule="*/5 9-15 * * 1-5", job=realtime_job_1, execution_timezone="Asia/Shanghai")

realtime_2 = ScheduleDefinition(cron_schedule="1/3 9-15 * * 1-5", job=realtime_job_2, execution_timezone="Asia/Shanghai")

transform_1 = ScheduleDefinition(cron_schedule="*/10 19,20 * * 1-6", job=transform_job_1, execution_timezone="Asia/Shanghai")

qqletter commented 7 months ago

I run Dagster in a Docker image (python:3.11-slim), and I use the default embedded SQLite storage.

gibsondan commented 7 months ago

Hm strange - I'm having trouble reproducing the problem even with those same requirements and cronstrings/timezones. Are you able to reproduce the problem if you load this simple case in the Dagster UI? I tried that on each of the cronstrings you provided but was able to view upcoming runs for each of the schedules.

from dagster import Definitions, job, op, schedule, ScheduleDefinition

@op
def my_op():
    pass

@job
def my_job():
    my_op()

@schedule(job=my_job, cron_schedule="*/10 19,20 * * 1-6", execution_timezone="Asia/Shanghai")
def my_schedule():
    pass

realtime_1 = ScheduleDefinition(name="realtime_1", cron_schedule="*/5 9-15 * * 1-5", job=my_job, execution_timezone="Asia/Shanghai")

realtime_2 = ScheduleDefinition(name="realtime_2", cron_schedule="1/3 9-15 * * 1-5", job=my_job, execution_timezone="Asia/Shanghai")

transform_1 = ScheduleDefinition(name="transform_1", cron_schedule="*/10 19,20 * * 1-6", job=my_job, execution_timezone="Asia/Shanghai")

defs = Definitions(jobs=[my_job], schedules=[my_schedule, realtime_1, realtime_2, transform_1])
gibsondan commented 7 months ago

Here's exactly what I did to try to reproduce this:

FROM python:3.11-slim

RUN pip install -U pip

COPY requirements.txt .

RUN pip install -r requirements.txt

COPY repo.py .

EXPOSE 3000

CMD dagster dev -f repo.py --dagit-host 0.0.0.0

Is there anything else that's unique about your environment or code that differs from that example that might help explain why that's not reproducing the problem?

qqletter commented 7 months ago

Here's exactly what I did to try to reproduce this:

  • Save the dependencies you sent as a requirements.txt file, and save the Dagster code I posted above as repo.py, in the same folder
  • build the following dockerfile:
FROM python:3.11-slim

RUN pip install -U pip

COPY requirements.txt .

RUN pip install -r requirements.txt

COPY repo.py .

EXPOSE 3000

CMD dagster dev -f repo.py --dagit-host 0.0.0.0
  • Build it and run the image with port 3000 exposed
  • Turn on all the schedules from the dagster UI
  • View the scheduled runs page, see: [screenshot]
  • View the page for each schedule, see: [screenshot]

Is there anything else that's unique about your environment or code that differs from that example that might help explain why that's not reproducing the problem?

I've tried your example, and it works well. I'm going to run more tests and will report back later.

qqletter commented 6 months ago

@gibsondan I've found the cause: I set the timezone in the Dockerfile the wrong way, as below:

RUN cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \
&& "Asia/Shanghai" > /etc/timezone

I now use 'ENV TZ="Asia/Shanghai"' instead, and the errors are gone.

gibsondan commented 6 months ago

oh nice, good catch! I think it's likely that that confused the 'croniter' package that we use to compute cronstrings.

boenshao commented 6 months ago

Just encountered the same issue. We have a job scheduled in the Asia/Taipei timezone as follows.

@schedule(
    job=data_quality_check_job,
    cron_schedule="50 16 * * *",
    execution_timezone="Asia/Taipei",
    default_status=DefaultScheduleStatus.RUNNING,
)

We host dagster-webserver and dagster-daemon with a docker-compose setup similar to the example below.

  dagster-webserver:
    image: dagster:1.6.9
    environment:
      <<: *dagster-envs
    command: dagster-webserver -h 0.0.0.0 -p 3000
    ports:
      - "3000:3000"

  dagster-daemon:
    image: dagster:1.6.9
    environment:
      <<: *dagster-envs
    command: dagster-daemon run

On our machine, both the host and the containers appear to use the same timezone setting, as checked with the date command. Note that CST here means China Standard Time (or Chungyuan Standard Time), the zone for Asia/Taipei.

xxx@host:~$ date
Wed 13 Mar 2024 01:04:14 AM CST
xxx@host:~$ docker exec -it dagster_dagster-daemon_1 sh
# date
Wed Mar 13 01:04:26 CST 2024
#
xxx@host:~$ docker exec -it dagster_dagster-webserver_1 sh
# date
Wed Mar 13 01:04:34 CST 2024
#

With the setup above, we encountered the reported issue.

The fix is to add TZ=Asia/Taipei to the containers:

  dagster-webserver:
    image: dagster:1.6.9
    environment:
      <<: *dagster-envs
      TZ: "Asia/Taipei"
    command: dagster-webserver -h 0.0.0.0 -p 3000
    ports:
      - "3000:3000"

  dagster-daemon:
    image: dagster:1.6.9
    environment:
      <<: *dagster-envs
      TZ: "Asia/Taipei"
    command: dagster-daemon run
gibsondan commented 6 months ago

@boenshao thanks for the additional context - do you happen to know what was determining the timezone on those containers before you explicitly set the TZ env var? I'd like to see if I can reproduce this.

boenshao commented 6 months ago

I just double-checked, we actually had mounted the host /etc/localtime into the container!

  dagster-webserver:
    image: dagster:1.6.9
    volumes:
      - /etc/localtime:/etc/localtime:ro
    command: dagster-webserver -h 0.0.0.0 -p 3000
    ports:
      - "3000:3000"

  dagster-daemon:
    image: dagster:1.6.9
    volumes:
      - /etc/localtime:/etc/localtime:ro
    command: dagster-daemon run

After I removed the mount, the timezone in the container changed to UTC:

xxx@host:~$ docker exec -it dagster_dagster-daemon_1 sh
# date
Tue Mar 12 17:58:39 UTC 2024

And the scheduler runs fine without TZ=Asia/Taipei.
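
For anyone debugging a similar setup, the three settings that came up in this thread (the TZ environment variable, /etc/localtime, and /etc/timezone) can be dumped from inside the container with a short script. This is only a sketch using the Python standard library; the function name timezone_sources and the keys of the returned dict are made up for illustration and are not part of Dagster.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def timezone_sources(
    environ: Mapping[str, str] = os.environ,
    localtime: Path = Path("/etc/localtime"),
    timezone_file: Path = Path("/etc/timezone"),
) -> dict:
    """Collect the settings that can influence a container's local timezone.

    Hypothetical helper for illustration only; the paths are the conventional
    Linux locations mentioned in the thread.
    """
    tz_name: Optional[str] = None
    if timezone_file.is_file():
        # /etc/timezone normally holds a single zone name such as "Asia/Shanghai".
        tz_name = timezone_file.read_text().strip() or None
    return {
        "TZ": environ.get("TZ"),
        "localtime_exists": localtime.exists(),
        # On many systems /etc/localtime is a symlink into /usr/share/zoneinfo;
        # a copied or bind-mounted file has no link target to report.
        "localtime_target": os.path.realpath(localtime) if localtime.is_symlink() else None,
        "etc_timezone": tz_name,
    }


if __name__ == "__main__":
    for key, value in timezone_sources().items():
        print(f"{key}: {value}")
```

Running it in both a failing and a working container and comparing the output should show which of the three settings differs.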

gibsondan commented 6 months ago

I'm starting to suspect this may be a bug with pendulum (the datetime library that we use)

I tried adding this to the Dockerfile (as per the original report):

RUN cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \
&& echo "Asia/Shanghai" > /etc/timezone

and then ran this code:

>>> str(pendulum.now(tz="Asia/Shanghai"))
'2024-03-13 02:06:09.182025+08:00'
>>> str(pendulum.from_timestamp(time.time(), tz="Asia/Shanghai"))
'2024-03-12 18:06:17.857420+08:00'

whereas without adding that to the dockerfile, they produce the same result, as expected:

>>> str(pendulum.now(tz="Asia/Shanghai"))
'2024-03-13T02:06:24.118852+08:00'
>>> str(pendulum.from_timestamp(time.time(), tz="Asia/Shanghai"))
'2024-03-13T02:06:26.099418+08:00'
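
The mismatch above can be turned into an automated check. A rough sketch, assuming the symptom is that converting a Unix timestamp to a timezone-aware datetime lands on a different instant: round_trip_error is a hypothetical helper (not part of Dagster or pendulum) that converts a timestamp with whatever function you pass in and measures how far the result drifts from the input. The two converters below are standard-library stand-ins, one healthy and one that deliberately imitates the 8-hour shift.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable

# Fixed +08:00 offset used as a stand-in for Asia/Shanghai (an assumption made
# so the sketch does not depend on the system tz database).
SHANGHAI = timezone(timedelta(hours=8))


def round_trip_error(epoch_seconds: int, convert: Callable[[int], datetime]) -> float:
    """Return how many seconds `convert` moves the instant (0.0 when healthy)."""
    return convert(epoch_seconds).timestamp() - epoch_seconds


def healthy(ts: int) -> datetime:
    # Standard-library conversion, used here as the reference behaviour.
    return datetime.fromtimestamp(ts, SHANGHAI)


def shifted(ts: int) -> datetime:
    # Imitates the symptom in the transcript: correct offset, wrong instant.
    return datetime.fromtimestamp(ts - 8 * 3600, SHANGHAI)


if __name__ == "__main__":
    ts = 1710266769  # arbitrary timestamp for the demo
    print(round_trip_error(ts, healthy))
    print(round_trip_error(ts, shifted))
```

To test the real library, pass something like lambda ts: pendulum.from_timestamp(ts, tz="Asia/Shanghai") as the converter. A non-zero result would also explain the failed invariant in the stack trace, which requires the next tick's timestamp to be at or after the start timestamp.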
gibsondan commented 6 months ago

It also looks like this broke in the 3.0.0 pendulum release. I'll file an issue, thanks for the report!

gibsondan commented 6 months ago

https://stackoverflow.com/questions/74467999/why-does-zoneinfoutc-do-different-time-conversions-from-timezone-utc

It looks like pendulum is affected by this issue because, in their 3.0 release, they changed their Timezone object to subclass zoneinfo.ZoneInfo. If you modify /etc/timezone (or, I guess, /etc/localtime), the UTC timezone ends up with a non-zero offset (the same as whatever timezone you specified), which breaks all kinds of things.
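
If that explanation is right, a quick way to tell whether a given container is affected is to ask the zoneinfo-backed UTC zone for its offset and confirm it is zero. The sketch below uses only the standard library (zoneinfo needs Python 3.9+); utc_offset_seconds is a name invented for this example, and whether it reproduces the exact failure depends on how the image's timezone files were modified.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def utc_offset_seconds(tz: tzinfo, epoch_seconds: int = 0) -> float:
    """Return the offset from UTC, in seconds, that `tz` reports at an instant."""
    offset = datetime.fromtimestamp(epoch_seconds, tz).utcoffset()
    if offset is None:
        raise ValueError(f"{tz!r} did not report a UTC offset")
    return offset.total_seconds()


if __name__ == "__main__":
    from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

    try:
        # A correctly configured system should report 0.0 here.
        offset = utc_offset_seconds(ZoneInfo("UTC"))
    except ZoneInfoNotFoundError:
        print("No tz database entry found for 'UTC'")
    else:
        print("ZoneInfo('UTC') offset in seconds:", offset)
```

Anything other than 0.0 from the final print would point at the container's timezone files rather than at the schedule definitions.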