dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

grpcio@1.65 breaks `dagster dev` #22980

Closed: leejlFG closed this issue 3 months ago

leejlFG commented 3 months ago

Dagster version

1.7.12

What's the issue?

The latest grpcio release, 1.65.0, breaks the `dagster dev` command.

What did you expect to happen?

To have things spin up normally.

How to reproduce?

Upgrade grpcio to version 1.65.0, run `dagster dev`, and watch the errors pile up.

Deployment type

Local

Deployment details

No response

Additional information

2024-07-11 14:40:36 -0500 - dagster - INFO - Launching Dagster services...
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1720726837.548077 12799771 config.cc:230] gRPC experiments enabled: call_status_override_on_cancellation, event_engine_dns, event_engine_listener, http2_stats_fix, monitoring_experiment, pick_first_new, trace_record_callops, work_serializer_clears_time_cache, work_serializer_dispatch
I0000 00:00:1720726837.550711 12799855 subchannel.cc:806] subchannel 0x10998cc50 {address=unix:/var/folders/qr/34lg_mxd51b2tkmz2sct0w4w0000gp/T/tmp1ak_wbwd, args={grpc.client_channel_factory=0x60000345a610, grpc.default_authority=var%2Ffolders%2Fqr%2F34lg_mxd51b2tkmz2sct0w4w0000gp%2FT%2Ftmp1ak_wbwd, grpc.default_compression_algorithm=2, grpc.internal.channel_credentials=0x60000345a680, grpc.internal.client_channel_call_destination=0x10ad875d8, grpc.internal.event_engine=0x600003451660, grpc.internal.security_connector=0x60000236d380, grpc.internal.subchannel_pool=0x600001d44000, grpc.max_receive_message_length=50000000, grpc.max_send_message_length=50000000, grpc.primary_user_agent=grpc-python/1.65.0, grpc.resource_quota=0x600003847990, grpc.server_uri=unix:/var/folders/qr/34lg_mxd51b2tkmz2sct0w4w0000gp/T/tmp1ak_wbwd}}: connect failed (UNKNOWN:connect: No such file or directory (2) {created_time:"2024-07-11T14:40:37.549418-05:00"}), backing off for 1000 ms
I0000 00:00:1720726837.655261 12799861 subchannel.cc:806] subchannel 0x109994360 {address=unix:/var/folders/qr/34lg_mxd51b2tkmz2sct0w4w0000gp/T/tmp1ak_wbwd, args={grpc.client_channel_factory=0x60000345a610, grpc.default_authority=var%2Ffolders%2Fqr%2F34lg_mxd51b2tkmz2sct0w4w0000gp%2FT%2Ftmp1ak_wbwd, grpc.default_compression_algorithm=2, grpc.internal.channel_credentials=0x60000345a680, grpc.internal.client_channel_call_destination=0x10ad875d8, grpc.internal.event_engine=0x6000034515a0, grpc.internal.security_connector=0x60000236d700, grpc.internal.subchannel_pool=0x600001d44000, grpc.max_receive_message_length=50000000, grpc.max_send_message_length=50000000, grpc.primary_user_agent=grpc-python/1.65.0, grpc.resource_quota=0x600003847990, grpc.server_uri=unix:/var/folders/qr/34lg_mxd51b2tkmz2sct0w4w0000gp/T/tmp1ak_wbwd}}: connect failed (UNKNOWN:connect: No such file or directory (2) {created_time:"2024-07-11T14:40:37.655207-05:00"}), backing off for 1000 ms

These messages keep printing in the terminal, interspersed with:

I0000 00:00:1720726935.787034 12800685 tcp_posix.cc:809] IOMGR endpoint shutdown
I0000 00:00:1720726935.787079 12813667 work_stealing_thread_pool.cc:269] WorkStealingThreadPoolImpl::Quiesce

until I force-quit the terminal.
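For anyone who needs readable output in the meantime, here is a rough sketch of filtering these lines out of captured stderr. It assumes the noisy lines all follow the prefix shape seen in the log above (`I0000 00:00:<epoch> <thread id> <file>:<line>]`); the names `ABSL_INFO_LINE` and `drop_absl_info` are made up for this example and are not part of dagster or grpcio.

```python
import re

# Matches the INFO-level prefix seen in the log above, e.g.
# "I0000 00:00:1720726935.787034 12800685 tcp_posix.cc:809] ..."
# Assumption: every noisy line follows this exact shape.
ABSL_INFO_LINE = re.compile(r"^I\d{4} \d{2}:\d{2}:\d+\.\d+\s+\d+ \S+:\d+\] ")


def drop_absl_info(lines):
    """Return only the lines that do not look like gRPC INFO log lines."""
    return [line for line in lines if not ABSL_INFO_LINE.match(line)]
```

This only hides the symptom in captured output; it does not stop grpcio from emitting the messages.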

tenhaus commented 3 months ago

I also get TypeError: ForwardRef._evaluate() missing 1 required keyword-only argument: 'recursive_guard' when creating a new asset.

PedramNavid commented 3 months ago

Hi @leejlFG, thanks for reporting this. While we investigate, I would recommend installing an older version of grpcio: `pip install --upgrade 'grpcio<1.65'` appears to resolve this on my machine.
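To check whether an environment is on an affected grpcio version, something like the sketch below may help. It assumes the affected range is 1.65.0 up to (but not including) 1.65.4, based only on reports in this thread; the versions in between are a conservative guess. The helper names (`parse_release`, `is_affected`, `installed_grpcio_version`) are hypothetical.

```python
from importlib import metadata
from typing import Optional, Tuple

# Assumption: this thread reports 1.65.0 as broken and 1.65.4 as working.
# Treating 1.65.1 through 1.65.3 as affected is a conservative guess.
FIRST_BAD = (1, 65, 0)
FIRST_GOOD = (1, 65, 4)


def parse_release(version: str) -> Tuple[int, int, int]:
    """Parse the leading 'X.Y.Z' integers, ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return (parts[0], parts[1], parts[2])


def is_affected(version: str) -> bool:
    """True if the version falls in the assumed affected range."""
    return FIRST_BAD <= parse_release(version) < FIRST_GOOD


def installed_grpcio_version() -> Optional[str]:
    """Return the installed grpcio version, or None if it is not installed."""
    try:
        return metadata.version("grpcio")
    except metadata.PackageNotFoundError:
        return None
```

Calling `is_affected(installed_grpcio_version() or "0")` in a startup script would flag an environment that still needs the pin.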

cbini commented 3 months ago

PS this also appears to break schedules:

TypeError: ForwardRef._evaluate() missing 1 required keyword-only argument: 'recursive_guard'

  File "/app/.venv/lib/python3.12/site-packages/dagster/_grpc/impl.py", line 533, in get_external_execution_plan_snapshot
    create_execution_plan(
  File "/app/.venv/lib/python3.12/site-packages/dagster/_core/execution/api.py", line 719, in create_execution_plan
    return ExecutionPlan.build(
           ^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/dagster/_core/execution/plan/plan.py", line 1067, in build
    ).build()
      ^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/dagster/_core/execution/plan/plan.py", line 182, in build
    self._build_from_sorted_nodes(
  File "/app/.venv/lib/python3.12/site-packages/dagster/_core/execution/plan/plan.py", line 307, in _build_from_sorted_nodes
    StepInput(
  File "/app/.venv/lib/python3.12/site-packages/dagster/_record/__init__.py", line 331, in __call__
    self._build_checked_new_str(),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/dagster/_record/__init__.py", line 342, in _build_checked_new_str
    call_str = build_check_call_str(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/dagster/_check/builder.py", line 219, in build_check_call_str
    inst_type = _coerce_type(ttype, eval_ctx)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/dagster/_check/builder.py", line 131, in _coerce_type
    return eval_ctx.eval_forward_ref(ForwardRef(ttype))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/dagster/_check/builder.py", line 99, in eval_forward_ref
    return ref._evaluate(self.get_merged_ns(), {}, frozenset())  # noqa
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

alangenfeld commented 3 months ago

@cbini @tenhaus that issue is separate from this grpcio@1.65 problem https://github.com/dagster-io/dagster/issues/22985

tenhaus commented 3 months ago

@cbini @tenhaus that issue is separate from this grpcio@1.65 problem #22985

You're right. I was about to edit the comment but got distracted by a dagster account executive adding me on LinkedIn. SHARKS! lol

garethbrickman commented 3 months ago

This looks to be root caused by a grpcio-specific bug in v1.65: https://github.com/grpc/grpc/issues/37178

As a result, they have yanked their release of v1.65: https://pypi.org/project/grpcio/1.65.0/

LaurentiuStancioiu commented 3 months ago

Could it be that this error, TypeError: ForwardRef._evaluate() missing 1 required keyword-only argument: 'recursive_guard', is an incompatibility between Python 3.12.4 and pydantic v1 (taken from here: https://github.com/langchain-ai/langgraph/issues/639)? For example, I downgraded to Python 3.12.3 and everything works. I also had grpcio 1.64.1.

gibsondan commented 3 months ago

@LaurentiuStancioiu that ForwardRef error is being tracked here: https://github.com/dagster-io/dagster/issues/22985. We have a fix going out in the next dagster release.
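For context on that separate error: the traceback above shows the private `ForwardRef._evaluate` being called with three positional arguments, and the error message says `recursive_guard` is now keyword-only. Below is a minimal sketch of resolving a string annotation through the public `typing.get_type_hints` API, which avoids depending on that private signature. This is not the fix that went into dagster; `resolve_annotation` is a hypothetical helper written for illustration.

```python
import typing


def resolve_annotation(annotation: str, namespace: dict) -> object:
    """Resolve a string annotation using only the public typing API."""

    def _holder():
        # Throwaway function that only carries the annotation.
        pass

    _holder.__annotations__ = {"value": annotation}
    # Copy the namespace so evaluation cannot mutate the caller's dict.
    hints = typing.get_type_hints(_holder, globalns=dict(namespace))
    return hints["value"]
```

For example, `resolve_annotation("List[int]", {"List": typing.List})` resolves the string against the supplied namespace without touching `ForwardRef` internals directly.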

LaurentiuStancioiu commented 3 months ago

@gibsondan My bad, I hadn't seen the reference just above! Thank you!

leo-schmidt commented 2 months ago

This looks to be root caused by a grpcio-specific bug in v1.65: grpc/grpc#37178

As a result, they have yanked their release of v1.65: https://pypi.org/project/grpcio/1.65.0/

Upgrading to grpcio 1.65.4 has removed the excessive logging for me. Other people in the grpc thread linked above have said the same, so that seems like a good solution for now.

gibsondan commented 2 months ago

Thanks for the heads-up. I can reproduce that as well, so we'll go ahead and remove the pin now: https://github.com/dagster-io/dagster/pull/23645