dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

Multiprocess executor: child process for step <op name> was terminated by signal 6 (SIGABRT) in local dev environment #19149

Open bobevenup opened 8 months ago

bobevenup commented 8 months ago

Dagster version

1.4.4

What's the issue?

The same Dagster version was working previously but stopped working recently. Every time I try to start a job, it fails immediately with the following error:

Multiprocess executor: child process for step <op name> was terminated by signal 6 (SIGABRT).
dagster._core.executor.child_process_executor.ChildProcessCrashException

Stack Trace:
  File "/code//venv3.11/lib/python3.11/site-packages/dagster/_core/executor/multiprocess.py", line 253, in execute
    event_or_none = next(step_iter)
                    ^^^^^^^^^^^^^^^
  File "/code//venv3.11/lib/python3.11/site-packages/dagster/_core/executor/multiprocess.py", line 363, in execute_step_out_of_process
    for ret in execute_child_process_command(multiproc_ctx, command):
  File "/code//venv3.11/lib/python3.11/site-packages/dagster/_core/executor/child_process_executor.py", line 174, in execute_child_process_command
    raise ChildProcessCrashException(exit_code=process.exitcode)
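
For context on the error message: the last frame passes `process.exitcode` to the exception, and on POSIX systems Python reports a child killed by signal N as exit code -N, so SIGABRT shows up as -6. The sketch below is not Dagster code. It uses the standard-library `subprocess` module, which follows the same convention, and the helper names `exit_signal` and `run_aborting_child` are hypothetical. It assumes a POSIX system such as macOS or Linux.

```python
import signal
import subprocess
import sys


def exit_signal(returncode):
    """Return the signal that killed a child, or None if it exited on its own."""
    if returncode is not None and returncode < 0:
        # A negative exit code -N means "terminated by signal N" on POSIX.
        return signal.Signals(-returncode)
    return None


def run_aborting_child():
    """Start a child Python process that aborts itself; return its exit code."""
    child = subprocess.run(
        [sys.executable, "-c", "import os; os.abort()"],
        stderr=subprocess.DEVNULL,
    )
    return child.returncode


if __name__ == "__main__":
    code = run_aborting_child()
    sig = exit_signal(code)
    print(code, sig.name if sig else "exited normally")
```

A check like this only confirms how the exit code is decoded. It says nothing about why the child aborted, which is what the OS crash report is for.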

A simpler job runs fine, but more complicated jobs fail. The crash happens for every op at the beginning of a job and is accompanied by this dialog from the OS:

[Screenshot 2024-01-10 at 10:12:08 AM]

The detailed report from the dialog is attached: error report.log

What did you expect to happen?

Job should run successfully

How to reproduce?

I start Dagster with the following command:

dagster dev -p 3001 -w /Users/bobbui/Documents/evenup/source/document-extraction-pipelines/src/workspace.yaml

Deployment type

Local

Deployment details

My environment: MacBook M2, macOS Sonoma 14.2.1, Python 3.9.15

Additional information

I tried many different fixes to no avail; the same error still occurred:

I'd really appreciate any help fixing this, because I literally can't work.

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization.

sudowoodo200 commented 2 months ago

Commenting to bump this up again. I think dynamic graphs don't work because of this. Here's a minimal repro. There is no way this is an OOM issue. It's also very flaky and crashes at inconsistent places.

import time

from dagster import DynamicOut, DynamicOutput, Output, job, op


# Fan out four dynamic outputs, pausing one second after each.
@op(out=DynamicOut(int), tags={"dagster/priority": 0})
def start():
    for i in range(4):
        yield DynamicOutput(i, mapping_key=str(i))
        time.sleep(1)


@op(tags={"dagster/priority": 2})
def square(x: int):
    return Output(x**2, metadata={"squared": x**2})


@op(tags={"dagster/priority": 2})
def cube(x: int):
    return Output(x**3, metadata={"cubed": x**3})


@op(tags={"dagster/priority": 2})
def combine(square: int, cube: int):
    return Output(square + cube, metadata={"sum": square + cube})


# Run with the multiprocess executor, allowing up to 8 concurrent steps.
@job(config={"execution": {"config": {"multiprocess": {"max_concurrent": 8}}}})
def run():
    # Each mapped value goes through square and cube, then combine.
    fn = lambda x: combine(square(x), cube(x))
    output = start().map(fn).collect()
    # This print runs when the job is defined, not when it executes.
    print("Results", output)

Update: This was actually the same bug as https://github.com/dagster-io/dagster/issues/15947. It may not be exactly what the OP experienced.