Open egnor opened 1 year ago
Thanks for debugging and reporting this! I wanted to confirm whether ShellOperation
works for you?
Also, I don't think I was able to reproduce using:
other.py
from prefect_shell import shell_run_command
from prefect import flow
@flow
def test_flow():
shell_run_command("python test.py")
test_flow()
test.py
#!/usr/bin/env python3
import sys
for i in range(10000):
print("*" * 40, i, file=sys.stderr)
python other.py
ShellOperation
definitely works and so we moved to using it.
I'll see if I can make a contained repro case for you...! Might be some other key ingredient. (@sckott FYI)
Thanks! I think we may be able to refactor the internals of shell_run_command to use ShellOperation block eventually.
Might be some other key ingredient.
Nothing I can think of. Thx for opening the issue @egnor
I think we may be able to refactor the internals of shell_run_command to use ShellOperation block eventually
This seems like a good solution
I have had the same issue - however I am not sure if this has actually something to do with printing to stderr or if it is just because of printing or if it is a result of just a long running process itself.
I tried building a minimal example that also includes (the probably weird) way I set up the agent: minimal-example-prefect.zip
As in the previous responses ShellOperation
seems to run better.
Thanks for debugging and reporting this! I wanted to confirm whether
ShellOperation
works for you? Also, I don't think I was able to reproduce using: ...
When I try your example @ahuang11, exactly as you write it, it does hang here:
% python3 other.py
21:56:18.489 | INFO | prefect.engine - Created flow run 'valiant-cormorant' for flow 'test-flow'
21:56:18.493 | INFO | Flow run 'valiant-cormorant' - View at https://app.prefect.cloud/account/33d360a7-efaa-46ce-98ac-22df4011d411/workspace/2edd22be-f53a-4d0e-93dc-66e772f636a1/flow-runs/flow-run/a51901a4-55a0-475d-ba79-3a41af44a22c
21:56:19.512 | INFO | Flow run 'valiant-cormorant' - Created task run 'shell_run_command-0' for task 'shell_run_command'
21:56:19.515 | INFO | Flow run 'valiant-cormorant' - Executing 'shell_run_command-0' immediately...
I'm not sure why it's not hanging for you @ahuang11 -- perhaps on your system, shell pipes have unbounded length? I'm running on linux (Ubuntu 22.04, Linux 5.19.0-43-generic).
Expectation / Proposal
The expectation is that
shell_run_command
should be able to run any reasonable process.Traceback / Example
To reproduce, use
shell_run_command
to invoke a simple script like thisAt least in our experience, the command (and thus the
shell_run_command
function) blocks forever, which is unfortunate and undesired. I believe the reason is this logic inshell_run_command
: https://github.com/PrefectHQ/prefect-shell/blob/1d89814710c35cf5f12832d9188f1d8d7c59b048/prefect_shell/commands.py#L97In summary, the logic is
anyio.open_process
)stdout
output (and log it) until EOFstderr
for the error messageSince stdout and stderr flow through pipes of finite size, that's a real problem. In that main read loop, nothing is reading the process's stderr pipe, which means after a little bit of stderr writing the process will block waiting for the pipe. Then
shell_run_command
will block waiting for the process to close its stdout, which will never happen, and the whole thing deadlocks.(One thing I don't understand:
anyio.open_process
is called withoutstdout
orstderr
specified, so I'd expect them not to be redirected at all, in which case.stdout
and.stderr
should beNone
and the whole thing should always fail, but clearly it doesn't, so I guess somehow they are set up to be pipes?? Certainly when running on the system they seem to be pipes.)A workaround is to use
ShellOperation
instead, which reads and logs both stdout and stderr concurrently.