Parsl / parsl

Parsl - a Python parallel scripting library
http://parsl-project.org
Apache License 2.0

Rate Limiting for Invoking `qstat` or related commands #3035

Open WardLT opened 10 months ago

WardLT commented 10 months ago

Is your feature request related to a problem? Please describe. Under some system conditions, Parsl can invoke `qstat` far too frequently, which causes issues for other users of the HPC cluster. I'm not sure what caused it, but I recently observed Parsl issuing a few hundred `qstat` calls per second with the PBSProProvider.

Describe the solution you'd like The ability to limit qstat calls to a specific rate.
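To make the request concrete, here is a minimal sketch of the kind of limit I have in mind. It is not Parsl code: `RateLimitedPoller`, `min_interval`, and the injectable `clock` are hypothetical names used only for illustration. The idea is to run the status command at most once per interval and serve the cached result in between.

```python
import time
from typing import Callable, Optional


class RateLimitedPoller:
    """Hypothetical helper, not part of Parsl.

    Calls `poll` at most once every `min_interval` seconds and returns
    the cached result for any request made inside that window.
    """

    def __init__(
        self,
        poll: Callable[[], str],
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._poll = poll
        self._min_interval = min_interval
        self._clock = clock  # injectable so the behaviour can be tested
        self._last_time: Optional[float] = None
        self._last_result: Optional[str] = None
        self.calls = 0  # number of times `poll` was actually invoked

    def status(self) -> str:
        now = self._clock()
        # Only hit the scheduler if we have never polled, or if the
        # minimum interval has elapsed since the last real poll.
        if self._last_time is None or now - self._last_time >= self._min_interval:
            self._last_result = self._poll()
            self._last_time = now
            self.calls += 1
        return self._last_result
```

In use, `poll` could be something like `lambda: subprocess.run(["qstat"], capture_output=True, text=True).stdout` with `min_interval=30`, so that no matter how often `status()` is called, `qstat` runs at most twice a minute. The trade-off is that callers may see status that is up to `min_interval` seconds stale.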

Describe alternatives you've considered None.

Additional context PBSProProvider with HTEx, running tasks in a "keep a constant number of tasks in queue" modality.

Parsl: 2023.12.18

benclifford commented 10 months ago

After some investigation with Logan and @cms21, it's not clear that Parsl is actually invoking `qstat` very frequently: the evidence so far is of network traffic, not of command-line invocations. I gave @WardLT a branch of Parsl with tighter logging around localchannel `execute_wait`, which I hope will give a clearer view of PBS command invocations at the command-line level, so they can be correlated with the network API logs.
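For anyone who wants to do something similar outside that branch, the general shape is roughly the sketch below. This is not the code on the branch and not Parsl's `execute_wait`: `logged_execute` and the `cmd_trace` logger name are made up for illustration, and the `(returncode, stdout, stderr)` return shape is an assumption. It just timestamps every command invocation so the log can be lined up against network-side logs.

```python
import logging
import subprocess
import time
from typing import List, Tuple

# Hypothetical logger name, chosen for this example only.
logger = logging.getLogger("cmd_trace")


def logged_execute(cmd: List[str], timeout: float = 60.0) -> Tuple[int, str, str]:
    """Run `cmd`, logging when it starts, when it ends, and how long it took.

    Returns (returncode, stdout, stderr).
    """
    start = time.monotonic()
    logger.info("starting command: %s", cmd)
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    elapsed = time.monotonic() - start
    logger.info(
        "finished command: %s rc=%d elapsed=%.3fs", cmd, proc.returncode, elapsed
    )
    return proc.returncode, proc.stdout, proc.stderr
```

With `logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")` set up, counting the "starting command" lines per second for `qstat` gives a direct measure of the command-line invocation rate.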

WardLT commented 10 months ago

Thanks, @benclifford. I'll keep you posted if I encounter network traffic issues again.