Parsl / parsl

Parsl - a Python parallel scripting library
http://parsl-project.org
Apache License 2.0
503 stars 195 forks source link

Allowing for `inputs` and `outputs` to take immutable types #2925

Closed Andrew-S-Rosen closed 1 year ago

Andrew-S-Rosen commented 1 year ago

In looking at the Parsl docs, I see that in the Passing Data section of the BashApps, there are various functions defined with mutable kwargs, e.g. inputs=[]. Generally, this is undesirable and the Parsl docs should probably avoid promoting questionable programming practices.

In discussing this with @svandenhaute here, it seems that it's possible that Parsl is expecting type list for the inputs and outputs with regards to file passing. If that's the reason for the above behavior, perhaps there might be a way to modify things such that if None is supplied, e.g. inputs=None, then Parsl assumes this is equivalent to {}.

Is my understanding of this correct? I have never used this feature. @WardLT, this is in the section of the docs you're updating so I figured I'd tag you on this one.

WardLT commented 1 year ago

I am also nervous about lists appearing as default arguments for two reasons: subsequent functions will have undefined behavior, and parsl will not transmit changes in inputs back to the user.

@benclifford : I'm guessing mypy doesn't care about list vs tuple in this context? I might change every instance of the list to a tuple to address the first problem.

I'll add "don't expect mutable args to be mutated" into #2919 for the second problem

WardLT commented 1 year ago

Alright, I updated the documentation about mutable args. Thanks for pointing this out, @Andrew-S-Rosen !

Andrew-S-Rosen commented 1 year ago

Thanks for the clarifications, @WardLT!

benclifford commented 1 year ago

@WardLT there's a type: Sequence which you'll see in some places, that means "a thing we can iterate over but not mutate" - that's my preferred type for expressing this. It will help parsl catch code that attempts to mutate even a default empty list, for example. Both list and tuples can be used in this kind of Sequence argument context.

There's also a flake8 plugin that gives errors about mutable defaults - check out the draft flake8 bugbear PR in parsl GitHub that I have open as a low priority backburner.

@khk-globus has also made some changes littered around the codebase to move empty list default literals to empty tuple default tuples, and I think that's also a good thing to do when we're using a Sequence type.