Parsl / parsl

Parsl - a Python parallel scripting library
http://parsl-project.org
Apache License 2.0

move_files abstraction in #683 and #707 is not the right abstraction #708

Open benclifford opened 5 years ago

benclifford commented 5 years ago

#683 and #707 provide a move_files parameter that stops ssh from being used to copy a file over the top of itself.

this abstraction feels wrong to me.

it stems from an assumption that something submitted over a "remote" channel is also somehow on a "remote" filesystem (remote deliberately in "quotes").

in the cases that needed #683 and #707 this was not the case: job execution was "remote" but file access was "local" due to a shared file system.

annawoodard commented 5 years ago

At the moment move_files is a keyword arg to the local provider and the slurm provider. Shouldn't it at least be consistent across all providers?

I got burned by this recently while preparing an example of an ad-hoc config for our docs. I was testing a case with a shared filesystem and had forgotten to disable move_files. If all we're trying to solve is not copying a file on top of itself, would it be sufficient to check whether the device number and inode number of the file match on the 'local' and 'remote' sides? If so, we could remove the 'move_files' keyword arg and reduce the number of options the user needs to configure correctly.
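
A minimal sketch of that check, assuming each side can run `os.stat` on its own view of the path and report the result back. The helper names `file_identity` and `is_same_file` are hypothetical and not part of Parsl. One caveat: device numbers are assigned by each host, so on a network filesystem the two sides may report different `st_dev` values for the same file, and the comparison would need testing on the target filesystem before relying on it.

```python
import os


def file_identity(path):
    """Return the (device number, inode number) pair for path.

    Hypothetical helper: it would be run once on the 'local' side and
    once on the 'remote' side, each against its own view of the file.
    """
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def is_same_file(local_identity, remote_identity):
    """Return True when both sides report the same device and inode.

    Assumption: matching pairs mean the copy would overwrite the file
    with itself, so the transfer can be skipped.
    """
    return local_identity == remote_identity
```

When both paths are visible from a single host, the standard library's `os.path.samefile(a, b)` performs the same device and inode comparison directly.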