vatlab / sos

SoS workflow system for daily data analysis
http://vatlab.github.io/sos-docs
BSD 3-Clause "New" or "Revised" License

Requiring shared directory for remote execution of tasks #1535

Closed BoPeng closed 7 months ago

BoPeng commented 8 months ago

Currently the sos task execution system is complicated because it needs to copy files around whenever they reside in directories that are not shared between the job submission machine and the head node. A lot of work, such as signature checking, directory mapping, and file synchronization, has been done to enable a powerful yet complicated and confusing system that can easily break.

I propose that we remove the file mapping feature entirely and require file systems to be shared. In this way, input and output files have to exist at the same paths on both the local and remote file systems, and there is no need to map directories or copy files around. The system will be much easier to understand and configure, and a lot more robust.

gaow commented 8 months ago

Agreed, considering that most of our use cases so far are the simpler ones, either completely local or completely remote.

BoPeng commented 8 months ago

One problem with this approach is that the $HOME directory can differ between systems (although there can be shared mounts). It is then necessary to allow the tasks directory, currently ~/.sos/tasks, to be set to a directory that is shared. Because it can be a bad idea to mix jobs executed on different hosts, the tasks directory should preferably be host-specific.

For backward compatibility, let us keep tasks in the home directory. In this way, sos installed on remote servers does not need to be updated.
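
To illustrate the trade-off, here is a minimal sketch of how a tasks directory could be resolved: default to the home directory, with an optional host-specific location on a shared mount. The function name `resolve_tasks_dir` and the `shared_tasks_root` parameter are hypothetical and are not part of the sos API or configuration.

```python
import os


def resolve_tasks_dir(host, shared_tasks_root=None):
    """Return the directory in which task files for `host` would be kept.

    Hypothetical helper, not an sos function. Without a shared root the
    backward-compatible location ~/.sos/tasks is used; with one, each host
    gets its own subdirectory so jobs from different hosts are not mixed.
    """
    if shared_tasks_root is None:
        return os.path.join(os.path.expanduser("~"), ".sos", "tasks")
    return os.path.join(os.path.expanduser(shared_tasks_root), host)
```

For example, `resolve_tasks_dir("cluster1", "/shared/sos_tasks")` gives a `cluster1` subdirectory under the shared root, while `resolve_tasks_dir("cluster1")` falls back to the home directory.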

We need to

  1. Confirm that input files are on one of the shared directories.
  2. Confirm that workdir, etc. are on one of the shared directories.
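
The two checks above could be sketched roughly as follows. This is only an illustration, not the code that was released: the helper names `is_on_shared_dir` and `check_task_paths` are hypothetical, and the list of shared directories is assumed to come from the host configuration.

```python
import os
from pathlib import Path


def is_on_shared_dir(path, shared_dirs):
    """Return True if `path` equals or lies inside one of `shared_dirs`.

    Paths are made absolute and `~` is expanded; symlinks are NOT resolved
    in this sketch.
    """
    target = Path(os.path.abspath(os.path.expanduser(path)))
    for shared in shared_dirs:
        root = Path(os.path.abspath(os.path.expanduser(shared)))
        # Compare whole path components so /database does not match /data.
        if target == root or root in target.parents:
            return True
    return False


def check_task_paths(input_files, workdir, shared_dirs):
    """Return the input files and workdir that are on no shared directory."""
    candidates = [*input_files, workdir]
    return [p for p in candidates if not is_on_shared_dir(p, shared_dirs)]
```

A task would be rejected (or the user warned) whenever `check_task_paths` returns a non-empty list.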

BoPeng commented 7 months ago

Done, with a new version of sos released. #1536