fermitools / jobsub_lite

jobsub_lite is a wrapper for HTCondor job submission
Apache License 2.0

workdir is not TMPDIR #542

Open gaponenko opened 8 months ago

gaponenko commented 8 months ago

Hello,

jobsub_submit places its temporary files in the current working directory. This introduces an unnecessary restriction: the submission directory has to be writable and has to have sufficient free space. The files are also routinely left behind when the submission process is interrupted, which is another annoyance. Please use mkstemp() or an equivalent so that temporary files end up in TMPDIR or /tmp.

Andrei
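
A minimal sketch of what this request amounts to, using Python's standard `tempfile` module. It is not taken from jobsub_lite; the function name `write_scratch_file` and the `jobsub_example_` prefix are hypothetical. `tempfile.mkstemp()` creates the file in the directory named by TMPDIR when that is set and usable, and otherwise falls back to a platform default such as /tmp. The `except` branch removes the file if writing is interrupted, which addresses the leftover-file complaint.

```python
import os
import tempfile


def write_scratch_file(data: bytes, suffix: str = ".tar") -> str:
    """Write data to a new temporary file and return its path.

    Hypothetical helper, not part of jobsub_lite. The file lands in the
    directory chosen by tempfile (TMPDIR if set, else a default like /tmp).
    """
    fd, path = tempfile.mkstemp(prefix="jobsub_example_", suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
    except BaseException:
        # Covers KeyboardInterrupt too, so an interrupted run leaves nothing behind.
        os.unlink(path)
        raise
    return path
```

The caller owns the returned path and is expected to delete it once the submission has finished.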

marcmengel commented 8 months ago

We tried that, but while experienced UNIX/Linux users know about setting TMPDIR and/or XDG_CACHE_HOME, many users do not, and /tmp on many systems is frequently full or nearly full, which causes launch errors.

So most of the submit files are placed in $XDG_CACHE_HOME, which defaults to $HOME/.cache. We tried putting tarfiles in TMPDIR, defaulting to /tmp, but as mentioned this got lots of complaints about /tmp being full. Then we tried putting the tarfile directory (for tardir: particularly) next to the directory being tar-ed up, but many users were tar-ing up directories they didn't own, so they couldn't write there. We finally settled on the current directory, which is where the old jobsub scripts have put the tarfiles (for tardir: at least) for many years and is thus backwards compatible. Many users also seem to be more comfortable changing directories to submit than remembering to set TMPDIR.

Besides, the tarfile conversion is usually quite quick; I suspect yours isn't because you are using dropbox: on a /pnfs/.../persistent file.

kutschke commented 8 months ago

@marcmengel I want to understand your reply to Andrei. Since RCDS, I normally send tarfiles that are located on /exp/mu2e/data. I believe that's preferred over /pnfs, right? In the old days we had to use /pnfs, but those days are long past. Right?

kutschke commented 8 months ago

XDG_CACHE_HOME is not defined on our machines. Is it used for anything else? Depending on the answer, it might make sense for us to define it in our /cvmfs/mu2e.opensciencegrid.org/init.sh which is scheduled to replace /cvmfs/mu2e.opensciencegrid.org/setupmu2e-art.sh as soon as we are up and running on AL9. This will cover all use cases except people who want to use bare jobsub_submit outside of the Mu2e environment.

shreyb commented 8 months ago

In response to your last question, @kutschke: In implementing this, we followed the XDG Base Directory Specification, so that if $XDG_CACHE_HOME is not defined, then $HOME/.cache is used for the submit files.
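
The fallback described above can be sketched as follows. This is an illustration of the rule in the XDG Base Directory Specification, not jobsub_lite's actual code; the function name `xdg_cache_home` and its `environ` parameter are hypothetical. The specification treats an unset or empty $XDG_CACHE_HOME the same way, and says a relative path should be considered invalid and ignored.

```python
import os
from pathlib import Path


def xdg_cache_home(environ=None) -> Path:
    """Resolve the cache directory per the XDG Base Directory Specification.

    Hypothetical helper for illustration; 'environ' defaults to os.environ
    and exists only so the lookup can be exercised with a plain dict.
    """
    env = os.environ if environ is None else environ
    value = env.get("XDG_CACHE_HOME", "")
    # Unset, empty, or relative values all fall through to the default.
    if value and os.path.isabs(value):
        return Path(value)
    home = env.get("HOME") or str(Path.home())
    return Path(home) / ".cache"
```

With XDG_CACHE_HOME undefined, as on the machines mentioned above, this resolves to $HOME/.cache, so defining the variable in a site setup script is only needed to move the cache elsewhere.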