mlincett closed 1 year ago
No files need to be transferred from the client to the server, since the data transfer is all done through the MQ. Perhaps this is because of your condor submit script? SkyDriver doesn't transfer files back to the submitter's machine.
As for cleanup after a worker finishes, I don't believe we need to care about that. However, if you don't use the debug directory option, then fewer files will be written. SkyDriver won't use this option by default.
Is this the should_transfer_files=YES line?
Correct. If the option is active (which is necessary to transfer any input file), all "new" files will be transferred out, by default to the submit node.
As a workaround it is possible to set a custom URL as destination for output files. If the URL is invalid / does not allow a POST then the files are simply "lost" (but the jobs may be "held" as they don't finish gracefully).
The client.py passes an empty string as debug_directory to EWMS pilot. I suppose this results in EWMS pilot writing to the current working directory.
There is an option in EWMS pilot to choose whether or not to keep the debug directory, but it seems this is not used by client.py.
Ideally these files should be kept in "small" debug runs but should be thrown away for "grid-scale" runs.
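One way to get that behaviour is to choose the debug directory based on a flag. The sketch below is only an illustration: `debug_directory` and its `keep` and `base` parameters are hypothetical names, not part of EWMS pilot, SkyDriver, or client.py. With `keep=False` the files go to a temporary directory that is deleted before the job exits, so HTCondor finds nothing new to transfer.

```python
import contextlib
import tempfile
from pathlib import Path
from typing import Iterator


@contextlib.contextmanager
def debug_directory(keep: bool, base: Path = Path(".")) -> Iterator[Path]:
    """Yield a directory for debug files (hypothetical helper).

    keep=True:  a persistent "debug" directory under `base`, left in place.
    keep=False: a temporary directory, removed when the block exits.
    """
    if keep:
        path = base / "debug"
        path.mkdir(parents=True, exist_ok=True)
        yield path
    else:
        with tempfile.TemporaryDirectory() as tmp:
            yield Path(tmp)
```

A "small" debug run would use `with debug_directory(keep=True) as d:` and pass `d` on as the debug directory; a "grid-scale" run would use `keep=False`.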
I see. Thanks for looking into this. This sounds like a bug. I can investigate further.
When a client condor job finishes, the JSON and pickle files are transferred back as per standard HTCondor behaviour (transfer all new files).
This results in a large number of files that are unnecessarily transferred and may pollute the destination filesystem.
I am just noticing this now and I am not sure if this behaviour has changed. In particular, pickle files are dumped to local storage by ewms-pilot.
I think either the client or ewms-pilot should take care of cleaning up after itself.
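For illustration, a minimal sketch of what client-side cleanup could look like: take a snapshot of the working directory before the work starts, then delete any new JSON or pickle files before the job exits. The function names and the file suffixes (`.json`, `.pkl`, `.pickle`) are assumptions made for this example, not taken from the client or ewms-pilot code.

```python
from pathlib import Path
from typing import List, Set, Tuple


def snapshot(workdir: Path) -> Set[Path]:
    """Return the set of regular files currently at the top level of workdir."""
    return {p for p in workdir.iterdir() if p.is_file()}


def remove_new_files(
    workdir: Path,
    before: Set[Path],
    suffixes: Tuple[str, ...] = (".json", ".pkl", ".pickle"),  # assumed suffixes
) -> List[Path]:
    """Delete files created since `before` whose suffix is in `suffixes`.

    Files that already existed, and new files with other suffixes, are kept.
    Returns the list of removed paths.
    """
    removed = []
    for path in sorted(snapshot(workdir) - before):
        if path.suffix in suffixes:
            path.unlink()
            removed.append(path)
    return removed
```

The client would call `before = snapshot(Path.cwd())` at startup and `remove_new_files(Path.cwd(), before)` just before exiting, so that only files it wants transferred back remain.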