htcondor / htmap

High-Throughput Computing in Python, powered by HTCondor
https://htmap.readthedocs.io
Apache License 2.0

Allow arbitrary protocols in output transfer #214

Closed JoshKarpel closed 4 years ago

JoshKarpel commented 4 years ago

https://github.com/htcondor/htmap/issues/187 is about supporting arbitrary protocols for input file transfer. That case is much easier to work with because we already know how to transform transfer specifications into local file paths execute-side, and because the input files are implicitly known ahead-of-time.

This is not the case for output transfer. To prevent users from having to know the names of output files ahead-of-time, htmap.transfer_output_files allows users to specify which files to transfer back while the job is running. This involves a workaround: we list a specific HTMap-controlled directory in transfer_output_files and move the files into that directory, because we can't modify the condor_starter's copy of the job ad (which controls output transfer) from inside the job. This strategy won't work for protocol transfers, because they involve a two-step procedure: name the file in transfer_output_files, then name the remap in transfer_output_remaps. In other words, the job ad needs to have the full remap specification in it before the job starts. This almost surely means that HTMap users will also have to declare their remaps before the job starts.
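For readers unfamiliar with the workaround described above, here is a minimal sketch of the execute-side half. The function name and the staging directory name are hypothetical and are not HTMap's actual internals; the only assumption is that the staging directory is the one listed in transfer_output_files before the job starts.

```python
import shutil
import tempfile
from pathlib import Path


def stage_for_transfer(path: Path, staging_dir: Path) -> Path:
    """Move `path` into `staging_dir` so it is shipped back with that directory.

    `staging_dir` stands in for the HTMap-controlled directory named in
    transfer_output_files (hypothetical; not HTMap's real layout).
    """
    staging_dir.mkdir(parents=True, exist_ok=True)
    target = staging_dir / path.name
    shutil.move(str(path), str(target))
    return target


if __name__ == "__main__":
    # Simulate a job scratch directory with one output file.
    with tempfile.TemporaryDirectory() as scratch:
        scratch_dir = Path(scratch)
        output = scratch_dir / "result.txt"
        output.write_text("done\n")
        staged = stage_for_transfer(output, scratch_dir / "_staged_outputs")
        print(staged)
```

Because only the directory is named in the job ad, the job can decide at runtime which files end up inside it. A remap, by contrast, has to name its destination in the job ad up front.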

Current plan: scheme about what the interface for this looks like.

Tidbits:

  1. I think users will still need to call htmap.transfer_output_files on files they plan to remap.
  2. Users should not need to write transfer_output_files specifications directly (in fact, we can't let them, because we need to control its format internally).
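To make the second tidbit concrete, here is a rough sketch of how a submit-side helper might turn a user-supplied mapping into a transfer_output_remaps value. The helper name is hypothetical and this is not HTMap's interface. The `name=destination` rules joined by `;` follow the format that HTCondor itself prints for remap rules; rejecting `=` and `;` outright is a conservative simplification made for this sketch, not a statement of what HTCondor accepts.

```python
from typing import Mapping


def build_output_remaps(remaps: Mapping[str, str]) -> str:
    """Join {sandbox path: destination} pairs into one remap string.

    Hypothetical helper. Names containing '=' or ';' are rejected rather
    than escaped (a simplification assumed for this sketch).
    """
    rules = []
    for sandbox_path, destination in remaps.items():
        for text in (sandbox_path, destination):
            if "=" in text or ";" in text:
                raise ValueError(f"cannot safely encode {text!r} in a remap rule")
        rules.append(f"{sandbox_path}={destination}")
    return ";".join(rules)


if __name__ == "__main__":
    print(build_output_remaps({"tmp.d/foo.txt": "file:///tmp/foo.txt"}))
```

Keeping the string construction inside a helper like this is one way to let users name files and destinations without ever writing the raw specification themselves.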

toddlmiller commented 4 years ago

submit file

transfer_executable     = true
should_transfer_files   = true
universe                = vanilla
arguments               = 60

transfer_input_files    = input-file
log                     = sleep/log

transfer_output_files   = tmp.d
transfer_output_remaps  = "tmp.d/foot.txt=file:///tmp/foo.txt"

queue 1

StarterLog

04/24/20 14:53:46 JICShadow::transferOutput(void): Transferring...
04/24/20 14:53:46 Entering FileTransfer::InitDownloadFilenameRemaps
04/24/20 14:53:46 FileTransfer: output file remaps: tmp.d/foot.txt=file:///tmp/foo.txt;log=/.old/space/tlmiller/alt.condor/test/sleep/log
04/24/20 14:53:46 Begin transfer of sandbox to shadow.
04/24/20 14:53:46 entering FileTransfer::UploadFiles (final_transfer=1)
04/24/20 14:53:46 SharedPortClient: sent connection request to daemon at <184.60.25.78:39713> for shared port id shadow_23424_13b2_2
04/24/20 14:53:46 FileTransfer::UploadFiles: sent TransKey=1#5ea343c978d09e13fa5c813
04/24/20 14:53:46 entering FileTransfer::Upload
04/24/20 14:53:46 entering FileTransfer::DoUpload
04/24/20 14:53:46 DoUpload: Output URL plugins will be run
04/24/20 14:53:46 DoUpload: Tag to use for data reuse: tlmiller@azaphrael.org
04/24/20 14:53:46 REMAP: begin with rules: tmp.d/foot.txt=file:///tmp/foo.txt;log=/.old/space/tlmiller/alt.condor/test/sleep/log
04/24/20 14:53:46 REMAP: 0: tmp.d
04/24/20 14:53:46 REMAP: begin with rules: tmp.d/foot.txt=file:///tmp/foo.txt;log=/.old/space/tlmiller/alt.condor/test/sleep/log
04/24/20 14:53:46 REMAP: 0: tmp.d/foo.txt
04/24/20 14:53:46 REMAP: 1: tmp.d
04/24/20 14:53:46 DoUpload: sending file tmp.d
04/24/20 14:53:46 DoUpload: will transfer to filename tmp.d.
04/24/20 14:53:46 Will upload output URL using single-file plugin.
04/24/20 14:53:46 FILETRANSFER: outgoing file_command is 6 for tmp.d
04/24/20 14:53:46 Received GoAhead from peer to send /space/tlmiller/alt.condor/install/local/execute/dir_24403/tmp.d and all further files.
04/24/20 14:53:46 Sending GoAhead for 184.60.25.78 to receive /space/tlmiller/alt.condor/install/local/execute/dir_24403/tmp.d and all further files.
04/24/20 14:53:46 DoUpload: sending file tmp.d/foo.txt to tmp.d/
04/24/20 14:53:46 DoUpload: will transfer to filename tmp.d/foo.txt.
04/24/20 14:53:46 Will upload output URL using single-file plugin.
04/24/20 14:53:46 FILETRANSFER: outgoing file_command is 1 for tmp.d/foo.txt
04/24/20 14:53:46 ReliSock::put_file_with_permissions(): going to send permissions 100644
04/24/20 14:53:46 put_file: going to send from filename /space/tlmiller/alt.condor/install/local/execute/dir_24403/tmp.d/foo.txt
04/24/20 14:53:46 put_file: Found file size 0
04/24/20 14:53:46 put_file: sending 0 bytes
04/24/20 14:53:46 ReliSock: put_file: sent 0 bytes
04/24/20 14:53:46 DoUpload: exiting at 4681
04/24/20 14:53:46 End transfer of sandbox to shadow.

Result

tmp.d/foo.txt was created in the directory the job was submitted from.

toddlmiller commented 4 years ago

sleep/sleep.sh:

#!/bin/bash
mkdir tmp.d
touch tmp.d/foo.txt
exit 0

JoshKarpel commented 4 years ago

Follow-up: the test above was broken due to typos; it does actually work as hoped! The remap will occur as long as the file is transferred, even if the file isn't explicitly named in transfer_output_files. So we only need to handle the remaps.
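Since a mistyped remap key fails silently (the file just comes back under its original name), a small pre-submit check could catch it. This is only a sketch: the function name is hypothetical, and it assumes that a remap rule applies only when its key exactly matches the sandbox-relative path of a transferred file, and that the `foot.txt` key in the submit file above is one of the typos in question.

```python
from typing import Iterable, List, Mapping


def unmatched_remap_keys(
    remaps: Mapping[str, str], produced_files: Iterable[str]
) -> List[str]:
    """Return remap keys that match none of the files the job will produce.

    Assumes exact string matching on sandbox-relative paths (an assumption
    of this sketch, not a documented guarantee).
    """
    produced = set(produced_files)
    return [key for key in remaps if key not in produced]


if __name__ == "__main__":
    # The job writes tmp.d/foo.txt, but the rule is keyed on tmp.d/foot.txt.
    remaps = {"tmp.d/foot.txt": "file:///tmp/foo.txt"}
    print(unmatched_remap_keys(remaps, ["tmp.d", "tmp.d/foo.txt"]))
```

A check like this only helps when the output names are known ahead of time, which is the same restriction the remaps themselves impose.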

JoshKarpel commented 4 years ago

Resolved by #196