Open sam-baird opened 2 weeks ago
The overwrite check will probably be a little complicated, because it likely needs a different set of commands and checks depending on the `overwrite` variable. For example, if `overwrite` is true, we would want to run `gsutil ls` first to see whether the file exists, then error out if it does not, because the variable was likely set in error.
Look at the H5 repo as an example of looping over files and destinations.
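As a rough starting point, the two branches could be sketched as below. This is only a sketch under assumptions: the function names `safe_copy` and `copy_all`, the error messages, and the example paths are hypothetical and not taken from any existing repo. It also assumes that `gsutil ls` exits non-zero when the object does not exist, and that `gsutil cp -n` reports `Skipping existing item` on stderr as described in this issue.

```bash
#!/usr/bin/env bash
# Sketch only: function names, messages, and paths are hypothetical.

# Copy one file, enforcing consistency between the overwrite flag and
# whether the destination already exists.
# Usage: safe_copy <true|false> <source> <destination>
safe_copy() {
  local overwrite="$1" src="$2" dest="$3" cp_stderr

  if [[ "$overwrite" == "true" ]]; then
    # overwrite=true but nothing to overwrite: the flag was likely set in error.
    # Assumption: any `gsutil ls` failure is treated as "does not exist".
    if ! gsutil ls "$dest" > /dev/null 2>&1; then
      echo "ERROR: overwrite is true but $dest does not exist" >&2
      return 1
    fi
    gsutil cp "$src" "$dest"
  else
    # -n skips existing files. Capture stderr only; discard stdout.
    cp_stderr="$(gsutil cp -n "$src" "$dest" 2>&1 > /dev/null)" || {
      echo "$cp_stderr" >&2
      return 1
    }
    echo "$cp_stderr" >&2
    if [[ "$cp_stderr" == *"Skipping existing item"* ]]; then
      echo "ERROR: overwrite is false but $dest already exists" >&2
      return 1
    fi
  fi
}

# Run safe_copy for every source -> destination pair in an associative array.
# Usage: copy_all <true|false> <name of associative array>
copy_all() {
  local overwrite="$1"
  local -n _pairs_ref="$2"   # nameref to the caller's array (bash 4.3+)
  local src
  for src in "${!_pairs_ref[@]}"; do
    safe_copy "$overwrite" "$src" "${_pairs_ref[$src]}" || return 1
  done
}

# Example (hypothetical paths):
#   declare -A outputs=( ["sample.bam"]="gs://my-bucket/run1/sample.bam" )
#   copy_all false outputs
```

Keeping the per-file logic in one function means the loop stays short, and the two `overwrite` branches can be tested separately by stubbing `gsutil`.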
Feature Request
Files can sometimes accidentally be overwritten when transferring outputs. Sometimes this is intentional but usually not (for example when running tests on old data and forgetting to change the output path). There should be checks in place to make not overwriting the default behavior.
Solution
The `-n` flag in `gsutil cp` prevents overwriting existing files. We can determine whether to overwrite existing files using an `overwrite` boolean input variable (with a default setting of false). Check for consistency between the `overwrite` variable and whether the file exists. Error out if `overwrite` is false but the file already exists, or if `overwrite` is true but the file does not already exist.
The `-n` option writes `Skipping existing item...` to stderr if the file already exists, and we can use this output to do the above check. To avoid too much repetitive code, we can create a bash associative array with each file source mapped to its destination. Then have a loop iterate over the associative array, running `gsutil cp`, redirecting stderr to a variable, echoing the variable, and doing the above check.
Upstream effects
`overwrite = false` in the input JSON, to prevent an accidental overwrite if a previous analysis run's setting was changed to `overwrite = true`
Downstream effects
None.