Currently it's possible to trip up by downloading a GCS "foo/" entry as a file "foo", which then breaks attempts by Sisyphus or by the worker task itself to create files or subdirectories in that "directory." Then one can spend a lot of time debugging.
The Step and the Command together specify each input and output path:
Step foo, Command foo: upload/download a file
Step foo/, Command foo: illegal combination
Step foo, Command foo/: upload/download an entire directory tree
Step foo/, Command foo/: illegal combination
The Workflow builder always constructs the Step path without a trailing /, so those workflows are fine.
In Sisyphus
cloud/download-blob! is overly-forgiving about whether the remote or local path ends with /.
cloud/download-tree! assumes it's given a key that doesn't end with / which names a directory entry that does end with /. If the key does end with / it will find the same directory entry (and perhaps more) and then throw an IndexOutOfBoundsException (IIRC) trying to strip off the key + '/' "preamble".
I think the calling code distinguishes which one of these to call by whether the local path ends with /.
I don't know how to reproduce today's nasty case with the Sisyphus code:
sisyphus.core=> (cloud/download! storage "sisyphus" "data/jerry/20190817.114027/metadata" "/tmp/metadata")
ERROR: file unavailable to download sisyphus:data/jerry/20190817.114027/metadata
nil
sisyphus.core=> (cloud/download! storage "sisyphus" "data/jerry/20190817.114027/metadata/" "/tmp/metadata")
download data/jerry/20190817.114027/metadata/ data/jerry/20190817.114027/metadata/ to /tmp/metadata
true ; downloaded just the metadata/ directory, as a directory
sisyphus.core=> (cloud/download! storage "sisyphus" "data/jerry/20190817.114027/metadata" "/tmp/metadata/")
ERROR: file unavailable to download sisyphus:data/jerry/20190817.114027/metadata
nil
sisyphus.core=> (cloud/download! storage "sisyphus" "data/jerry/20190817.114027/metadata/" "/tmp/metadata/")
download data/jerry/20190817.114027/metadata/ data/jerry/20190817.114027/metadata/ to /tmp/metadata/
true ; downloaded just the metadata/ directory, as a directory
sisyphus.core=> (cloud/download-tree! storage "sisyphus" "data/jerry/20190817.114027/metadata" "/tmp/metadata")
nil ; downloaded the tree
sisyphus.core=> (cloud/download-tree! storage "sisyphus" "data/jerry/20190817.114027/metadata/" "/tmp/metadata")
Execution error (StringIndexOutOfBoundsException) at java.lang.String/substring (String.java:1931).
String index out of range: -1
sisyphus.core=> (cloud/download-tree! storage "sisyphus" "data/jerry/20190817.114027/metadata" "/tmp/metadata/")
nil ; downloaded the tree
sisyphus.core=> (cloud/download-tree! storage "sisyphus" "data/jerry/20190817.114027/metadata/" "/tmp/metadata/")
Execution error (StringIndexOutOfBoundsException) at java.lang.String/substring (String.java:1931).
String index out of range: -1
Proposals:
Consider simplifying the above rule so the local and remote path endings must match, that is, either both end with / or neither ends with /. That's easier to remember and makes Steps clear about which paths are directories, just like Commands are. Manage the transition.
Both Gaia and Sisyphus should reject illegal combinations as early as they can, giving a helpful error message so people can quickly fix the problem.
Currently it's possible to trip up by downloading a GCS "foo/" entry as a file "foo", which then breaks attempts by Sisyphus or by the worker task itself to create files or subdirectories in that "directory." Then one can spend a lot of time debugging.
The Step and the Command together specify each input and output path:
foo
, Commandfoo
: upload/download a filefoo/
, Commandfoo
: illegal combinationfoo
, Commandfoo/
: upload/download an entire directory treefoo/
, Commandfoo/
: illegal combinationThe Workflow builder always constructs the Step path without a trailing
/
, so those workflows are fine.In Sisyphus
cloud/download-blob!
is overly-forgiving about whether the remote or local path ends with/
.cloud/download-tree!
assumes it's given a key that doesn't end with/
which names a directory entry that does end with/
. If the key does end with/
it will find the same directory entry (and perhaps more) and then throw an IndexOutOfBoundsException (IIRC) trying to strip off the key + '/' "preamble"./
.I don't know how to reproduce today's nasty case with the Sisyphus code:
Proposals:
/
or neither ends with/
. That's easier to remember and makes Steps clear about which paths are directories, just like Commands are. Manage the transition.