Open joeflack4 opened 2 days ago
TL;DR: We can’t support spaces in paths while remaining compatible with a POSIX shell.
The interesting bit in the run.sh
script is here:
BIND_OPTIONS="-v $(echo $VOLUME_BIND | sed 's/,/ -v /')"
docker run $ODK_DOCKER_OPTIONS $BIND_OPTIONS [...]
The first line transforms the VOLUME_BIND
variable from a comma-separated list of pairs of directories to bind (HOST_DIRECTORY1:CONTAINER_DIRECTORY1,HOST_DIRECTORY2:CONTAINER_DIRECTORY2,...
) into a list of -v
options as expected by Docker (-v HOST_DIRECTORY1:CONTAINER_DIRECTORY1 -v HOST_DIRECTORY2:CONTAINER_DIRECTORY2 ...
). On the second line, the transformed BIND_OPTIONS
variable is passed to the docker run
command.
This supposes that the arguments to the -v
options (all HOST_DIRECTORY:CONTAINER_DIRECTORY
pairs) never contain any space characters.
If HOST_DIRECTORY
happens to contain a space character (say, HOST DIRECTORY
), then we end up with -v HOST DIRECTORY:CONTAINER_DIRECTORY
, where only HOST
is interpreted as the argument to -v
, while DIRECTORY:CONTAINER_DIRECTORY
is a completely independent positional argument (that docker run
will interpret as the name of the image to run).
Enclosing $BIND_OPTIONS
in quotes (to prevent the shell from splitting it on the space characters) is not an option here, because it will prevent the shell from splitting it on the space characters, and we need the shell to do that because the HOST_DIRECTORY:CONTAINER_DIRECTORY
pairs in BIND_OPTIONS
must be passed to docker run
as separate arguments.
One option is to use an array variable:
BIND_OPTIONS="-v $(echo $VOLUME_BIND | sed 's/,/,-v/')"
IFS=, BIND_OPTIONS_ARRAY=(BIND_OPTIONS)
unset IFS
docker run $ODK_DOCKER_OPTIONS "${BIND_OPTIONS_ARRAY[@]}" [...]
The first line transforms HOST_DIRECTORY1:CONTAINER_DIRECTORY1,HOST_DIRECTORY2:CONTAINER_DIRECTORY2,...
into -vHOST_DIRECTORY1:CONTAINER_DIRECTORY1,-vHOST_DIRECTORY2:CONTAINER_DIRECTORY2,...
(that is, it simply inserts a -v
before each pair).
Then we ask the shell to create an array (this is the non-POSIX-compliant part; the POSIX shell specification has no concept of array variables) by splitting the BIND_OPTIONS
variable on the ,
character, by setting the word splitting parameter (IFS
) to ,
.
Then we need to reset the IFS so that the shell will resume to splitting lines on space characters as normal.
And then, finally, we pass the array to the docker run
command. Here we can enclose it within double quotes, because when used around a reference to an array the shell will actually put quotes around each item in the array, rather than around the entire array. That is, we will end up with "-vHOST_DIRECTORY1:CONTAINER_DIRECTORY1" "-vHOST_DIRECTORY2:CONTAINER_DIRECTORY2"
, which is exactly what we need and will work as expected even if there are space characters somewhere in the directory paths.
It might take me some time to fully read and process this, but that looks like a really thorough investigation and explanation!
If this can't be fixed, then I would suggest an alternative solution: graceful error. If detecting a space in the path, we throw an error saying to correct that; that it is incompatible.
That would help a lot. It took me a while to figure out what the problem was. Can we do this?
If this can't be fixed
It can be fixed if we decide to drop the requirement for the run.sh
script to be POSIX-compliant – something that, for the record, I am in favour of.
The more I think about it, the less I am convinced that POSIX compatibility makes sense for the ODK, and especially for the Docker wrapper script. That thing is entirely dependent on, well, Docker, which is already a Linux-specific technology in the first place (it works on Windows and macOS only through virtualisation of a Linux machine).
If this can't be fixed, then I would suggest an alternative solution: graceful error. If detecting a space in the path, we throw an error saying to correct that; that it is incompatible.
Until/unless we make a decision to drop POSIX compatibility as a requirement, yes, we can do that and I agree it would be a good thing to do.
Overview
I had an ODK repository with spaces in the path (https://github.com/monarch-initiative/mondo-ingest/issues/545), and this resulted in an error when using
run.sh
.The error
docker: invalid reference format
Log
``` joeflack4@MacBook-Pro ontology % sh run.sh make build-mondo-ingest Running obolibrary/odkfull:dev with '-Xmx20G' as options for ROBOT and other Java-based pipeline steps. docker: invalid reference format. See 'docker run --help'. ```
Replicability
Try on an ODK rep with a path that includes a space in it, e.g. this example with dir
mondo builds/
, error:Remove the space, e.g. in this example changing the dir name to
mondo-builds/
, and observe the issue disappear.Additional information
Original context: