Closed pkp124 closed 2 years ago
This snippet (lines 28-36 of stdfsaccess.py) is where the file path is converted to a URL:
class StdFsAccess(object):
    def __init__(self, basedir):  # type: (Text) -> None
        self.basedir = basedir

    def _abs(self, p):  # type: (Text) -> Text
        return abspath(p, self.basedir)

    def glob(self, pattern):  # type: (Text) -> List[Text]
        return [file_uri(str(self._abs(l))) for l in glob.glob(self._abs(pattern))]
file_uri is imported from schema_salad.ref_resolver. In ref_resolver.py, file_uri calls urllib.request.pathname2url. The call to pathname2url is causing @ to be converted to %40 and spaces to be converted to %20.
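The quoting itself can be reproduced with the standard library alone. This is a minimal sketch, not the schema_salad code; the file names are made up for illustration, and relative names are used so the result does not depend on how a given Python version prefixes absolute paths.

```python
from urllib.request import pathname2url, url2pathname

# Hypothetical file names, chosen only to show the quoting.
names = ["a b", "user@host.txt"]

# pathname2url percent-encodes characters that are not URL-safe.
quoted = [pathname2url(n) for n in names]
print(quoted)

# url2pathname reverses the encoding.
restored = [url2pathname(q) for q in quoted]
print(restored)
```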
The issue is that the conversion happens after the CWL workflow is complete, while the files are being moved to the output path.
In the example above, /tmp/tmpc3_Od4/a b exists but CWLTool is looking for /tmp/tmpc3_Od4/a%20b.
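The mismatch between the two paths is easy to see in isolation. The sketch below creates a file named "a b" in a temporary directory (a stand-in for the /tmp/tmpc3_Od4 directory above) and then checks for the percent-encoded name as if it were a literal path.

```python
import os
import tempfile
from urllib.request import pathname2url

with tempfile.TemporaryDirectory() as tmpdir:
    # The file the tool actually wrote, like `touch "a b"`.
    real_path = os.path.join(tmpdir, "a b")
    open(real_path, "w").close()

    # The same name after percent-encoding, used as a literal path.
    encoded_path = os.path.join(tmpdir, pathname2url("a b"))

    real_exists = os.path.exists(real_path)
    encoded_exists = os.path.exists(encoded_path)

print(real_exists, encoded_exists)
```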
I can modify line 36 in stdfsaccess.py from
    return [file_uri(str(self._abs(l))) for l in glob.glob(self._abs(pattern))]
to
    return [urllib.request.url2pathname(file_uri(str(self._abs(l)))) for l in glob.glob(self._abs(pattern))]
and this workflow will complete successfully.
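The encode-then-decode round trip behind this workaround can be sketched without cwltool or schema_salad. Note the substitutions: pathlib's Path.as_uri stands in for file_uri, and decoding uses urllib.parse rather than url2pathname. Both are assumptions made for illustration, the helper names are hypothetical, and the sketch assumes a POSIX system.

```python
import glob
import os
import tempfile
from pathlib import Path
from urllib.parse import unquote, urlparse


def glob_as_uris(pattern):
    # Stand-in for the original behaviour: return percent-encoded file URIs.
    return [Path(p).as_uri() for p in glob.glob(pattern)]


def uri_to_path(uri):
    # Decode a file URI back to a local path (POSIX assumption).
    return unquote(urlparse(uri).path)


with tempfile.TemporaryDirectory() as tmpdir:
    Path(tmpdir, "a b").touch()
    uris = glob_as_uris(os.path.join(tmpdir, "*"))
    paths = [uri_to_path(u) for u in uris]
    all_exist = all(os.path.exists(p) for p in paths)

print(uris, paths, all_exist)
```

Whichever layer does the decoding, the point is that anything touching the filesystem needs the decoded path, not the URI.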
I'm not sure that's the best fix. It seems like a fix in schema_salad's ref_resolver.py would be more appropriate.
Is there a specific reason pathname2url is used in StdFsAccess.glob in stdfsaccess.py? @tetron @mr-c
This is not limited to spaces; characters like * also cause the problem.
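A quick way to check which other characters are affected is to run a few sample names through pathname2url and keep the ones that change. The names below are hypothetical and the list is not exhaustive.

```python
from urllib.request import pathname2url

# Hypothetical file names containing characters other than spaces.
samples = ["a*b", "a#b", "a?b", "a_b"]

# Keep only the names that pathname2url rewrites.
encoded = {s: pathname2url(s) for s in samples if pathname2url(s) != s}
print(encoded)
```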
As of cwltool version 3.1.20211107152837 this works (likely in earlier versions as well).
Spaces in output file names result in an error.
cwl file: touch.cwl
json input: touch.json
command line:
Debug log:
cwltool version: 1.0.20180622214234
The output file is created. When the output file is collected, the space character is replaced with "%20", which I assume is because of the call to file_uri() in the glob() method of the StdFsAccess class. I am not sure if this is expected behaviour or a bug.