Closed mr-c closed 7 years ago
Strictly speaking, this isn't a regression because it was never implemented for file inputs in the first place, only for document loading. But I agree it should fetch remote http resources. The main challenge is that there are some caching issues to work out if you don't want to have to pull large inputs on every run.
@tetron Thanks for the clarification. I thought it was implemented from the beginning
re: file caching
possible inspiration? https://dockstore.org/docs/advanced-features#input-file-cache
Thank you for the pointer @denis-yuen. Yes, we should reuse cwltool's cachedir feature here.
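To make the caching concern concrete, here is a minimal sketch of a download cache keyed by URL. It is not cwltool's cachedir implementation; the function name `cached_fetch`, the `cache_dir` layout, and the `opener` parameter are all assumptions made for illustration. It also does not check whether the remote file has changed since it was cached, which a real implementation would probably need to handle.

```python
import hashlib
import os
import shutil
import urllib.request


def cached_fetch(url, cache_dir, opener=urllib.request.urlopen):
    """Return a local path for url, downloading only on a cache miss.

    Hypothetical helper: the cache key is the SHA-256 of the URL string,
    so the same URL always maps to the same file inside cache_dir.
    """
    os.makedirs(cache_dir, exist_ok=True)
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key)
    if not os.path.exists(path):
        partial = path + ".part"
        # Stream to a partial file so large inputs never sit in memory.
        with opener(url) as response, open(partial, "wb") as out:
            shutil.copyfileobj(response, out)
        # Rename only after a complete download, so an interrupted run
        # does not leave a truncated file that looks like a cache hit.
        os.replace(partial, path)
    return path
```

Calling `cached_fetch` twice with the same URL and cache directory should hit the network only the first time.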
Is this related, or should I open a separate thread?
$ cwl-runner --validate https://github.com/CancerCollaboratory/dockstore-tool-bamstats/raw/develop/Dockstore.cwl
/usr/local/bin/cwl-runner 1.0.20170713151519
Tool definition failed initialization:
(u'https://github.com/CancerCollaboratory/dockstore-tool-bamstats/raw/develop/Dockstore.cwl', AttributeError("'HTTPResponse' object has no attribute 'chunked'",))
@standage not related (and I can't reproduce with either 1.0.20170713151519 or the latest dev 1.0.20170714133745). Can you open a separate issue with the output of pip freeze?
I can't reproduce either. :-)
I'll just chalk it up to transient environment config weirdness.
Hi @mr-c I am thinking of two ways to do this in Pathmapper https://github.com/common-workflow-language/cwltool/blob/master/cwltool/pathmapper.py#L219
1. In the MapperEnt object, we can download the input over http/s into a temp file and use its path as the resolved path, creating something like path(httplink)->(temppath, targetPath).
2. In the MapperEnt object, download the http file content and assign it to the resolved path, setting type to CreateFile and marking it as input on the fly.
Is there a better way/position to do this?
Personally, I like the first one as it allows us to later implement caching over the downloaded file.
I think option 1 is the right one. CreateFile is for file literals, and stores the data directly in memory, which won't work if the data is large. For comparison, the arvados-cwl-runner does something similar, although for uploading local files to the server rather than downloading locally, but the principle is the same:
https://github.com/curoverse/arvados/blob/master/sdk/cwl/arvados_cwl/pathmapper.py#L136
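As a rough illustration of option 1, the sketch below streams an http/s input into a temp file and records the temp path as the resolved path. The `MapperEnt` here is a simplified stand-in defined locally, not cwltool's actual class, and `is_http`, `stage_http_input`, `stagedir`, and the `opener` parameter are hypothetical names chosen for this example.

```python
import os
import shutil
import tempfile
import urllib.request
from collections import namedtuple
from urllib.parse import urlsplit

# Simplified stand-in for cwltool's MapperEnt (an assumption, not the real class).
MapperEnt = namedtuple("MapperEnt", ["resolved", "target", "type"])


def is_http(location):
    """True when the input location is an http or https URI."""
    return urlsplit(location).scheme in ("http", "https")


def stage_http_input(location, stagedir, opener=urllib.request.urlopen):
    """Download location to a temp file and map it as a regular File input."""
    basename = os.path.basename(urlsplit(location).path) or "download"
    with opener(location) as response:
        with tempfile.NamedTemporaryFile(delete=False, suffix="_" + basename) as tmp:
            # Copy in chunks so the content is never held fully in memory,
            # unlike a CreateFile literal.
            shutil.copyfileobj(response, tmp)
    # resolved is the temp file on disk; target is where the tool will see it.
    return MapperEnt(resolved=tmp.name,
                     target=os.path.join(stagedir, basename),
                     type="File")
```

Because the result is an ordinary file on disk, a cache could later be placed in front of the download step without changing how the entry is mapped.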
@mr-c Can we close this?
Yep! To get an issue to automatically close when a PR is merged, end the Pull Request description with Closes: #NNN
Expected Behavior
A URI should be accepted for inputs with type: File
http://www.commonwl.org/v1.0/CommandLineTool.html#File
Actual Behavior
Workflow Code
Full Traceback