common-workflow-language / cwltool

Common Workflow Language reference implementation
https://cwltool.readthedocs.io/
Apache License 2.0

cwltool enumerates every single file and folder in the output directory, unnecessarily #561

Closed: mr-c closed this issue 5 years ago

mr-c commented 6 years ago

https://github.com/common-workflow-language/cwltool/blob/734322eb5677717c5836eed361823f0a585e5de1/cwltool/process.py#L310

To reproduce: run cwltool --outdir ${HOME} with a simple CWL description on a real user's system.

Perhaps a single temporary folder should be created within outdir, with the intermediate output directories stored under it? That way this loop would iterate over just that single folder instead of walking the entire tree from outdir.
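A minimal sketch of that proposal, for illustration only. It is not cwltool's code and not necessarily what any fix ended up doing; the helper names `make_staging_dir` and `list_outputs` and the `cwl-staging-` prefix are hypothetical. It only shows that walking a dedicated subfolder never touches the rest of outdir:

```python
import os
import tempfile


def make_staging_dir(outdir):
    """Create one private subfolder of outdir for intermediate outputs.

    Hypothetical helper: the name and the prefix are assumptions.
    """
    os.makedirs(outdir, exist_ok=True)
    return tempfile.mkdtemp(prefix="cwl-staging-", dir=outdir)


def list_outputs(staging_dir):
    """Walk only the staging folder, never the rest of outdir."""
    found = []
    for root, _dirs, files in os.walk(staging_dir):
        for name in files:
            found.append(os.path.join(root, name))
    return sorted(found)
```

With --outdir ${HOME}, the cost of the walk would then depend on how much the workflow wrote into the staging folder, not on how many files the user already has in their home directory.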

esanzgar commented 6 years ago

Are you sure it is that line that causes the issue?

I believe it might be this other one: https://github.com/common-workflow-language/cwltool/blob/19f2cb6e21db8624155c7e253b89c57df536fcc1/cwltool/process.py#L334

This code needs urgent revision; it causes a huge performance penalty.

mr-c commented 6 years ago

@esanzgar you are correct. I didn't use a permalink, so the line numbers shifted. I've updated the issue, thanks!

psafont commented 6 years ago

Work is happening in https://github.com/common-workflow-language/cwltool/pull/850

psafont commented 5 years ago

Solved with https://github.com/common-workflow-language/cwltool/pull/926