Closed kapilkd13 closed 3 years ago
On running cwltool test on windows. I am getting [Errno 13] Permission denied: 'C:\\users\\kapilk~1\\appdata\\local\\temp\\tmpwhycvl
in here The reason is tempfile.NamedTemporaryFile()
opens a temporary file which we latter access without closing it. This file access using file.name
without closing file is allowed in unix but not on windows. Now if we try to close file before accessing it, file gets deleted as this is NamedTemporaryFile
normal behaviour. Currently I am passing delete=False
as a argument to this function, hence these temp files are not deleted when file is being closed. But then these temp files are not deleted and they remain in our tmp folder. Is there a better way to handle this.
Hi I made some changes to remove symlinking on a local branch https://github.com/common-workflow-language/cwltool/compare/master...kapilkd13:testing-windows?expand=1 for testing, but I am getting https://travis-ci.org/kapilkd13/cwltool/jobs/240462166 these errors. Can anyone help me with these failing test. I can't understand the reason behind these failures.
Oddly all failing test cases use tempfile.mkdtemp
to create temp directory whose docs says Creates a temporary directory in the most secure manner possible. There are no race conditions in the directory’s creation. The directory is readable, writable, and searchable only by the creating user ID.
I think it is possible that this temp dir is not searchable/writable but then how did it worked earlier with symlinks and now it doesn't.
Hey @kapilkd13 If I was debugging these errors I would use something like https://pypi.python.org/pypi/pdbpp/ to troubleshoot the tests
Thanks @mr-c Although some bugs are still remaining, I solved other bugs using pdb++.
This is related to the InlinejavascriptRequirement
issue on windows. As the windows consider stdin
,stdout
as file descriptors the current select
module do not allow us to do Non Blocking I/O. In order to overcome this issue I created a commit that used threads. here https://github.com/common-workflow-language/cwltool/pull/419/commits/69fc0fb9790d020bbd9f079b86a36781ae9f3e87. Onn initial testing on conformance test this solution looked fine. But thanks to Janneke, she pointed out a workflow on which this workflow failed. So Now the problem was:
My current patch worked and did gave output.
WorkflowException: :1:1: Expression evaluation error:
:1:1: Long-running script killed after 20 seconds: returncode was: -9
:1:1: script was:
:1:1: 01 "use strict";
:1:1: 02 var inputs = {};
:1:1: 03 var self = null;
:1:1: 04 var runtime = {
:1:1: 05 "outdirSize": 1024,
:1:1: 06 "ram": 1024,
:1:1: 07 "tmpdirSize": 1024,
:1:1: 08 "cores": 1,
:1:1: 09 "tmpdir": "/tmp/tmpgcXGd6",
:1:1: 10 "outdir": "/tmp/tmpGqSJ21"
:1:1: 11 };
:1:1: 12 (function(){return ((1+1));})()
:1:1: stdout was:
:1:1: 2
:1:1: stderr was:
Process finished with exit code 1
notice stdout value 2, it is correct so expression is evaluated correctly. The problem is that as I am using threads instead to do this I/O read. on joining them the thread gets blocked on os.read
statement as it expects input. I tried to find a way to read data without blocking/ way to check if data is in stdout
and only read if it exists. On windows I can't find a reliable way. isatty
didn't help.
There are now two possible solution that i know of.
Sol1: promote treads to daemon. Dont join them and they will die once program executes completely. But it has a catch, As @tetron told me we try to use same jsprocess
as to evaluate multiple js expression
as process is expensive. So Now if we use daemon threads when evaluating 2nd expression, thread from first expresion are still alive and would capture 2nd process output also. Its solution: use new js process
for each expression evaluation.
Sol 2: use process instead of threads. Although there will be minor speed loss we will have better handling over it. So when we think we have received output, we can terminate process. This way we can utilise single js process
.
Personally I am inclined to second method. what do you think @mr-c @anton-khodak ,Janneke
Update
Not a good news. I tried using multiprocess on windows but due to pickling I cant pass nodejs
object and when I pass nodejs.stdout
file descriptor to new process, it get closed. So I tried passing nodejs.stdout.fileno()
and tried opening it later in new process using os.fdopen
but this gave me Bad-descriptor Error
. Note that if I pass sys.stdout.fileno()
, it gets opened in new process. I can't seem to find solution to it. I made a SO question here. https://stackoverflow.com/questions/44866552/python-2-7-windowsunable-to-read-one-processnot-host-stdout-from-host-process complete description is here. Any suggestion is appreciated.
For the short term to get unblocked (hah) I suggest using Popen.communicate() to evaluate a single expression and running nodejs to completion. This will make expression evaluations expensive, however the common case of simple parameter references are already evaluated directly by cwltool.
I checked the Python standard library implementation of subprocess.communicate(). At least on Windows it uses threads to read from stdout and stderr, presumably because non-blocking reads/select are not available.
I found one stack overflow discussion about nonblocking reads. It's complicated:
https://stackoverflow.com/questions/34504970/non-blocking-read-on-os-pipe-on-windows
A longer term solution would be to provide an option that integrates a Javascript engine directly:
Thanks @tetron . For now I would go with the option to use new nodejs instance for each expression. Later I would work on the option to integrate Js engine directly. Thanks again! 😃
I have fixed InlineJs
problem with one process per expression for now. I want to discuss on a new issue that I am facing. @jvdzwaan gave me a CommandLineTool
#!/usr/bin/env cwl-runner
cwlVersion: cwl:v1.0
class: CommandLineTool
baseCommand: frog
arguments:
- prefix: --outputdir=
separate: false
valueFrom: $(runtime.outdir)
hints:
- class: DockerRequirement
dockerPull: proycon/lamachine
inputs:
- id: dir_in
type: Directory
inputBinding:
position: 1
prefix: --testdir=
separate: false
- id: skip
type: string?
inputBinding:
position: 2
prefix: --skip=
separate: false
outputs:
- id: frogout
type:
type: array
items: File
outputBinding:
glob: "*.out"
Notice outputdir
is taking value from runtime.outdir
. And later this value is used in linux docker during execution. Now Since we are working on windows, path resolution is as per windows and so runtime.outdir
contains path like C:\foo\bar
but in docker we need something like /c/foo/bar
. When passing outdir
or other usual file path to docker I convert them appropriately as I know they are path but incase of outputdir
its just a variable and I don't know if its a path or not and should I convert it or not.
One possibility is that I convert all paths(like runtime.outdir, input.path etc) to docker compatible, but since these path are used before and after execution a lot in staging, file copying, path mapping etc
and for this, these paths must be converted back to windows compatible. I don't know if this is the best solution or not.
Another idea is to make a list like pathlist
and put runtime.outdir and all other variables that contain path to it. On variable resolution if a variable is pointing to any of these the value would be converted appropriately. I haven't tried this. I want to know if this is possible.
Last solution that comes to my mind is somehow user specify that variable he wants to be resolved is a path. Then on windows we can make changes appropriately. Maybe instead of
- prefix: --outputdir=
separate: false
valueFrom: $(runtime.outdir)
something like
- prefix: --outputdir=
separate: false
valueFrom: $(runtime.outdir)
type:path
But it doesn't sound a good idea to me also 😺
what do you think @mr-c @tetron @jvdzwaan @anton-khodak what direction should i take.
The values of $(runtime.outdir)
and $(runtime.tmpdir)
and $(input.somevar.path)
must be in the environment of the tool, so when running inside a Docker container they should be /c/foo/bar
in your example.
The job of the PathMapper class is to translate the host paths (which would be Windows path, or better file:/// URIs) to the correct paths inside the container.
@tetron Alright. I would try modifying these values itself instead of the values that are being passed. I thought maybe we are using runtime.outdir
after completion to move output file to another folder. I will try and report the output. Thanks!
There's an internal outdir
variable in cwltool
, but the value of$(runtime.outdir)
itself is specifically the value from the perspective of the tool (inside the container).
Since we have achieved windows compatibility. I think we can close this.
I am still getting the same behavior with the latest cwltool and docker for windows on windows 10 pro 1809.
Even a simple sleep test (where nothing is actually written to the filesystem) cause this result:
sleep.cwl:
cwlVersion: v1.0
class: CommandLineTool
hints:
DockerRequirement:
dockerPull: ubuntu:latest
ResourceRequirement:
coresMin: 1
ramMin: 5000
tmpdirMin: 1000
baseCommand: ["sleep"]
inputs:
sleep_time:
type: int
inputBinding:
position: 1
outputs: []
Workflow:
cwlVersion: v1.0
class: Workflow
inputs:
sleep_time:
type: int
steps:
sleep:
run: "../tools/sleep.cwl"
in:
sleep_time:
source: sleep_time
out: []
outputs: []
I am getting following error:
PermissionError: [Errno 13] Permission denied: 'D:\\test\tmpledy674c'
Anyone know how to fix this?
The above error does not occur for older releases of cwltool: 1.0.20180809224403
@KerstenBreuer Thank you for your report.
Can you share the output of cwltool -debug [rest of the options]
Sure thing, the complete output is:
C:\Users\kerst\AppData\Local\Programs\Python\Python37-32\Scripts\cwltool 1.0.20181217162649
Resolved '../CWL/workflows/wf_sleep.cwl' to 'file:///D:/cwl_test_lsf/CWL/workflows/wf_sleep.cwl'
[workflow ] initialized from file:///D:/cwl_test_lsf/CWL/workflows/wf_sleep.cwl
[workflow ] start
[workflow ] {
"sleep_time": 10
}
[workflow ] starting step sleep
[job step sleep] job input {
"file:///D:/cwl_test_lsf/CWL/workflows/wf_sleep.cwl#sleep/sleep_time": 10
}
[job step sleep] evaluated job input to {
"file:///D:/cwl_test_lsf/CWL/workflows/wf_sleep.cwl#sleep/sleep_time": 10
}
[step sleep] start
[job sleep] initializing from file:///D:/cwl_test_lsf/CWL/tools/sleep.cwl as part of step sleep
[job sleep] {
"sleep_time": 10
}
[job sleep] path mappings is {}
[job sleep] command line bindings is [
{
"position": [
-1000000,
0
],
"datum": "sleep"
},
{
"position": [
1,
"sleep_time"
],
"datum": 10
}
]
[job sleep] initial work dir {}
[job sleep] Skipping Docker software container '--memory' limit despite presence of ResourceRequirement with ramMin and/or ramMax setting. Consider running with --strict-memory-limit for increased portability assurance.
[job sleep] D:\cwl_test_lsf\test_outov4h5ajz$ docker \
run \
-i \
--volume=/D/cwl_test_lsf/test_outov4h5ajz:/vvYZzU:rw \
--volume=/D/cwl_test_lsf/test_outakop6qwd:/tmp:rw \
--workdir=/vvYZzU \
--read-only=true \
--rm \
--env=TMPDIR=/tmp \
--env=HOME=/vvYZzU \
--cidfile=D:\cwl_test_lsf\test_out\tmpip6bvnmm\20190218184117-175991.cid \
ubuntu:latest \
sleep \
10
Exception while running job
Traceback (most recent call last):
File "c:\users\kerst\appdata\local\programs\python\python37-32\lib\site-packages\cwltool\job.py", line 308, in _execute
monitor_function=monitor_function
File "c:\users\kerst\appdata\local\programs\python\python37-32\lib\site-packages\cwltool\job.py", line 769, in _job_popen
monitor_function(sproc)
File "c:\users\kerst\appdata\local\programs\python\python37-32\lib\site-packages\cwltool\job.py", line 692, in docker_monitor
with open(stats_file.name, mode="w") as stats_file_handle:
PermissionError: [Errno 13] Permission denied: 'D:\\cwl_test_lsf\\test_out\\tmp9zb55c9f'
[job sleep] completed permanentFail
[job sleep] {}
[step sleep] produced output {}
[step sleep] completed permanentFail
[workflow ] completed permanentFail
[workflow ] {}
[job sleep] Removing input staging directory D:\cwl_test_lsf\test_outxafmhgjn
[job sleep] Removing temporary directory D:\cwl_test_lsf\test_outakop6qwd
{}
Final process status is permanentFail
Thanks for taking care of this.
with open(stats_file.name, mode="w") as stats_file_handle:
PermissionError: [Errno 13] Permission denied: 'D:\\cwl_test_lsf\\test_out\\tmp9zb55c9f'
Does this file exist and what permissions does it have?
No the file was not generated.
Same Problem here, any solutions so far?
I narrowed it down. The bug appears first in release 1.0.20190228155703. The last working release is 1.0.20181201184214.
We have dropped MS Windows compatibility, see https://github.com/common-workflow-language/cwltool#ms-windows-users for how to run using WSL 2.
This issue is created to discuss problems that may occur in implementing windows compatibility for cwltool