inab / WfExS-backend

Workflow Execution Service Backend
Apache License 2.0

Error while running WfExS using a local workflow file/directory #46

Open dcl10 opened 1 year ago

dcl10 commented 1 year ago

Description

I am running WfExS with the config files shown below. When I run WfExS-backend.py -L local-config.yml stage -W test-stage.yml I get the following error: NotADirectoryError: [Errno 20] Not a directory: '/root/wfexs-backend-test_WorkDir/47761fdd-f06f-4260-a1f3-7351265805b3/workflow'.

Looking at the path in the error message, it seems that workflow is a copy of the file given in workflow_id in the stage file, whereas WfExS expects a directory there. I also tried putting a path to a directory in the workflow_id field, but that failed with a message saying it couldn't work out which runner to use.

Traceback

Traceback (most recent call last):
  File "/root/WfExS-backend/WfExS-backend.py", line 21, in <module>
    main()
  File "/root/WfExS-backend/wfexs_backend/__main__.py", line 1122, in main
    stagedSetup = wfInstance.stageWorkDir()
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1985, in stageWorkDir
    self.materializeWorkflowAndContainers(offline=offline, ignoreCache=ignoreCache)
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1233, in materializeWorkflowAndContainers
    self.setupEngine(offline=offline, ignoreCache=ignoreCache)
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1191, in setupEngine
    self.fetchWorkflow(
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1152, in fetchWorkflow
    engineVer, candidateLocalWorkflow = engine.identifyWorkflow(
  File "/root/WfExS-backend/wfexs_backend/cwl_engine.py", line 316, in identifyWorkflow
    newLocalWf = self._enrichWorkflowDeps(newLocalWf, engineVer)
  File "/root/WfExS-backend/wfexs_backend/cwl_engine.py", line 542, in _enrichWorkflowDeps
    with subprocess.Popen(
  File "/usr/lib/python3.10/subprocess.py", line 969, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.10/subprocess.py", line 1845, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
NotADirectoryError: [Errno 20] Not a directory: '/root/wfexs-backend-test_WorkDir/47761fdd-f06f-4260-a1f3-7351265805b3/workflow'
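
One plausible reading of this traceback (an assumption, since the WfExS source is not shown here) is that the staged workflow path is handed to subprocess.Popen as the working directory while it is actually a regular file. The minimal sketch below reproduces that class of error on a POSIX system using only the standard library; the helper name popen_with_file_as_cwd is made up for illustration.

```python
import errno
import subprocess
import sys
import tempfile


def popen_with_file_as_cwd():
    """Try to start a child process whose working directory is a regular file."""
    with tempfile.NamedTemporaryFile() as not_a_dir:
        try:
            # cwd must be a directory; here it is deliberately a plain file.
            with subprocess.Popen([sys.executable, "-c", "pass"], cwd=not_a_dir.name):
                pass
        except NotADirectoryError as exc:
            return exc
    return None


if __name__ == "__main__":
    err = popen_with_file_as_cwd()
    # On Linux this is expected to be ENOTDIR, the same errno 20 as in the report.
    print(type(err).__name__, err is not None and err.errno == errno.ENOTDIR)
```

If this is indeed the mechanism, the path reported in the error message is the working directory that failed, not the executable.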

Settings

Stage file

# test-stage.yml
workflow_id: file:///root/hutch/workflows/sec-hutchx86.cwl
workflow_config:
  container: 'docker'
  secure: false
nickname: 'vas-workflow'
cacheDir: /tmp/wfexszn6siq2jtmpcache
crypt4gh:
  key: cosifer_test1_cwl.wfex.stage.key
  passphrase: mpel nite ified g
  pub: cosifer_test1_cwl.wfex.stage.pub
outputs:
  output_file:
    c-l-a-s-s: File
    glob: "output.json"
params:
  body:
    c-l-a-s-s: File
    url:
      - https://raw.githubusercontent.com/HDRUK/hutch/main/workflows/inputs/rquest-query.json
  is_availability: true
  db_host: "localhost"
  db_name: "hutch"
  db_user: "postgres"
  db_password: "example"

Local config

# local-config.yml
cacheDir: ./wfexs-backend-test
crypt4gh:
  key: local_config.yaml.key
  passphrase: strive backyard dividing gumball
  pub: local_config.yaml.pub
tools:
  containerType: docker
  dockerCommand: docker
  encrypted_fs:
    command: encfs
    type: encfs
  engineMode: local
  gitCommand: git
  javaCommand: java
  singularityCommand: singularity
  staticBashCommand: bash-linux-x86_64
workDir: ./wfexs-backend-test_WorkDir
dcl10 commented 1 year ago

Hi @jmfernandez. I hope you had a nice weekend. As per your email, I've tried WfExS with the files above. I found 2 things:

  1. You need to add groovy_parser, lark and pygments to your requirements.txt file.
  2. I got the following traceback when attempting to stage the workflow:
    
    2023-07-10 09:42:38,814 - [ERROR] Failed to parse initial file sec-hutchx86.cwl with groovy parser
    Traceback (most recent call last):
    File "/root/WfExS-backend/wfexs_backend/nextflow_engine.py", line 383, in identifyWorkflow
    ) = analyze_nf_content(firstPathContent, only_names=only_names)
    File "/root/WfExS-backend/wfexs_backend/utils/groovy_parsing.py", line 643, in analyze_nf_content
    t_tree = parse_and_digest_groovy_content(content)
    File "/root/WfExS-backend/wfexs_backend/utils/groovy_parsing.py", line 633, in parse_and_digest_groovy_content
    tree = parse_groovy_content(content)
    File "/root/WfExS-backend/.pyWEenv/lib/python3.10/site-packages/groovy_parser/parser.py", line 158, in parse_groovy_content
    raise pe
    File "/root/WfExS-backend/.pyWEenv/lib/python3.10/site-packages/groovy_parser/parser.py", line 153, in parse_groovy_content
    tree = parser.parse(
    File "/root/WfExS-backend/.pyWEenv/lib/python3.10/site-packages/lark/lark.py", line 645, in parse
    return self.parser.parse(text, start=start, on_error=on_error)
    File "/root/WfExS-backend/.pyWEenv/lib/python3.10/site-packages/lark/parser_frontends.py", line 96, in parse
    return self.parser.parse(stream, chosen_start, **kw)
    File "/root/WfExS-backend/.pyWEenv/lib/python3.10/site-packages/lark/parsers/earley.py", line 266, in parse
    to_scan = self._parse(lexer, columns, to_scan, start_symbol)
    File "/root/WfExS-backend/.pyWEenv/lib/python3.10/site-packages/lark/parsers/earley.py", line 237, in _parse
    to_scan = scan(i, token, to_scan)
    File "/root/WfExS-backend/.pyWEenv/lib/python3.10/site-packages/lark/parsers/earley.py", line 214, in scan
    raise UnexpectedToken(token, expect, considered_rules=set(to_scan), state=frozenset(i.s for i in to_scan))
    lark.exceptions.UnexpectedToken: Unexpected token Token('FLOATING_POINT_LITERAL', (Token.Literal.Number.Float, '0', '0')) at line 1, column 15.
    Expected one of: 
        * WHILE
        * THREADSAFE
        * NL
        * SUPER
        * BOOLEAN
        * FLOAT
        * FOR
        * CLASS
        * INT
        * TRAIT
        * SWITCH
        * ELSE
        * GOTO
        * IMPORT
        * RETURN
        * TRANSIENT
        * ABSTRACT
        * PRIVATE
        * NATIVE
        * AT
        * INTERFACE
        * EXTENDS
        * IMPLEMENTS
        * DO
        * DEFAULT
        * THIS
        * CAPITALIZED_IDENTIFIER
        * FINALLY
        * THROWS
        * IN
        * AS
        * LPAREN
        * FINAL
        * CATCH
        * SYNCHRONIZED
        * LONG
        * VOLATILE
        * BOOLEAN_LITERAL
        * IF
        * CHAR
        * CONST
        * BREAK
        * CONTINUE
        * PUBLIC
        * INSTANCEOF
        * ENUM
        * VOID
        * STATIC
        * NULL_LITERAL
        * PROTECTED
        * CASE
        * BYTE
        * DOUBLE
        * DEF
        * STRICTFP
        * VAR
        * PACKAGE
        * STRING_LITERAL
        * ASSERT
        * LT
        * THROW
        * SHORT
        * GSTRING_BEGIN
        * TRY
        * IDENTIFIER
        * NEW

Traceback (most recent call last):
  File "/root/WfExS-backend/WfExS-backend.py", line 21, in <module>
    main()
  File "/root/WfExS-backend/wfexs_backend/__main__.py", line 1122, in main
    stagedSetup = wfInstance.stageWorkDir()
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1995, in stageWorkDir
    self.materializeWorkflowAndContainers(offline=offline, ignoreCache=ignoreCache)
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1243, in materializeWorkflowAndContainers
    self.setupEngine(offline=offline, ignoreCache=ignoreCache)
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1201, in setupEngine
    self.fetchWorkflow(
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1176, in fetchWorkflow
    raise WFException(
wfexs_backend.workflow.WFException: No engine recognized a workflow at file:///root/hutch/workflows/sec-hutchx86.cwl
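
For the first point in this comment, a quick way to check whether the three packages are importable in the active environment is sketched below. It assumes the import names match those visible in the traceback paths (groovy_parser, lark) plus pygments; the helper name missing_modules is made up for illustration.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names taken from the comment above; an empty list means all are present.
    print(missing_modules(["groovy_parser", "lark", "pygments"]))
```

Running this inside the virtual environment used by WfExS (here .pyWEenv) shows whether the installed requirements are in sync with the checked-out code.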

dcl10 commented 1 year ago

In addition to the above comment, I still cannot run WfExS with an RO-Crate directly from disk. I downloaded this workflow, which is the same one as the file example I'm trying to run. I get a similar error to the one above, where it says no engine was recognised.

2023-07-10 09:55:19,258 - [WARNING] Unable to process CWL entrypoint /root/wfexs-backend-test_WorkDir/72a0e402-9336-478c-8693-f1a71ffa6f5b/workflow [Errno 21] Is a directory: '/root/wfexs-backend-test_WorkDir/72a0e402-9336-478c-8693-f1a71ffa6f5b/workflow'
Traceback (most recent call last):
  File "/root/WfExS-backend/WfExS-backend.py", line 21, in <module>
    main()
  File "/root/WfExS-backend/wfexs_backend/__main__.py", line 1122, in main
    stagedSetup = wfInstance.stageWorkDir()
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1995, in stageWorkDir
    self.materializeWorkflowAndContainers(offline=offline, ignoreCache=ignoreCache)
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1243, in materializeWorkflowAndContainers
    self.setupEngine(offline=offline, ignoreCache=ignoreCache)
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1201, in setupEngine
    self.fetchWorkflow(
  File "/root/WfExS-backend/wfexs_backend/workflow.py", line 1176, in fetchWorkflow
    raise WFException(
wfexs_backend.workflow.WFException: No engine recognized a workflow at file:///root/test_wfexs_dir
jmfernandez commented 1 year ago

> Hi @jmfernandez. I hope you had a nice weekend. As per your email, I've tried WfExS with the files above. I found 2 things:
>
> 1. You need to add `groovy_parser`, `lark` and `pygments` to your `requirements.txt` file.

Hi @dcl10, did you pull all the changes this morning, including the updated requirements.txt? That file was updated in the commit from 15 hours ago: https://github.com/inab/WfExS-backend/blob/fde81586ec6eeff817eb50b7d980ab29c6c36659/requirements.txt#L24

jmfernandez commented 1 year ago

Thanks for the feedback, I'm trying this one later.

> In addition to the above comment, I still cannot run WfExS with an RO-Crate directly from disk. I downloaded this workflow, which is the same one as the file example I'm trying to run. I get a similar error to the one above, where it says no engine was recognised.

dcl10 commented 1 year ago

> > Hi @jmfernandez. I hope you had a nice weekend. As per your email, I've tried WfExS with the files above. I found 2 things:
> >
> > 1. You need to add `groovy_parser`, `lark` and `pygments` to your `requirements.txt` file.
>
> Hi @dcl10, did you pull all the changes this morning, including the updated requirements.txt? That file was updated in the commit from 15 hours ago:
>
> https://github.com/inab/WfExS-backend/blob/fde81586ec6eeff817eb50b7d980ab29c6c36659/requirements.txt#L24

Hi @jmfernandez, I pulled this morning but I pulled the wrong tag haha. Thanks for pointing this out :)

jmfernandez commented 1 year ago

Hi again!

I identified the source of the groovy-parser issue you found once @paulaidt was able to reproduce it in a fresh installation. As I described in commit f269559548189c1d24c24b2d5f26f5e0fa97856f, the package was not properly built: the list of dependencies was missing when the wheel (whl) was built.

jmfernandez commented 1 year ago

Hi again (again). I have been trying to reproduce what is happening, and one hidden issue was that WfExS was not reporting workflow mismatches, i.e. cases where an engine raised an exception because it could not detect a valid workflow. After the changes in commits 519161394e1429aca8162865d84616efdabeceff and e166d1ebf6b334e4b825b772c490628e9397a6e8, it now reports the following (I'm including a small relevant fragment here):

2023-07-12 22:11:12,400 - [CWLWorkflowEngine _enrichWorkflowDeps 558][DEBUG] /home/jmfernandez/projects/WfExS-backend/workflow_examples/HUTCH/46/wfexs-backend-test/CWLWorkflowEngine/3.1.20230601100705 --print-deps => 
2023-07-12 22:11:12,401 - [wfexs_backend.workflow::WF fetchWorkflow 1174][ERROR] Engine CWL did not recognize the workflow as a valid one. Reason:
Traceback (most recent call last):
  File "/home/jmfernandez/projects/WfExS-backend/wfexs_backend/workflow.py", line 1162, in fetchWorkflow
    engineVer, candidateLocalWorkflow = engine.identifyWorkflow(
  File "/home/jmfernandez/projects/WfExS-backend/wfexs_backend/cwl_engine.py", line 319, in identifyWorkflow
    newLocalWf = self._enrichWorkflowDeps(newLocalWf, engineVer)
  File "/home/jmfernandez/projects/WfExS-backend/wfexs_backend/cwl_engine.py", line 579, in _enrichWorkflowDeps
    raise WorkflowEngineException(errstr)
wfexs_backend.engine.WorkflowEngineException: Could not get workflow dependencies running cwltool --print-deps from /home/jmfernandez/projects/WfExS-backend/workflow_examples/HUTCH/46/wfexs-backend-test_WorkDir/fc4796a7-f49e-439b-955a-abf337505cf6/workflow sec-hutchx86.cwl with /home/jmfernandez/projects/WfExS-backend/workflow_examples/HUTCH/46/wfexs-backend-test/CWLWorkflowEngine/3.1.20230601100705. Retval 1
======
STDOUT
======

======
STDERR
======
INFO /home/jmfernandez/projects/WfExS-backend/workflow_examples/HUTCH/46/wfexs-backend-test/CWLWorkflowEngine/3.1.20230601100705/bin/cwltool 3.1.20230601100705
INFO Resolved 'sec-hutchx86.cwl' to 'file:///home/jmfernandez/projects/WfExS-backend/workflow_examples/HUTCH/46/wfexs-backend-test_WorkDir/fc4796a7-f49e-439b-955a-abf337505cf6/workflow/sec-hutchx86.cwl'
ERROR Tool definition failed validation:
[Errno 2] No such file or directory: '/home/jmfernandez/projects/WfExS-backend/workflow_examples/HUTCH/46/wfexs-backend-test_WorkDir/fc4796a7-f49e-439b-955a-abf337505cf6/workflow/rquest-oneshotx86.cwl'
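
The STDERR above shows cwltool failing because rquest-oneshotx86.cwl is not present next to the staged sec-hutchx86.cwl. A rough pre-flight check for this situation is sketched below. It assumes the workflow references sibling files through plain `run: something.cwl` lines (an assumption, since the workflow file itself is not shown here), and it is only a line-based heuristic, not a CWL parser. The function name missing_cwl_references is made up for illustration.

```python
import re
from pathlib import Path

# Heuristic: match lines such as `run: other-tool.cwl` (optionally quoted).
RUN_REF = re.compile(r"^\s*run:\s*['\"]?([^'\"#\s]+\.cwl)['\"]?\s*$")


def missing_cwl_references(workflow_file):
    """List relative .cwl files referenced via `run:` that are absent on disk."""
    workflow_path = Path(workflow_file)
    missing = []
    for line in workflow_path.read_text(encoding="utf-8").splitlines():
        match = RUN_REF.match(line)
        if not match:
            continue
        ref = match.group(1)
        if "://" in ref:
            continue  # remote references are out of scope for this sketch
        # References are resolved relative to the workflow file's directory.
        if not (workflow_path.parent / ref).is_file():
            missing.append(ref)
    return missing
```

Run against the original location of the workflow, an empty list suggests the referenced files are there; run against a staged single-file copy, it would list the files that were left behind.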

Indeed, the local workflow depends on another CWL file. But you are telling WfExS that the workflow is only a single file, instead of giving it a "directory" and a starting point (i.e. a context). If you try something similar to the following:

# test-stage.yml
workflow_id: file:///root/hutch/workflows#subdirectory=sec-hutchx86.cwl
workflow_config:
  container: 'docker'
  secure: false
nickname: 'vas-workflow'
cacheDir: /tmp/wfexszn6siq2jtmpcache
crypt4gh:
  key: cosifer_test1_cwl.wfex.stage.key
  passphrase: mpel nite ified g
  pub: cosifer_test1_cwl.wfex.stage.pub
outputs:
  output_file:
    c-l-a-s-s: File
    glob: "output.json"
params:
  body:
    c-l-a-s-s: File
    url:
      - https://raw.githubusercontent.com/HDRUK/hutch/main/workflows/inputs/rquest-query.json
  is_availability: true
  db_host: "localhost"
  db_name: "hutch"
  db_user: "postgres"
  db_password: "example"

it should work (admittedly, the keyword subdirectory is a bit misleading, but it was named with Nextflow scenarios in mind).
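
To make the shape of that workflow_id concrete, here is a minimal sketch of how such a URI splits into a directory and an entry point. It uses only the Python standard library and is not WfExS's own parsing code; the function name split_workflow_id is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def split_workflow_id(workflow_id):
    """Split a workflow_id into (scheme, path, entry point or None)."""
    parts = urlsplit(workflow_id)
    # The fragment is read as key=value pairs, e.g. "subdirectory=sec-hutchx86.cwl".
    fragment = parse_qs(parts.fragment)
    entry = fragment.get("subdirectory", [None])[0]
    return parts.scheme, parts.path, entry


if __name__ == "__main__":
    print(split_workflow_id("file:///root/hutch/workflows#subdirectory=sec-hutchx86.cwl"))
    print(split_workflow_id("file:///root/hutch/workflows/sec-hutchx86.cwl"))
```

With the fragment form, the path names the directory that holds all the CWL files and the fragment names the starting file inside it; with the original form, the path is just the single file and no context directory is available.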