aiidateam / aiida-core

The official repository for the AiiDA code
https://aiida-core.readthedocs.io

Improvements to .molecule tests (Jenkins) #4718

Open chrisjsewell opened 3 years ago

chrisjsewell commented 3 years ago

A few things noted to improve at a later time, after merging #4565:

  1. remove create_docker.yml (https://github.com/aiidateam/aiida-core/pull/4565#discussion_r572159376_)

Well, the key thing is that we need to be able to set the Docker context (see context: "../.." above in .molecule/default/config_local.yml), i.e. setting docker.image.path. I've re-opened it in https://github.com/ansible-community/molecule-docker/pull/37, without the bug fixes (which may not have been correct), to hopefully make it easier to accept. I imagine it will take a while to make its way into a release though, so we might have to open an issue to remove it at a later date?

  2. dynamic Python version selection (once aiida-prerequisites has an environment variable with the Python version, https://github.com/aiidateam/aiida-core/pull/4565#discussion_r572052668)

  3. Fix pip cache (folder permissions, https://github.com/aiidateam/aiida-core/pull/4565#issuecomment-775146939)

chrisjsewell commented 3 years ago

Also for Jenkins it might be good to run reentry scan once in the preparation step (see https://github.com/aiidateam/aiida-core/pull/4719#issuecomment-775861310) (fixed with https://github.com/aiidateam/aiida-core/commit/3ad071244332cc78e4be597a05ad5703d17787bd)

giovannipizzi commented 3 years ago

Thanks Chris. Also, I get a new error here:

    TASK [run polish workchains] ***************************************************
    Tuesday 09 February 2021  11:53:41 +0000 (0:00:00.054)       0:00:25.080 ******
fatal: [molecule-aiida-django-d44d5143-8757-499e-bee6-a4952d2d9314]: FAILED! => changed=true
  cmd: |-
    set -e
    declare -a EXPRESSIONS=('1 -2 -1 4 -5 -5 * * * * +' '2 1 3 3 -1 + ^ ^ +' '3 -5 -1 -4 + * ^' '2 4 2 -4 * * +' '3 1 1 5 ^ ^ ^')
    for expression in "${EXPRESSIONS[@]}"; do
      /opt/conda/bin/verdi -p django run --auto-group -l polish -- "${HOME}/django/polish/cli.py" -X add! -C -F -d -t 600 "$expression"
    done
  delta: '0:12:15.919423'
  end: '2021-02-09 12:05:58.306043'
  msg: non-zero return code
  rc: 1
  start: '2021-02-09 11:53:42.386620'
  stderr: ''
  stderr_lines: <omitted>
  stdout: |-
    Expression: 1 -2 -1 4 -5 -5 * * * * +
    Evaluated : 201
    Workchain : uuid: 0df81fa3-34c3-4fc8-95c1-d8b2d66a9f10 (pk: 131) value: 201 <4>
    Success: the workchain accurately reproduced the evaluated value in 270.50s
    Expression: 2 1 3 3 -1 + ^ ^ +
    Evaluated : 10
    Workchain : uuid: 8bb1ec7f-24c5-4932-ae83-337d837fa471 (pk: 182) value: 10 <134>

    PLAY RECAP *********************************************************************
    Success: the workchain accurately reproduced the evaluated value in 115.19s
    Expression: 3 -5 -1 -4 + * ^
    Evaluated : 15625
    Workchain : uuid: e0393f2f-db68-4fe9-96bf-948ab0fc1820 (pk: 285) value: 15625 <185>
    Success: the workchain accurately reproduced the evaluated value in 138.40s
    Expression: 2 4 2 -4 * * +
    Evaluated : 999970
    Workchain : uuid: 8b95f070-b36f-405b-a83b-d2d29b45001c (pk: 319) value: 999970 <288>
    Success: the workchain accurately reproduced the evaluated value in 85.14s
    Expression: 3 1 1 5 ^ ^ ^
    Failed: the workchain<322> did not return a result output node
  stdout_lines: <omitted>
    molecule-aiida-django-d44d5143-8757-499e-bee6-a4952d2d9314 : ok=6    changed=4    unreachable=0    failed=1    skipped=1    rescued=0    ignored=1

    Playbook run took 0 days, 0 hours, 12 minutes, 41 seconds
    Tuesday 09 February 2021  12:05:58 +0000 (0:12:16.398)       0:12:41.479 ******
    ===============================================================================
    run polish workchains ------------------------------------------------- 736.40s
    Reset pythonpath of daemon (2 workers) ---------------------------------- 9.43s
    Copy workchain files ---------------------------------------------------- 6.19s
    verdi add code setup ---------------------------------------------------- 4.70s
    Check if add code is already present ------------------------------------ 4.12s
    get python path including workchains ------------------------------------ 0.45s
    set_fact ---------------------------------------------------------------- 0.10s
    include_tasks ----------------------------------------------------------- 0.05s
ERROR: 
An error occurred during the test sequence action: 'verify'. Cleaning up.

Do you know why?
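
For reference, the "Evaluated" values in the log are consistent with one particular reading of the expressions: operands come first, operators last, and each operator combines the running result with the next operand taken from the right. The sketch below assumes that reading. It is not the actual polish/cli.py code, the function name is hypothetical, and the modulo of 1000000 is only inferred from the value 999970 in the log.

```python
import operator

# Operators that appear in the expressions in the log.
OPERATORS = {"+": operator.add, "*": operator.mul, "^": operator.pow}


def evaluate_polish(expression, modulo=1_000_000):
    """Evaluate an 'operands first, operators last' expression.

    Assumption: the running result starts as the right-most operand and
    each operator is applied as `result <op> next_operand_from_the_right`.
    The default modulo is inferred from the log, not taken from the source.
    """
    tokens = expression.split()
    operands = [int(token) for token in tokens if token not in OPERATORS]
    symbols = [token for token in tokens if token in OPERATORS]

    result = operands.pop()
    for symbol in symbols:
        result = OPERATORS[symbol](result, operands.pop())
    return result % modulo


if __name__ == "__main__":
    for expr in (
        "1 -2 -1 4 -5 -5 * * * * +",
        "2 1 3 3 -1 + ^ ^ +",
        "3 -5 -1 -4 + * ^",
        "2 4 2 -4 * * +",
    ):
        print(expr, "->", evaluate_polish(expr))
```

Under these assumptions the four expressions that succeeded give 201, 10, 15625 and 999970, matching the log.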

chrisjsewell commented 3 years ago

No, that's why I've merged #4729; it seems to happen sometimes when Jenkins is under heavy load.

chrisjsewell commented 3 years ago

There actually seem to be multiple reasons:

https://theossrv6.epfl.ch/jenkins/blue/organizations/jenkins/aiida_core_aiidateam/detail/PR-4729/1/pipeline

{'version': {'core': '1.5.2'}, 'exception': 'Traceback (most recent call last):\n  File "/opt/conda/lib/python3.7/site-packages/aiida/orm/utils/managers.py", line 83, in __getattr__\n    return self._get_node_by_link_label(label=name)\n  File "/opt/conda/lib/python3.7/site-packages/aiida/orm/utils/managers.py", line 64, in _get_node_by_link_label\n    return self._node.get_outgoing(link_type=self._link_type).get_node_by_label(label)\n  File "/opt/conda/lib/python3.7/site-packages/aiida/orm/utils/links.py", line 300, in get_node_by_label\n    raise exceptions.NotExistent(f\'no neighbor with the label {label} found\')\naiida.common.exceptions.NotExistent: no neighbor with the label result found\n\nDuring handling of the above exception, another exception occurred:\n\naiida.common.exceptions.NotExistentAttributeError: Node<143> does not have an output with link label \'result\'\n', 'checkpoints': '!plumpy:bundle\n\'!!meta\':\n  class_name: polish_workchains.polish_f7863ed90d43505883874975e9377e76:Polish00WorkChain\n  types:\n    _future: S\n  user:\n    object_loader: aiida.engine.persistence:ObjectLoader\nCONTEXT: !aiida_attributedict\n  calculations:\n  - !aiida_node \'b563bcc2-bbdf-4ab0-b423-925296b105ca\'\n  iterators: []\n  iterators_sign: []\n  iterators_stack: []\n  operands:\n  - 3\n  - 3\n  - -1\n  result: !aiida_node \'3b945e28-368d-4564-ab19-c70ce2f5fc57\'\n  workchains:\n  - !aiida_node \'155a41b0-7501-45f5-bcea-2f76d21297a9\'\nINPUTS_PARSED: "!plumpy:attributes_frozendict\\ncode: !aiida_node \'7cc094b0-55ef-4b66-b0e4-c233ba2026e1\'\\n\\\n  metadata: !plumpy:attributes_frozendict\\n  call_link_label: CALL\\n  store_provenance:\\\n  \\ true\\nmodulo: !aiida_node \'960b50ab-3288-4897-b67a-6df9bf683fc8\'\\noperands: !aiida_node\\\n  \\ \'4da6ca30-4030-492d-aaf5-c5d724b8f2a6\'\\n"\nINPUTS_RAW: \'!plumpy:attributes_frozendict\n\n  code: !aiida_node \'\'7cc094b0-55ef-4b66-b0e4-c233ba2026e1\'\'\n\n  modulo: !aiida_node 
\'\'960b50ab-3288-4897-b67a-6df9bf683fc8\'\'\n\n  operands: !aiida_node \'\'4da6ca30-4030-492d-aaf5-c5d724b8f2a6\'\'\n\n  \'\n_awaitables: []\n_creation_time: 1612875513.0140836\n_enable_persistence: true\n_future:\n  \'!!meta\':\n    class_name: plumpy.persistence:SavableFuture\n  _result: null\n  _state: PENDING\n_parent_pid: null\n_paused: null\n_pid: 134\n_pre_paused_status: null\n_state:\n  \'!!meta\':\n    class_name: plumpy.process_states:Running\n  args: !!python/tuple []\n  in_state: true\n  kwargs: {}\n  run_fn: _do_step\n_status: null\ncalc_id: 134\nstepper_state:\n  \'!!meta\':\n    class_name: plumpy.workchains:_BlockStepper\n  _pos: 3\n  stepper_state:\n    \'!!meta\':\n      class_name: plumpy.workchains:_FunctionStepper\n    _fn: post_raise_power\n', 'process_label': 'Polish00WorkChain', 'process_state': 'excepted', 'stepper_state_info': '3:post_raise_power'}

https://theossrv6.epfl.ch/jenkins/blue/organizations/jenkins/aiida_core_aiidateam/detail/develop/996/pipeline

{'sealed': True, 'version': {'core': '1.5.2'}, 'exception': 'concurrent.futures._base.TimeoutError\n', 'process_label': 'Polish00WorkChain', 'process_state': 'excepted', 'process_status': 'Waiting for child processes: 325, 326', 'stepper_state_info': '1:raise_power'}
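
The two attribute dumps differ mainly in the last line of the exception and in stepper_state_info. A small sketch for pulling just those fields out of such a dictionary could look like this (the helper name is hypothetical; the keys are the ones visible in the dumps above):

```python
def summarize_failure(attributes):
    """Return (process_label, stepper_state_info, last traceback line)."""
    exception = attributes.get("exception", "")
    # The final non-empty line of a traceback names the exception raised.
    lines = [line.strip() for line in exception.splitlines() if line.strip()]
    error = lines[-1] if lines else ""
    return (
        attributes.get("process_label", ""),
        attributes.get("stepper_state_info", ""),
        error,
    )


if __name__ == "__main__":
    # Attributes copied from the second dump above.
    timeout_case = {
        "sealed": True,
        "version": {"core": "1.5.2"},
        "exception": "concurrent.futures._base.TimeoutError\n",
        "process_label": "Polish00WorkChain",
        "process_state": "excepted",
        "process_status": "Waiting for child processes: 325, 326",
        "stepper_state_info": "1:raise_power",
    }
    print(summarize_failure(timeout_case))
```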

In #4733 I am going to add a retry for the workchain executions, to see if that will mitigate the failures, but obviously in time I/we should look into these more closely.
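
As an illustration of the retry idea only (this is not the change made in #4733, and all names here are hypothetical), a generic retry wrapper could be sketched as:

```python
import time


def run_with_retries(func, attempts=3, delay=0.0, exceptions=(Exception,)):
    """Call `func` until it succeeds or `attempts` calls have failed.

    Re-raises the last exception if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise
            # Pause before retrying, e.g. to let a loaded machine recover.
            time.sleep(delay)


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky():
        # Fails on the first two calls, then succeeds.
        calls["count"] += 1
        if calls["count"] < 3:
            raise RuntimeError("transient failure")
        return "ok"

    print(run_with_retries(flaky, attempts=3))
```

In a test run, the callable could wrap one workchain execution, for example a subprocess.run(..., check=True) call on the verdi command shown in the log, so that a non-zero return code triggers another attempt. A retry only hides the TimeoutError and missing-output failures; it does not explain them.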

sphuber commented 2 years ago

@chrisjsewell since Jenkins has been decommissioned, can we close this? Or was this rather molecule-specific? The .molecule folder is still present in the source tree on develop. Should it be removed, or is it still being used?

sphuber commented 2 years ago

Pinging @chrisjsewell. Are we still using the tests in .molecule? Are they even up to date?