aiidateam / aiida-core

The official repository for the AiiDA code
https://aiida-core.readthedocs.io
Other
434 stars 187 forks source link

Excepted Workchain is not sealed #6548

Closed kmlefran closed 2 months ago

kmlefran commented 3 months ago

Describe the bug

A Workchain was excepted, but the node was not sealed.

# excepted WorkChainNode in my database
node = orm.load_node(114060)
node.is_sealed
# Returns False, should return True

Steps to reproduce

Steps to reproduce the behavior:

  1. Install aiida-aimall 0.7.4 via pip
  2. Find the relevant files at a repository hosting the Python code
  3. Put minimal_workchains.py in the Python path,
  4. Adjust the import of DiffuseImplicitHybridWorkchain in the notebook errortest.ipynb to match the python file, and the Gaussian code input to match your setup Gaussian Code
  5. Execute the cells in errortest.ipynb
  6. Wait
  7. Workchain excepts in the last step due to a code error in this version, resulting in an unsealed WorkChain

Expected behavior

The excepted Workchain should be sealed, but is not

Your environment

Additional context

Coworker's code, just submitting the issue as I was able to reproduce an excepted unsealed Workchain as mentioned in the (AiiDA discourse group )[https://aiida.discourse.group/t/how-to-reset-fix-fromgroupsubmissioncontroller/426/7?u=kmlefran]

sphuber commented 3 months ago

Thanks for the detailed report. Could you please report the output of verdi process report and verdi node attributes for the excepted workchain?

kmlefran commented 3 months ago

verdi process report

2024-07-16 08:42:53 [10052 | REPORT]:   [114069|GaussianBaseRestartWorkChain|run_process]: launching GaussianWFXCalculation<114076> iteration #1
2024-07-16 08:42:53 [10053 | REPORT]:   [114072|GaussianBaseRestartWorkChain|run_process]: launching GaussianWFXCalculation<114077> iteration #1
2024-07-16 08:42:54 [10054 | REPORT]:   [114075|GaussianBaseRestartWorkChain|run_process]: launching GaussianWFXCalculation<114078> iteration #1
2024-07-20 22:31:13 [10096 | REPORT]:   [114072|GaussianBaseRestartWorkChain|results]: The work chain completed after 1 iterations
2024-07-20 22:31:13 [10097 | REPORT]:   [114072|GaussianBaseRestartWorkChain|on_terminated]: remote folders will not be cleaned
2024-07-21 13:26:29 [10099 | REPORT]:   [114069|GaussianBaseRestartWorkChain|results]: The work chain completed after 1 iterations
2024-07-21 13:26:29 [10100 | REPORT]:   [114069|GaussianBaseRestartWorkChain|on_terminated]: remote folders will not be cleaned
2024-07-21 16:14:37 [10102 | REPORT]:   [114075|GaussianBaseRestartWorkChain|results]: The work chain completed after 1 iterations
2024-07-21 16:14:37 [10103 | REPORT]:   [114075|GaussianBaseRestartWorkChain|on_terminated]: remote folders will not be cleaned
2024-07-21 16:14:39 [10104 | REPORT]: [114060|DiffuseImplicitHybridWorkChain|on_except]: Traceback (most recent call last):
  File "/Users/chemlab/anaconda3/envs/errortest/lib/python3.12/site-packages/plumpy/base/state_machine.py", line 324, in transition_to
    self._enter_next_state(new_state)
  File "/Users/chemlab/anaconda3/envs/errortest/lib/python3.12/site-packages/plumpy/base/state_machine.py", line 388, in _enter_next_state
    self._fire_state_event(StateEventHook.ENTERED_STATE, last_state)
  File "/Users/chemlab/anaconda3/envs/errortest/lib/python3.12/site-packages/plumpy/base/state_machine.py", line 300, in _fire_state_event
    callback(self, hook, state)
  File "/Users/chemlab/anaconda3/envs/errortest/lib/python3.12/site-packages/plumpy/processes.py", line 337, in <lambda>
    lambda _s, _h, from_state: self.on_entered(cast(Optional[process_states.State], from_state)),
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/chemlab/anaconda3/envs/errortest/lib/python3.12/site-packages/aiida/engine/processes/process.py", line 426, in on_entered
    self.update_outputs()
  File "/Users/chemlab/anaconda3/envs/errortest/lib/python3.12/site-packages/aiida/engine/processes/process.py", line 665, in update_outputs
    output.base.links.add_incoming(self.node, LinkType.RETURN, link_label)
  File "/Users/chemlab/anaconda3/envs/errortest/lib/python3.12/site-packages/aiida/orm/nodes/links.py", line 58, in add_incoming
    source.base.links.validate_outgoing(self._node, link_type, link_label)
  File "/Users/chemlab/anaconda3/envs/errortest/lib/python3.12/site-packages/aiida/orm/nodes/process/workflow/workflow.py", line 42, in validate_outgoing
    raise ValueError(
ValueError: Workflow<DiffuseImplicitHybridWorkChain> tried returning an unstored `Data` node. This likely means new `Data` is being created inside the workflow. In order to preserve data provenance, use a `calcfunction` to create this node and return its output from the workflow

verdi node attributes

{
    "checkpoints": "!plumpy:bundle\n'!!meta':\n  class_name: arhender_aiida_workchains.workchains:DiffuseImplicitHybridWorkChain\n  types:\n    _event_helper: S\n    _future: S\n  user:\n    object_loader: aiida.engine.persistence:ObjectLoader\nCONTEXT: !aiida_attributedict\n  g16: !aiida_attributedict\n    diffuse: !aiida_node 'd4e32c0d-cf52-432b-baa8-48526569bc0d'\n    hybrid: !aiida_node '34b4f8c8-ed33-47c4-a128-70e393a03c2f'\n    implicit: !aiida_node '82d3ab34-838c-47ea-bd96-72a96e4aff1c'\nINPUTS_PARSED: \"!plumpy:attributes_frozendict\\ndiffuse_basis_set_data: !aiida_node\\\n  \\ 'a6f2fb7b-1dbe-49d6-a4c6-cb3a4ebb0189'\\ng16_base_params: !aiida_node '7242f097-6ce3-46f5-8168-a4408d66e62b'\\n\\\n  g16_code: !aiida_node '058d0669-62a4-4c23-928e-88f127d8751d'\\ninput_structure: !aiida_node\\\n  \\ 'c8c9730c-0c2d-4aa9-ab14-c73cefa1524c'\\nmetadata: !plumpy:attributes_frozendict\\n\\\n  \\  call_link_label: CALL\\n  store_provenance: true\\nmolecule_name: !aiida_node '1250a691-bf46-445c-a28d-97cecb582480'\\n\\\n  original_parsed_data: !aiida_node '8b948994-f2a0-404f-8ec7-fa4591ade094'\\nsolvent:\\\n  \\ !aiida_node 'ddf77f5e-b829-4730-8a1c-89edb505cc4d'\\n\"\nINPUTS_RAW: '!plumpy:attributes_frozendict\n\n  diffuse_basis_set_data: !aiida_node ''a6f2fb7b-1dbe-49d6-a4c6-cb3a4ebb0189''\n\n  g16_base_params: !aiida_node ''7242f097-6ce3-46f5-8168-a4408d66e62b''\n\n  g16_code: !aiida_node ''058d0669-62a4-4c23-928e-88f127d8751d''\n\n  input_structure: !aiida_node ''c8c9730c-0c2d-4aa9-ab14-c73cefa1524c''\n\n  molecule_name: !aiida_node ''1250a691-bf46-445c-a28d-97cecb582480''\n\n  original_parsed_data: !aiida_node ''8b948994-f2a0-404f-8ec7-fa4591ade094''\n\n  solvent: !aiida_node ''ddf77f5e-b829-4730-8a1c-89edb505cc4d''\n\n  '\n_awaitables: []\n_creation_time: 1721133771.961823\n_enable_persistence: true\n_event_helper:\n  '!!meta':\n    class_name: plumpy.event_helper:EventHelper\n  _listener_type: !!python/name:plumpy.process_listener.ProcessListener ''\n  _listeners: !!set {}\n_future:\n  '!!meta':\n    class_name: plumpy.persistence:SavableFuture\n  _result: null\n  _state: PENDING\n_parent_pid: null\n_paused: null\n_pid: 114060\n_pre_paused_status: null\n_state:\n  '!!meta':\n    class_name: plumpy.process_states:Running\n  args: !!python/tuple []\n  in_state: true\n  kwargs: {}\n  run_fn: _do_step\n_status: null\ncalc_id: 114060\nstepper_state:\n  '!!meta':\n    class_name: plumpy.workchains:_BlockStepper\n  _pos: 1\n  stepper_state:\n    '!!meta':\n      class_name: plumpy.workchains:_FunctionStepper\n    _fn: result\n",
    "exception": "ValueError: Workflow<DiffuseImplicitHybridWorkChain> tried returning an unstored `Data` node. This likely means new `Data` is being created inside the workflow. In order to preserve data provenance, use a `calcfunction` to create this node and return its output from the workflow",
    "exit_status": 0,
    "metadata_inputs": null,
    "process_label": "DiffuseImplicitHybridWorkChain",
    "process_state": "excepted",
    "stepper_state_info": "1:result",
    "version": {
        "core": "2.6.1",
        "plugin": "0.1.13"
    }
}
sphuber commented 3 months ago

Thanks @kmlefran that is really useful. I see what the problem is now. The source of the problem comes from your DiffuseImplicitHybridWorkChain. It creates new output nodes in the result step and that is losing provenance so the code excepts: https://github.com/kmlefran/aiida-unsealed-workchain/blob/d95ce4e0c06f47b7b1d0cfd25b3e13bf637fd2c3/minimal_workchains.py#L256

The quick workaround would be to manually store the Dict node before passing it to self.out, but this is not recommended:

        self.out('free_energy', Dict(energy_dict).store())

Rather, you will want to create those nodes with a calcfunction so provenance is preserved. See the documentation for more details.

Still, there is also a bug in aiida-core where the workchain does not get sealed. I will see if I can make a minimal working example to reproduce it and see if I can submit a fix.

kmlefran commented 3 months ago

Thanks @kmlefran that is really useful. I see what the problem is now. The source of the problem comes from your DiffuseImplicitHybridWorkChain. It creates new output nodes in the result step and that is losing provenance so the code excepts: https://github.com/kmlefran/aiida-unsealed-workchain/blob/d95ce4e0c06f47b7b1d0cfd25b3e13bf637fd2c3/minimal_workchains.py#L256

The quick workaround would be to manually store the Dict node before passing it to self.out, but this is not recommended:

        self.out('free_energy', Dict(energy_dict).store())

Rather, you will want to create those nodes with a calcfunction so provenance is preserved. See the documentation for more details.

Still, there is also a bug in aiida-core where the workchain does not get sealed. I will see if I can make a minimal working example to reproduce it and see if I can submit a fix.

Thanks for the information! I believe the bug is old and has since been fixed, I was just using this version of my coworker's code as a case where I could reproduce the node not being sealed.