aiidateam / aiida-core

The official repository for the AiiDA code
https://aiida-core.readthedocs.io

"maximum recursion depth reached" when submitting ~200 workchains #4876

Closed ltalirz closed 1 year ago

ltalirz commented 3 years ago

@danieleongari reports the following error when submitting ~200 workchains with aiida-core 1.6.1:

2021-04-27 16:41:13 [26387 | ERROR]: Traceback (most recent call last):
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/manage/external/rmq.py", line 208, in _continue
    result = await super()._continue(communicator, pid, nowait, tag)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/plumpy/process_comms.py", line 607, in _continue
    proc = cast('Process', saved_state.unbundle(self._load_context))
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/plumpy/persistence.py", line 60, in unbundle
    return Savable.load(self, load_context)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/plumpy/persistence.py", line 452, in load
    return load_cls.recreate_from(saved_state, load_context)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/plumpy/processes.py", line 238, in recreate_from
    process = cast(Process, super().recreate_from(saved_state, load_context))
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/plumpy/persistence.py", line 477, in recreate_from
    call_with_super_check(obj.load_instance_state, saved_state, load_context)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/plumpy/base/utils.py", line 29, in call_with_super_check
    wrapped(*args, **kwargs)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/engine/processes/workchains/workchain.py", line 105, in load_instance_state
    super().load_instance_state(saved_state, load_context)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/engine/processes/process.py", line 284, in load_instance_state
    super().load_instance_state(saved_state, load_context)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/plumpy/processes.py", line 620, in load_instance_state
    decoded = self.decode_input_args(saved_state[BundleKeys.INPUTS_RAW])
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/engine/processes/process.py", line 607, in decode_input_args
    return serialize.deserialize(encoded)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/orm/utils/serialize.py", line 230, in deserialize
    return yaml.load(serialized, Loader=AiiDALoader)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/yaml/__init__.py", line 114, in load
    return loader.get_single_data()
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/yaml/constructor.py", line 43, in get_single_data
    return self.construct_document(node)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/yaml/constructor.py", line 47, in construct_document
    data = self.construct_object(node)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/yaml/constructor.py", line 92, in construct_object
    data = constructor(self, node)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/orm/utils/serialize.py", line 131, in mapping_constructor
    yaml_node = loader.construct_mapping(mapping, deep=True)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/yaml/constructor.py", line 210, in construct_mapping
    return super().construct_mapping(node, deep=deep)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/yaml/constructor.py", line 135, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/yaml/constructor.py", line 99, in construct_object
    for dummy in generator:
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/yaml/constructor.py", line 404, in construct_yaml_map
    value = self.construct_mapping(node)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/yaml/constructor.py", line 210, in construct_mapping
    return super().construct_mapping(node, deep=deep)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/yaml/constructor.py", line 135, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/yaml/constructor.py", line 92, in construct_object
    data = constructor(self, node)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/orm/utils/serialize.py", line 56, in node_constructor
    return orm.load_node(uuid=yaml_node)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/orm/utils/__init__.py", line 197, in load_node
    return load_entity(
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/orm/utils/__init__.py", line 77, in load_entity
    return entity_loader.load_entity(
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/orm/utils/loaders.py", line 213, in load_entity
    entity = builder.one()[0]
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/orm/querybuilder.py", line 2179, in one
    res = self.all()
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/orm/querybuilder.py", line 2252, in all
    matches = list(self.iterall(batch_size=batch_size))
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/orm/querybuilder.py", line 2209, in iterall
    query = self.get_query()
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/orm/querybuilder.py", line 2088, in get_query
    query = self._build()
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/orm/querybuilder.py", line 1938, in _build
    self._query = self._query.filter(self._build_filters(alias, filter_specs))
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/orm/querybuilder.py", line 1373, in _build_filters
    self._impl.get_filter_expr(
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/orm/implementation/django/querybuilder.py", line 212, in get_filter_expr
    self.get_filter_expr(
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/orm/implementation/django/querybuilder.py", line 237, in get_filter_expr
    expr = self.get_filter_expr_from_column(operator, value, column)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/aiida/orm/implementation/querybuilder.py", line 217, in get_filter_expr_from_column
    expr = database_entity.cast(String).like(value)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/orm/attributes.py", line 236, in __getattr__
    return getattr(self.comparator, key)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 987, in __getattr__
    return self._fallback_getattr(key)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/orm/properties.py", line 364, in _fallback_getattr
    return getattr(self.__clause_element__(), key)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 974, in oneshot
    result = fn(*args, **kw)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/orm/properties.py", line 316, in _memoized_method___clause_element__
    return self.adapter(self.prop.columns[0])
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/orm/util.py", line 680, in _adapt_element
    return self._adapter.traverse(elem)._annotate(
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/util.py", line 936, in traverse
    return self.columns[obj]
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/util/_collections.py", line 745, in __missing__
    self[key] = val = self.creator(self.weakself(), key)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/util.py", line 943, in _locate_col
    c = ClauseAdapter.traverse(self, col)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/visitors.py", line 240, in traverse
    return replacement_traverse(obj, self.__traverse_options__, replace)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/visitors.py", line 484, in replacement_traverse
    obj = clone(obj, **opts)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/visitors.py", line 473, in clone
    newelem = replace(elem)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/visitors.py", line 236, in replace
    e = v.replace(elem)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/util.py", line 848, in replace
    return self._corresponding_column(col, True)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/util.py", line 820, in _corresponding_column
    newcol = self.selectable.corresponding_column(
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/selectable.py", line 560, in corresponding_column
    if self.c.contains_column(column):
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 893, in __get__
    obj.__dict__[self.__name__] = result = self.fget(obj)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/selectable.py", line 647, in columns
    self._populate_column_collection()
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/selectable.py", line 1393, in _populate_column_collection
    col._make_proxy(self)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 1802, in _make_proxy
    c = self._constructor(
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 1568, in __init__
    self._init_items(*args)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 121, in _init_items
    spwd(self)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/base.py", line 461, in _set_parent_with_dispatch
    self._set_parent(parent)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 2282, in _set_parent
    self.parent._on_table_attach(self._set_table)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 1722, in _on_table_attach
    event.listen(self, "after_parent_attach", fn)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/event/api.py", line 102, in listen
    _event_key(target, identifier, fn).listen(*args, **kw)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/event/api.py", line 25, in _event_key
    tgt = evt_cls._accept_with(target)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/event/base.py", line 232, in _accept_with
    if hasattr(target, "dispatch"):
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/event/base.py", line 298, in __get__
    obj.__dict__["dispatch"] = disp = self.dispatch._for_instance(obj)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/event/base.py", line 121, in _for_instance
    return self._for_class(instance_cls)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/event/base.py", line 117, in _for_class
    return self.__class__(self, instance_cls)
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/site-packages/sqlalchemy/event/base.py", line 83, in __init__
    self._empty_listeners = self._empty_listener_reg[instance_cls]
  File "/home/daniele/anaconda3/envs/aiida_py38/lib/python3.8/weakref.py", line 383, in __getitem__
    return self.data[ref(key)]
RecursionError: maximum recursion depth exceeded while calling a Python object

The error goes away when adding a time.sleep(2) between submissions.
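As a rough illustration of that workaround, here is a minimal sketch of a throttled submission loop. The helper name `submit_throttled` and its parameters are hypothetical and not part of aiida-core; the `submit` argument is assumed to be any callable that takes one builder and returns a node, for example `aiida.engine.submit` when run inside a loaded AiiDA profile.

```python
import time


def submit_throttled(builders, submit, pause=2.0, sleep=time.sleep):
    """Submit each builder in turn, pausing between consecutive submissions.

    `submit` is assumed to be a callable such as `aiida.engine.submit`.
    `sleep` is injectable so the loop can be tested without really waiting.
    """
    nodes = []
    for index, builder in enumerate(builders):
        if index > 0:
            # Give the daemon workers time to pick up the previous task.
            sleep(pause)
        nodes.append(submit(builder))
    return nodes
```

This only spaces out the submissions. It hides the symptom and does not address whatever causes the recursion on the daemon side.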


sphuber commented 3 years ago

Does this come from the submission script or the daemon? Looks to be on the daemon side, correct?

ltalirz commented 3 years ago

Sorry, forgot to mention: it comes from the submission script, not from the daemon logs

chrisjsewell commented 3 years ago

It may also be helpful to provide the output of pip freeze, i.e. to check the version of sqlalchemy.

sphuber commented 3 years ago

> Sorry, forgot to mention: it comes from the submission script

Are you sure? Is he running the workchains in that script or sending them to the daemon? The reason I am asking is that the stack trace shows that the problem originates from the ProcessLauncher._continue call, which should not be called during submission. This is the hook that is called on a daemon runner when it receives a task from RabbitMQ to continue a process.

danieleongari commented 3 years ago

@sphuber The error comes from inspecting the report (verdi process report). The workchain is submitted from a Jupyter notebook by populating a builder and then submitting it. Of the roughly 200 calculations, the first 10 or so run fine, and then most of the rest end up excepted with the error that Leo reported. It looks like a flood of processes is causing the problem.

@chrisjsewell is this enough?

~ pip freeze | egrep sql
SQLAlchemy @ file:///home/conda/feedstock_root/build_artifacts/sqlalchemy_1612225077951/work
SQLAlchemy-Utils @ file:///home/conda/feedstock_root/build_artifacts/sqlalchemy-utils_1614043858099/work
sqlparse @ file:///home/conda/feedstock_root/build_artifacts/sqlparse_1602142927465/work

Thank you for the help!

sphuber commented 3 years ago

Thanks for the additional info, @danieleongari. It would be very useful if we could get the actual version of sqlalchemy. Maybe you can do the following in a shell or notebook:

In [1]: import sqlalchemy

In [2]: sqlalchemy.__version__
Out[2]: '1.3.23'
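Since the pip freeze output above shows conda build paths instead of version numbers, another option is to read the installed distribution metadata directly. The following is a minimal sketch, assuming Python 3.8 or later (where `importlib.metadata` is in the standard library); the helper name `installed_versions` is hypothetical.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # The package names are the ones discussed in this thread.
    for name, version in installed_versions(
        ["aiida-core", "plumpy", "kiwipy", "SQLAlchemy"]
    ).items():
        print(f"{name}: {version}")
```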

So, as expected, the problem is on the daemon worker side and not in the submit script. What happens is that the daemon worker receives a task from the RabbitMQ process queue, loads the corresponding node from the database, and then uses the YAML dump stored in the checkpoint attribute to reconstruct the Process instance in memory. The serialized version includes the entire set of process inputs, so each input is reloaded from the database, which happens in the line:

return orm.load_node(uuid=yaml_node)

from the aiida.orm.utils.serialize module. The load_node call uses the QueryBuilder underneath, which then goes into sqlalchemy, and that accounts for a large part of the stack trace. This is where I lose the thread: apparently something in the sqlalchemy code that is invoked when we call load_node leads to this infinite recursion. I have no idea why, or why it should be related to many processes being run.

The only hint is that the final method in the stack trace comes from the weakref built-in module. At this point I have to start speculating, but it could be that the daemon worker is trying to load a node that is already loaded in its memory (as part of the inputs of another process that it is working on; they could share the same Code as input, for example), and there might be some caching mechanism in place that couples the newly loaded node to the existing ref instead. I am really spitballing here and have no idea how to debug this further or try to reproduce it.
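One thing that may help narrow this down: a logged traceback only lists the frames between the handler that caught the exception and the point where it was raised, so it does not show how deep the stack already was when the task started. The sketch below measures the current depth and the remaining headroom; the helper names `stack_depth` and `recursion_headroom` are hypothetical, and `sys._getframe` is a CPython implementation detail. Logging these values at the start of a task is only a suggestion for investigation, not a confirmed diagnosis.

```python
import sys


def stack_depth():
    """Count the Python frames on the current call stack, including this one."""
    frame = sys._getframe()
    depth = 0
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth


def recursion_headroom():
    """Return roughly how many more nested calls fit before RecursionError."""
    return sys.getrecursionlimit() - stack_depth()
```

If the reported depth keeps growing from one task to the next inside the same worker, the recursion limit is being consumed before the failing call is ever made.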

ltalirz commented 3 years ago
In [1]: import sqlalchemy

In [2]: sqlalchemy.__version__
Out[2]: '1.3.23'

Sorry for the hiccup with the submission script vs. daemon; there was a miscommunication on our side.

mbercx commented 3 years ago

I'm having a somewhat similar issue, so I'll add my case to this one. When submitting 100 PwBaseWorkChains in a loop (without any pause in between 😬), half of them got stuck in the "Running" state:

$ verdi process list -S running
   PK  Created    Process label    Process State    Process status
-----  ---------  ---------------  ---------------  ----------------
36774  1h ago     PwBaseWorkChain  ⏵ Running
36783  1h ago     PwBaseWorkChain  ⏵ Running
...
37166  1h ago     PwBaseWorkChain  ⏵ Running

Total results: 49

(Note that I've already deleted and restarted one manually). The logs for these are completely empty:

$ verdi process report 37156
No log messages recorded for this entry

But a bit of digging through the daemon logs gives me the following trace:

05/03/2021 08:52:17 PM <16823> kiwipy.rmq.tasks: [ERROR] Exception occurred while processing task.
Traceback (most recent call last):
  File "/home/mbercx/.virtualenvs/aiida-sirius/lib/python3.8/site-packages/plumpy/utils.py", line 128, in __getattr__
    return self[attr]
  File "/home/mbercx/.virtualenvs/aiida-sirius/lib/python3.8/site-packages/plumpy/utils.py", line 85, in __getitem__
    return self._dict[key]
KeyError: 'kpoints'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mbercx/envs/aiida-sirius/code/aiida-quantumespresso/aiida_quantumespresso/workflows/pw/base.py", line 247, in validate_kpoints
    kpoints = self.inputs.kpoints
  File "/home/mbercx/.virtualenvs/aiida-sirius/lib/python3.8/site-packages/plumpy/utils.py", line 131, in __getattr__
    raise AttributeError(errmsg)
AttributeError: 'AttributesFrozendict' object has no attribute 'kpoints'

These two are basically repeated until a RecursionError is raised:

Traceback (most recent call last):
  File "/home/mbercx/.virtualenvs/aiida-sirius/lib/python3.8/site-packages/kiwipy/rmq/tasks.py", line 166, in _on_task
    result = await result
  File "/usr/lib/python3.8/asyncio/futures.py", line 257, in __await__
    yield self  # This tells Task to wait for completion.
  File "/usr/lib/python3.8/asyncio/tasks.py", line 349, in __wakeup
    future.result()
  File "/usr/lib/python3.8/asyncio/futures.py", line 175, in result
    raise self._exception
  File "/home/mbercx/.virtualenvs/aiida-sirius/lib/python3.8/site-packages/kiwipy/rmq/threadcomms.py", line 253, in done
    result = kiwi_future.result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
  File "/home/mbercx/.virtualenvs/aiida-sirius/lib/python3.8/site-packages/kiwipy/futures.py", line 54, in capture_exceptions
    yield
  File "/home/mbercx/.virtualenvs/aiida-sirius/lib/python3.8/site-packages/plumpy/communications.py", line 48, in on_done
    result = plum_future.result()
  File "/usr/lib/python3.8/asyncio/futures.py", line 175, in result
    raise self._exception
  File "/home/mbercx/.virtualenvs/aiida-sirius/lib/python3.8/site-packages/kiwipy/futures.py", line 54, in capture_exceptions
    yield
  File "/home/mbercx/.virtualenvs/aiida-sirius/lib/python3.8/site-packages/plumpy/futures.py", line 73, in run_task
    res = await coro()
  File "/home/mbercx/.virtualenvs/aiida-sirius/lib/python3.8/site-packages/plumpy/process_comms.py", line 539, in __call__
    return await self._continue(communicator, **task.get(TASK_ARGS, {}))
  File "/home/mbercx/envs/aiida-sirius/code/aiida-core/aiida/manage/external/rmq.py", line 219, in _continue
    self.handle_continue_exception(node, exception, message)
  File "/home/mbercx/envs/aiida-sirius/code/aiida-core/aiida/manage/external/rmq.py", line 158, in handle_continue_exception
    node.logger.exception(message)
  File "/usr/lib/python3.8/logging/__init__.py", line 1814, in exception
    self.log(ERROR, msg, *args, exc_info=exc_info, **kwargs)
  File "/usr/lib/python3.8/logging/__init__.py", line 1829, in log
    self.logger.log(level, msg, *args, **kwargs)
  File "/usr/lib/python3.8/logging/__init__.py", line 1500, in log
    self._log(level, msg, args, **kwargs)
  File "/usr/lib/python3.8/logging/__init__.py", line 1577, in _log
    self.handle(record)
  File "/usr/lib/python3.8/logging/__init__.py", line 1587, in handle
    self.callHandlers(record)
  File "/usr/lib/python3.8/logging/__init__.py", line 1649, in callHandlers
    hdlr.handle(record)
  File "/usr/lib/python3.8/logging/__init__.py", line 950, in handle
    self.emit(record)
  File "/usr/lib/python3.8/logging/__init__.py", line 1081, in emit
    msg = self.format(record)
  File "/usr/lib/python3.8/logging/__init__.py", line 925, in format
    return fmt.format(record)
  File "/usr/lib/python3.8/logging/__init__.py", line 672, in format
    record.exc_text = self.formatException(record.exc_info)
  File "/usr/lib/python3.8/logging/__init__.py", line 622, in formatException
    traceback.print_exception(ei[0], ei[1], tb, None, sio)
  File "/usr/lib/python3.8/traceback.py", line 103, in print_exception
    for line in TracebackException(
  File "/usr/lib/python3.8/traceback.py", line 493, in __init__
    context = TracebackException(
  File "/usr/lib/python3.8/traceback.py", line 493, in __init__
    context = TracebackException(
  File "/usr/lib/python3.8/traceback.py", line 493, in __init__
    context = TracebackException(
  [Previous line repeated 34 more times]
  File "/usr/lib/python3.8/traceback.py", line 476, in __init__
    _seen.add(id(exc_value))
RecursionError: maximum recursion depth exceeded while calling a Python object
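The repeated `TracebackException.__init__` frames at the end of this trace show the formatter calling itself once per chained exception, so a long enough chain of `__context__` links can exhaust the recursion limit while the error is merely being logged. The sketch below walks such a chain iteratively and formats only the outermost exception; the helper names `build_chain`, `chain_length` and `format_without_chain` are hypothetical, and the chain is built artificially for illustration.

```python
import traceback


def build_chain(n):
    """Build n exceptions linked through __context__, newest first."""
    exc = None
    for i in range(n):
        new = RuntimeError(f"failure {i}")
        new.__context__ = exc
        exc = new
    return exc


def chain_length(exc):
    """Count the exceptions reachable via __cause__/__context__, without recursion."""
    seen = set()
    length = 0
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        length += 1
        exc = exc.__cause__ or exc.__context__
    return length


def format_without_chain(exc):
    """Format only the exception itself, ignoring any chained exceptions."""
    return "".join(traceback.format_exception_only(type(exc), exc)).rstrip()
```

Logging `chain_length(exception)` next to `format_without_chain(exception)` in an exception handler would show how long the chain has become without triggering the recursive formatting path.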

After deleting one and restarting it, it ran just fine. I've also started ~35 work chains like this just fine. I then tried deleting the remaining 49 and restarting them all at once, and ended up with 7 PwBaseWorkChains stuck in Running with the same error trace as above, followed by a whole series of reports of launching the PwCalculations:

RecursionError: maximum recursion depth exceeded while calling a Python object
05/03/2021 10:28:13 PM <17394> aiida.orm.nodes.process.workflow.workchain.WorkChainNode: [REPORT] [37947|PwBaseWorkChain|run_process]: launching PwCalculation<37961> iteration #1
05/03/2021 10:28:14 PM <17394> aiida.orm.nodes.process.workflow.workchain.WorkChainNode: [REPORT] [37849|PwBaseWorkChain|run_process]: launching PwCalculation<37964> iteration #1
...

Afterwards, I deleted all 100 work chains and restarted them with a 5-second pause in between. This time all work chains were able to create the PwCalculations without issue.

I'm running:

OS: Ubuntu 18.04.5 LTS
Python: 3.8
aiida-core: 1.6.3 (but I think I've also spotted this issue when I was running v1.6.1, just couldn't pin it down at the time)
plumpy: 0.19.0
kiwipy: 0.7.4
sqlalchemy: 1.3.23

dev-zero commented 3 years ago

Seeing the same thing, also when submitting a large number of workchains. It does not matter whether they are submitted one by one in a script or launched by a parent workchain, but launching them via a parent workchain triggers the issue sooner.

The backtrace is a bit different, though:

2021-05-14 10:55:09 [1739 | REPORT]: [6047|Cp2kEosWorkChain|on_except]: Traceback (most recent call last):
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/process_states.py", line 230, in execute
    result = self.run_fn(*self.args, **self.kwargs)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/workchains/workchain.py", line 214, in _do_step
    finished, stepper_result = self._stepper.step()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/workchains.py", line 299, in step
    finished, result = self._child_stepper.step()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/workchains.py", line 250, in step
    return True, self._fn(self._workchain)
  File "/scratch/tiziano/work/aiida/aiida-cp2k/aiida_cp2k/workchains/eos.py", line 170, in run_calculations
    builder.cp2k.structure = get_rescaled_structure(self.inputs.structure, Float(scale))
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/functions.py", line 179, in decorated_function
    result, _ = run_get_node(*args, **kwargs)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/functions.py", line 131, in run_get_node
    process = process_class(inputs=inputs, runner=runner)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 193, in __call__
    inst.transition_to(inst.create_initial_state())
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 335, in transition_to
    self.transition_failed(initial_state_label, label, *sys.exc_info()[1:])
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 351, in transition_failed
    raise exception.with_traceback(trace)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 320, in transition_to
    self._enter_next_state(new_state)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 382, in _enter_next_state
    self._fire_state_event(StateEventHook.ENTERING_STATE, next_state)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 299, in _fire_state_event
    callback(self, hook, state)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/processes.py", line 324, in <lambda>
    lambda _s, _h, state: self.on_entering(cast(process_states.State, state)),
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/process.py", line 380, in on_entering
    super().on_entering(state)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/processes.py", line 669, in on_entering
    call_with_super_check(self.on_create)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/base/utils.py", line 29, in call_with_super_check
    wrapped(*args, **kwargs)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/process.py", line 376, in on_create
    self._pid = self._create_and_setup_db_record()  # pylint: disable=attribute-defined-outside-init
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/process.py", line 563, in _create_and_setup_db_record
    self._setup_db_record()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/functions.py", line 362, in _setup_db_record
    super()._setup_db_record()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/process.py", line 672, in _setup_db_record
    self._setup_inputs()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/process.py", line 709, in _setup_inputs
    self.node.add_incoming(node, LinkType.INPUT_CALC, name)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/orm/nodes/node.py", line 417, in add_incoming
    self.validate_incoming(source, link_type, link_label)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/orm/nodes/process/process.py", line 472, in validate_incoming
    super().validate_incoming(source, link_type, link_label)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/orm/utils/mixins.py", line 148, in validate_incoming
    super().validate_incoming(source, link_type=link_type, link_label=link_label)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/orm/nodes/node.py", line 449, in validate_incoming
    if builder.count() > 0:
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/orm/querybuilder.py", line 2164, in count
    return self._impl.count(query)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/orm/implementation/querybuilder.py", line 288, in count
    return query.count()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3803, in count
    return self.from_self(col).scalar()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3523, in scalar
    ret = self.one()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3490, in one
    ret = self.one_or_none()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3459, in one_or_none
    ret = list(self)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
    return self._execute_and_instances(context)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3560, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1115, in _execute_clauseelement
    compiled_sql = elem.compile(
  File "<string>", line 1, in <lambda>
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 481, in compile
    return self._compiler(dialect, bind=bind, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 487, in _compiler
    return dialect.statement_compiler(dialect, self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 592, in __init__
    Compiled.__init__(self, dialect, statement, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 322, in __init__
    self.string = self.process(self.statement, **compile_kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 352, in process
    return obj._compiler_dispatch(self, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2201, in visit_select
    text = self._compose_select_body(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2292, in _compose_select_body
    [
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2293, in <listcomp>
    f._compiler_dispatch(self, asfrom=True, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1801, in visit_alias
    ret = alias.original._compiler_dispatch(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2201, in visit_select
    text = self._compose_select_body(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2292, in _compose_select_body
    [
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2293, in <listcomp>
    f._compiler_dispatch(self, asfrom=True, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2435, in visit_join
    join.left._compiler_dispatch(self, asfrom=True, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2437, in visit_join
    + join.right._compiler_dispatch(self, asfrom=True, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1727, in visit_cte
    self.visit_cte(pre_alias_cte, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1758, in visit_cte
    cte.original._compiler_dispatch(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1216, in visit_compound_select
    text = (" " + keyword + " ").join(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1218, in <genexpr>
    c._compiler_dispatch(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2201, in visit_select
    text = self._compose_select_body(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2301, in _compose_select_body
    t = select._whereclause._compiler_dispatch(self, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1040, in visit_clauselist
    text = sep.join(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1040, in <genexpr>
    text = sep.join(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1043, in <genexpr>
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1040, in visit_clauselist
    text = sep.join(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1040, in <genexpr>
    text = sep.join(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1043, in <genexpr>
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1354, in visit_binary
    return disp(binary, operator_, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1471, in visit_like_op_binary
    binary.left._compiler_dispatch(self, **kw),
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1075, in visit_cast
    cast.clause._compiler_dispatch(self, **kwargs),
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/annotation.py", line 79, in _compiler_dispatch
    return self.__element.__class__._compiler_dispatch(self, visitor, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 912, in visit_column
    if not is_literal and isinstance(name, elements._truncated_label):
RecursionError: maximum recursion depth exceeded while calling a Python object

aiida-core: 403f7e7d896f8408d0cacee9fe41c030a1072eaf, plumpy: 0.19.0, kiwipy: 0.7.4, sqlalchemy: 1.3.24, python: 3.9.5
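
The tracebacks in this thread end in different frames each time. A minimal sketch of why that can happen, assuming the stack is already deep when the query is compiled: an otherwise harmless nested call raises `RecursionError` only when little headroom is left. All function names below are hypothetical stand-ins; nothing here is AiiDA, plumpy or SQLAlchemy code.

```python
import inspect
import sys


def frames_in_use():
    """Number of Python frames currently on the stack (including this one)."""
    return len(inspect.stack(0))


def shallow_work(depth):
    """Hypothetical stand-in for a library call that needs `depth` nested frames."""
    if depth == 0:
        return 0
    return 1 + shallow_work(depth - 1)


def call_at_depth(extra_frames, fn):
    """Push `extra_frames` additional frames onto the stack, then call fn()."""
    if extra_frames == 0:
        return fn()
    return call_at_depth(extra_frames - 1, fn)


def fails_when_stack_is_deep(extra_frames, needed=50):
    """Return True if a call needing `needed` frames hits the recursion limit
    when started `extra_frames` frames deeper than the caller."""
    try:
        call_at_depth(extra_frames, lambda: shallow_work(needed))
        return False
    except RecursionError:
        return True


if __name__ == "__main__":
    headroom = sys.getrecursionlimit() - frames_in_use()
    print("frames still available:", headroom)
    print("fails near the top of the stack:", fails_when_stack_is_deep(0))
    print("fails close to the limit:", fails_when_stack_is_deep(headroom - 20))
```

The same `shallow_work` call succeeds or fails depending only on how many frames are already in use, so the frame named in the last line of a traceback says little about what consumed the stack.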

zhubonan commented 3 years ago

For comparison: I previously submitted a similar batch of work chains (~100 VaspRelaxWorkChain) in a loop without any delay (aiida-core < 1.6.1) and saw no issue. In my case, though, the submission itself is rather slow: each call to submit takes about 1.5 seconds.
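
A minimal sketch of a paced submission loop of the kind described above, for anyone who wants to test whether the spacing between submissions matters. `submit_with_delay`, `submit_fn` and `delay_s` are hypothetical names: `submit_fn` stands in for whatever submit call is used (for example `aiida.engine.submit`), and nothing in this thread establishes that a delay avoids the error.

```python
import time


def submit_with_delay(builders, submit_fn, delay_s=1.5, sleep=time.sleep):
    """Call submit_fn on each builder, sleeping delay_s seconds between calls.

    `sleep` is injectable so the pacing can be checked without actually waiting.
    """
    nodes = []
    for index, builder in enumerate(builders):
        if index > 0:
            sleep(delay_s)  # pause before every submission except the first
        nodes.append(submit_fn(builder))
    return nodes


if __name__ == "__main__":
    # Dummy "submission" that just echoes its input, with a short delay.
    print(submit_with_delay(["a", "b", "c"], lambda b: f"submitted {b}", delay_s=0.1))
```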

dev-zero commented 3 years ago

Here is another one, again failing at a different place (this one comes from the sub-workchain called by the primary one):

$ verdi process report 15370
2021-05-17 10:26:29 [7359 | REPORT]: [15370|Cp2kBaseWorkChain|on_except]: Traceback (most recent call last):
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/process_states.py", line 230, in execute
    result = self.run_fn(*self.args, **self.kwargs)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/workchains/workchain.py", line 214, in _do_step
    finished, stepper_result = self._stepper.step()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/workchains.py", line 299, in step
    finished, result = self._child_stepper.step()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/workchains.py", line 532, in step
    finished, result = self._child_stepper.step()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/workchains.py", line 299, in step
    finished, result = self._child_stepper.step()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/workchains.py", line 250, in step
    return True, self._fn(self._workchain)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/workchains/restart.py", line 183, in run_process
    node = self.submit(self.process_class, **inputs)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/process.py", line 498, in submit
    return self.runner.submit(process, *args, **kwargs)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/runners.py", line 184, in submit
    process_inited = self.instantiate_process(process, *args, **inputs)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/runners.py", line 170, in instantiate_process
    return instantiate_process(self, process, *args, **inputs)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 65, in instantiate_process
    process = process_class(runner=runner, inputs=inputs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 193, in __call__
    inst.transition_to(inst.create_initial_state())
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 335, in transition_to
    self.transition_failed(initial_state_label, label, *sys.exc_info()[1:])
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 351, in transition_failed
    raise exception.with_traceback(trace)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 320, in transition_to
    self._enter_next_state(new_state)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 382, in _enter_next_state
    self._fire_state_event(StateEventHook.ENTERING_STATE, next_state)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 299, in _fire_state_event
    callback(self, hook, state)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/processes.py", line 324, in <lambda>
    lambda _s, _h, state: self.on_entering(cast(process_states.State, state)),
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/process.py", line 380, in on_entering
    super().on_entering(state)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/processes.py", line 669, in on_entering
    call_with_super_check(self.on_create)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/plumpy/base/utils.py", line 29, in call_with_super_check
    wrapped(*args, **kwargs)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/process.py", line 376, in on_create
    self._pid = self._create_and_setup_db_record()  # pylint: disable=attribute-defined-outside-init
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/process.py", line 563, in _create_and_setup_db_record
    self._setup_db_record()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/process.py", line 672, in _setup_db_record
    self._setup_inputs()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/process.py", line 709, in _setup_inputs
    self.node.add_incoming(node, LinkType.INPUT_CALC, name)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/orm/nodes/node.py", line 417, in add_incoming
    self.validate_incoming(source, link_type, link_label)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/orm/nodes/process/process.py", line 472, in validate_incoming
    super().validate_incoming(source, link_type, link_label)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/orm/utils/mixins.py", line 148, in validate_incoming
    super().validate_incoming(source, link_type=link_type, link_label=link_label)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/orm/nodes/node.py", line 449, in validate_incoming
    if builder.count() > 0:
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/orm/querybuilder.py", line 2164, in count
    return self._impl.count(query)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/orm/implementation/querybuilder.py", line 288, in count
    return query.count()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3803, in count
    return self.from_self(col).scalar()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3523, in scalar
    ret = self.one()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3490, in one
    ret = self.one_or_none()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3459, in one_or_none
    ret = list(self)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
    return self._execute_and_instances(context)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3560, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1115, in _execute_clauseelement
    compiled_sql = elem.compile(
  File "<string>", line 1, in <lambda>
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 481, in compile
    return self._compiler(dialect, bind=bind, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 487, in _compiler
    return dialect.statement_compiler(dialect, self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 592, in __init__
    Compiled.__init__(self, dialect, statement, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 322, in __init__
    self.string = self.process(self.statement, **compile_kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 352, in process
    return obj._compiler_dispatch(self, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2201, in visit_select
    text = self._compose_select_body(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2292, in _compose_select_body
    [
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2293, in <listcomp>
    f._compiler_dispatch(self, asfrom=True, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1801, in visit_alias
    ret = alias.original._compiler_dispatch(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2201, in visit_select
    text = self._compose_select_body(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2292, in _compose_select_body
    [
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2293, in <listcomp>
    f._compiler_dispatch(self, asfrom=True, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2435, in visit_join
    join.left._compiler_dispatch(self, asfrom=True, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2437, in visit_join
    + join.right._compiler_dispatch(self, asfrom=True, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1727, in visit_cte
    self.visit_cte(pre_alias_cte, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1758, in visit_cte
    cte.original._compiler_dispatch(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1216, in visit_compound_select
    text = (" " + keyword + " ").join(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1218, in <genexpr>
    c._compiler_dispatch(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2201, in visit_select
    text = self._compose_select_body(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 2301, in _compose_select_body
    t = select._whereclause._compiler_dispatch(self, **kwargs)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1040, in visit_clauselist
    text = sep.join(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1040, in <genexpr>
    text = sep.join(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1043, in <genexpr>
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1040, in visit_clauselist
    text = sep.join(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1040, in <genexpr>
    text = sep.join(
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1043, in <genexpr>
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1354, in visit_binary
    return disp(binary, operator_, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1471, in visit_like_op_binary
    binary.left._compiler_dispatch(self, **kw),
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1075, in visit_cast
    cast.clause._compiler_dispatch(self, **kwargs),
RecursionError: maximum recursion depth exceeded in comparison

dev-zero commented 3 years ago

I took the liberty of adding the important label, because this leaves a considerable mess to clean up (one also has to check all subprocesses).

dev-zero commented 3 years ago

@danieleongari is the computer used to run the calculations configured with an SSH proxy command?

giovannipizzi commented 3 years ago

I am seeing the same issue when submitting many common workflows. I get two different behaviours (probably because the recursion limit is hit in different places):

2021-06-11 07:03:21 [1133 | REPORT]: [6664|EquationOfStateWorkChain|run_init]: submitting `QuantumEspressoCommonRelaxWorkChain` for scale_factor `uuid: 8c5be874-7345-478e-9139-e44f3a6e00cc (pk: 6758) value: 0.94`
2021-06-11 07:04:02 [1193 | REPORT]:     [7525|PwRelaxWorkChain|setup]: No change in volume possible for the provided base input parameters. Meta convergence is turned off.
2021-06-11 07:04:02 [1194 | REPORT]:     [7525|PwRelaxWorkChain|setup]: Work chain will not run final SCF when `calculation` is set to `scf` for the relaxation `PwBaseWorkChain`.
2021-06-11 07:04:03 [1199 | REPORT]:     [7525|PwRelaxWorkChain|run_relax]: launching PwBaseWorkChain<7628>
2021-06-11 07:05:01 [1451 | REPORT]:       [7628|PwBaseWorkChain|run_process]: launching PwCalculation<9311> iteration #1
2021-06-11 07:48:44 [4902 | REPORT]:       [7628|PwBaseWorkChain|results]: work chain completed after 1 iterations
2021-06-11 07:48:44 [4903 | REPORT]:       [7628|PwBaseWorkChain|on_terminated]: remote folders will not be cleaned
2021-06-11 07:49:26 [4924 | REPORT]:     [7525|PwRelaxWorkChain|results]: workchain completed after 1 iterations
2021-06-11 07:49:29 [4928 | REPORT]:     [7525|PwRelaxWorkChain|on_terminated]: cleaned remote folders of calculations: 9311
2021-06-11 07:50:03 [4960 | REPORT]: [6664|EquationOfStateWorkChain|on_except]: Traceback (most recent call last):
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/process_states.py", line 230, in execute
    result = self.run_fn(*self.args, **self.kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/workchains/workchain.py", line 214, in _do_step
    finished, stepper_result = self._stepper.step()
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/workchains.py", line 299, in step
    finished, result = self._child_stepper.step()
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/workchains.py", line 250, in step
    return True, self._fn(self._workchain)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-common-workflows/aiida_common_workflows/workflows/eos.py", line 196, in run_eos
    builder, structure = self.get_sub_workchain_builder(scale_factor, reference_workchain=reference_workchain)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-common-workflows/aiida_common_workflows/workflows/eos.py", line 160, in get_sub_workchain_builder
    structure = scale_structure(self.inputs.structure, scale_factor)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/functions.py", line 179, in decorated_function
    result, _ = run_get_node(*args, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/functions.py", line 131, in run_get_node
    process = process_class(inputs=inputs, runner=runner)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/base/state_machine.py", line 193, in __call__
    inst.transition_to(inst.create_initial_state())
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/base/state_machine.py", line 335, in transition_to
    self.transition_failed(initial_state_label, label, *sys.exc_info()[1:])
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/base/state_machine.py", line 351, in transition_failed
    raise exception.with_traceback(trace)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/base/state_machine.py", line 320, in transition_to
    self._enter_next_state(new_state)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/base/state_machine.py", line 382, in _enter_next_state
    self._fire_state_event(StateEventHook.ENTERING_STATE, next_state)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/base/state_machine.py", line 299, in _fire_state_event
    callback(self, hook, state)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/processes.py", line 324, in <lambda>
    lambda _s, _h, state: self.on_entering(cast(process_states.State, state)),
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/process.py", line 380, in on_entering
    super().on_entering(state)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/processes.py", line 669, in on_entering
    call_with_super_check(self.on_create)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/base/utils.py", line 29, in call_with_super_check
    wrapped(*args, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/process.py", line 376, in on_create
    self._pid = self._create_and_setup_db_record()  # pylint: disable=attribute-defined-outside-init
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/process.py", line 563, in _create_and_setup_db_record
    self._setup_db_record()
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/functions.py", line 362, in _setup_db_record
    super()._setup_db_record()
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/process.py", line 672, in _setup_db_record
    self._setup_inputs()
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/process.py", line 709, in _setup_inputs
    self.node.add_incoming(node, LinkType.INPUT_CALC, name)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/orm/nodes/node.py", line 802, in add_incoming
    self.validate_incoming(source, link_type, link_label)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/orm/nodes/process/process.py", line 472, in validate_incoming
    super().validate_incoming(source, link_type, link_label)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/orm/utils/mixins.py", line 139, in validate_incoming
    super().validate_incoming(source, link_type=link_type, link_label=link_label)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/orm/nodes/node.py", line 834, in validate_incoming
    if builder.count() > 0:
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/orm/querybuilder.py", line 2193, in count
    return self._impl.count(query)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/orm/implementation/querybuilder.py", line 290, in count
    return query.count()
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3803, in count
    return self.from_self(col).scalar()
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3523, in scalar
    ret = self.one()
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3490, in one
    ret = self.one_or_none()
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3459, in one_or_none
    ret = list(self)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
    return self._execute_and_instances(context)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3560, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1121, in _execute_clauseelement
    else None,
  File "<string>", line 1, in <lambda>
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 481, in compile
    return self._compiler(dialect, bind=bind, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 487, in _compiler
    return dialect.statement_compiler(dialect, self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 592, in __init__
    Compiled.__init__(self, dialect, statement, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 322, in __init__
    self.string = self.process(self.statement, **compile_kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 352, in process
    return obj._compiler_dispatch(self, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2202, in visit_select
    text, select, inner_columns, froms, byfrom, kwargs
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2294, in _compose_select_body
    for f in froms
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2294, in <listcomp>
    for f in froms
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1802, in visit_alias
    self, asfrom=True, **kwargs
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2202, in visit_select
    text, select, inner_columns, froms, byfrom, kwargs
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2294, in _compose_select_body
    for f in froms
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2294, in <listcomp>
    for f in froms
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2439, in visit_join
    + join.onclause._compiler_dispatch(self, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2439, in visit_join
    + join.onclause._compiler_dispatch(self, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1727, in visit_cte
    self.visit_cte(pre_alias_cte, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1759, in visit_cte
    self, asfrom=True, **kwargs
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1225, in visit_compound_select
    for i, c in enumerate(cs.selects)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1225, in <genexpr>
    for i, c in enumerate(cs.selects)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2202, in visit_select
    text, select, inner_columns, froms, byfrom, kwargs
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2301, in _compose_select_body
    t = select._whereclause._compiler_dispatch(self, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1043, in visit_clauselist
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1041, in <genexpr>
    s
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1043, in <genexpr>
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1043, in visit_clauselist
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1041, in <genexpr>
    s
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1043, in <genexpr>
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1354, in visit_binary
    return disp(binary, operator_, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1471, in visit_like_op_binary
    binary.left._compiler_dispatch(self, **kw),
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1076, in visit_cast
    cast.typeclause._compiler_dispatch(self, **kwargs),
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 955, in visit_typeclause
    return self.dialect.type_compiler.process(typeclause.type, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 402, in process
    return type_._compiler_dispatch(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 89, in _compiler_dispatch
    meth = getter(visitor)
RecursionError: maximum recursion depth exceeded while calling a Python object
2021-06-11 07:04:01 [1190 | REPORT]: [6733|EquationOfStateWorkChain|run_init]: submitting `QuantumEspressoCommonRelaxWorkChain` for scale_factor `uuid: 41b191e3-c9f5-4bb1-b038-0099d99123eb (pk: 6963) value: 0.94`
2021-06-11 07:04:21 [1296 | REPORT]:     [8136|PwRelaxWorkChain|setup]: No change in volume possible for the provided base input parameters. Meta convergence is turned off.
2021-06-11 07:04:21 [1297 | REPORT]:     [8136|PwRelaxWorkChain|setup]: Work chain will not run final SCF when `calculation` is set to `scf` for the relaxation `PwBaseWorkChain`.
2021-06-11 07:04:22 [1301 | REPORT]:     [8136|PwRelaxWorkChain|run_relax]: launching PwBaseWorkChain<8252>
2021-06-11 07:04:28 [1335 | REPORT]:       [8252|PwBaseWorkChain|run_process]: launching PwCalculation<8449> iteration #1
2021-06-11 07:49:38 [4940 | REPORT]:       [8252|PwBaseWorkChain|results]: work chain completed after 1 iterations
2021-06-11 07:49:39 [4941 | REPORT]:       [8252|PwBaseWorkChain|on_terminated]: remote folders will not be cleaned
2021-06-11 07:49:52 [4950 | REPORT]:     [8136|PwRelaxWorkChain|results]: workchain completed after 1 iterations
2021-06-11 07:49:56 [4951 | REPORT]:     [8136|PwRelaxWorkChain|on_terminated]: cleaned remote folders of calculations: 8449
2021-06-11 07:50:01 [4955 | REPORT]:   [7611|QuantumEspressoCommonRelaxWorkChain|on_except]: Traceback (most recent call last):
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/process_states.py", line 230, in execute
    result = self.run_fn(*self.args, **self.kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/workchains/workchain.py", line 214, in _do_step
    finished, stepper_result = self._stepper.step()
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/workchains.py", line 299, in step
    finished, result = self._child_stepper.step()
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/workchains.py", line 250, in step
    return True, self._fn(self._workchain)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-common-workflows/aiida_common_workflows/workflows/relax/quantum_espresso/workchain.py", line 49, in convert_outputs
    result = extract_from_parameters(outputs.output_parameters).values()
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/functions.py", line 179, in decorated_function
    result, _ = run_get_node(*args, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/functions.py", line 131, in run_get_node
    process = process_class(inputs=inputs, runner=runner)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/base/state_machine.py", line 193, in __call__
    inst.transition_to(inst.create_initial_state())
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/base/state_machine.py", line 335, in transition_to
    self.transition_failed(initial_state_label, label, *sys.exc_info()[1:])
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/base/state_machine.py", line 351, in transition_failed
    raise exception.with_traceback(trace)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/base/state_machine.py", line 320, in transition_to
    self._enter_next_state(new_state)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/base/state_machine.py", line 382, in _enter_next_state
    self._fire_state_event(StateEventHook.ENTERING_STATE, next_state)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/base/state_machine.py", line 299, in _fire_state_event
    callback(self, hook, state)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/processes.py", line 324, in <lambda>
    lambda _s, _h, state: self.on_entering(cast(process_states.State, state)),
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/process.py", line 380, in on_entering
    super().on_entering(state)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/processes.py", line 669, in on_entering
    call_with_super_check(self.on_create)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/plumpy/base/utils.py", line 29, in call_with_super_check
    wrapped(*args, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/process.py", line 376, in on_create
    self._pid = self._create_and_setup_db_record()  # pylint: disable=attribute-defined-outside-init
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/process.py", line 563, in _create_and_setup_db_record
    self._setup_db_record()
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/functions.py", line 362, in _setup_db_record
    super()._setup_db_record()
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/process.py", line 672, in _setup_db_record
    self._setup_inputs()
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/engine/processes/process.py", line 709, in _setup_inputs
    self.node.add_incoming(node, LinkType.INPUT_CALC, name)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/orm/nodes/node.py", line 802, in add_incoming
    self.validate_incoming(source, link_type, link_label)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/orm/nodes/process/process.py", line 472, in validate_incoming
    super().validate_incoming(source, link_type, link_label)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/orm/utils/mixins.py", line 139, in validate_incoming
    super().validate_incoming(source, link_type=link_type, link_label=link_label)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/orm/nodes/node.py", line 834, in validate_incoming
    if builder.count() > 0:
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/orm/querybuilder.py", line 2193, in count
    return self._impl.count(query)
  File "/home/pizzi/.virtualenvs/aiida-prod/codes/aiida-core/aiida/orm/implementation/querybuilder.py", line 290, in count
    return query.count()
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3803, in count
    return self.from_self(col).scalar()
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3523, in scalar
    ret = self.one()
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3490, in one
    ret = self.one_or_none()
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3459, in one_or_none
    ret = list(self)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
    return self._execute_and_instances(context)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3560, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1121, in _execute_clauseelement
    else None,
  File "<string>", line 1, in <lambda>
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 481, in compile
    return self._compiler(dialect, bind=bind, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 487, in _compiler
    return dialect.statement_compiler(dialect, self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 592, in __init__
    Compiled.__init__(self, dialect, statement, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 322, in __init__
    self.string = self.process(self.statement, **compile_kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 352, in process
    return obj._compiler_dispatch(self, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2202, in visit_select
    text, select, inner_columns, froms, byfrom, kwargs
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2294, in _compose_select_body
    for f in froms
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2294, in <listcomp>
    for f in froms
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1802, in visit_alias
    self, asfrom=True, **kwargs
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2202, in visit_select
    text, select, inner_columns, froms, byfrom, kwargs
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2294, in _compose_select_body
    for f in froms
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2294, in <listcomp>
    for f in froms
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2439, in visit_join
    + join.onclause._compiler_dispatch(self, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2439, in visit_join
    + join.onclause._compiler_dispatch(self, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1727, in visit_cte
    self.visit_cte(pre_alias_cte, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1759, in visit_cte
    self, asfrom=True, **kwargs
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1225, in visit_compound_select
    for i, c in enumerate(cs.selects)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1225, in <genexpr>
    for i, c in enumerate(cs.selects)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2202, in visit_select
    text, select, inner_columns, froms, byfrom, kwargs
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 2301, in _compose_select_body
    t = select._whereclause._compiler_dispatch(self, **kwargs)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1043, in visit_clauselist
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1041, in <genexpr>
    s
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1043, in <genexpr>
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1043, in visit_clauselist
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1041, in <genexpr>
    s
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1043, in <genexpr>
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1354, in visit_binary
    return disp(binary, operator_, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1471, in visit_like_op_binary
    binary.left._compiler_dispatch(self, **kw),
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 1076, in visit_cast
    cast.typeclause._compiler_dispatch(self, **kwargs),
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 955, in visit_typeclause
    return self.dialect.type_compiler.process(typeclause.type, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/compiler.py", line 402, in process
    return type_._compiler_dispatch(self, **kw)
  File "/home/pizzi/.virtualenvs/aiida-prod/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
RecursionError: maximum recursion depth exceeded in comparison

2021-06-11 07:50:57 [5002 | REPORT]: [6733|EquationOfStateWorkChain|inspect_init]: Initial sub process did not finish successful so aborting the workchain.

The validate_incoming function of a work function seems to be (?) a common denominator of many of these reports (and possibly of others too?)
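A rough way to check that impression against a full daemon log is to count which function names show up in the tracebacks that end in a RecursionError. The snippet below is only a sketch: the helper name `functions_in_recursion_tracebacks` is made up for illustration, and it assumes the log uses the standard Python traceback layout seen in the excerpts above (`File "...", line N, in <function>` frames followed by an unindented exception line).

```python
import re
from collections import Counter

# Matches the frame lines of a standard Python traceback.
FRAME_RE = re.compile(r'^\s*File "(?P<path>[^"]+)", line \d+, in (?P<func>\S+)')


def functions_in_recursion_tracebacks(log_text: str) -> Counter:
    """Count, per function name, how many RecursionError tracebacks contain it.

    Hypothetical helper: assumes each traceback starts with a line containing
    'Traceback (most recent call last):' and ends with an unindented line
    naming the exception.
    """
    counts = Counter()
    current = None  # function names of the traceback being read, or None
    for line in log_text.splitlines():
        if 'Traceback (most recent call last):' in line:
            current = set()
            continue
        if current is None:
            continue
        match = FRAME_RE.match(line)
        if match:
            current.add(match.group('func'))
        elif line.startswith('RecursionError'):
            # Only tracebacks that end in a RecursionError are counted.
            counts.update(current)
            current = None
        elif not line.startswith(' '):
            # Another exception type or a new log record ended the traceback.
            current = None
    return counts
```

Calling it on the text of a daemon.log and looking at `most_common()` would show whether validate_incoming really appears in most of the failing tracebacks, or whether other frames are just as common.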

giovannipizzi commented 3 years ago

@chrisjsewell can you please look into this? This seems like a blocker and important issue...

chrisjsewell commented 3 years ago

> @chrisjsewell can you please look into this? This seems like a blocker and important issue...

Indeed 👍 I’m going to do a bunch of aiida-core stuff next week

sphuber commented 3 years ago

> The validate_incoming function of a work function seems to be (?)

Not sure. The original example in the OP happens when a Process gets deserialized from a process checkpoint. The examples by @dev-zero are in one case a calcfunction that is called, and in the other the submission of a process. Both of those cases point to the instantiation of a process, but they are still different from the original case, which excepts while loading a simple node from the database.
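One way to compare these cases would be to log how deep the Python stack already is at the point where each of them starts its work. The snippet below is a minimal sketch under the assumption (not established in this thread) that the failures depend on how many frames are already on the stack before the recursive code runs. The names `stack_depth`, `headroom` and `call_at_depth` are hypothetical helpers, not part of aiida-core or plumpy.

```python
import sys


def stack_depth() -> int:
    """Count the frames currently on the Python call stack.

    Uses sys._getframe, which is a CPython implementation detail.
    """
    depth = 0
    frame = sys._getframe()
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth


def headroom() -> int:
    """Frames left before sys.getrecursionlimit() is reached (approximate)."""
    return sys.getrecursionlimit() - stack_depth()


def call_at_depth(extra_frames: int, fn):
    """Call fn() from underneath extra_frames + 1 nested frames.

    Stand-in for any code path that reaches fn() with a deeper stack.
    """
    if extra_frames == 0:
        return fn()
    return call_at_depth(extra_frames - 1, fn)
```

Logging `headroom()` right before a node is loaded, a process is instantiated, or a query is compiled would show whether the three situations start from a similar stack depth. If they do, the frame that finally raises the RecursionError would mostly be incidental.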

giovannipizzi commented 3 years ago

BTW:

Since I don't think there is any sensitive information, I attach the full log (I started running from a clean profile yesterday, so the log should contain all info, not cluttered with too much else): daemon.log.zip

RobinHilg commented 3 years ago

When performing a FLEUR scandium relaxation, I ran into this error, which Vasily described yesterday on the mailing list. Here is the full traceback:

2021-08-10 13:44:18 [34571 | REPORT]: [55184|FleurCreateMagneticWorkChain|start]: INFO: started Create Magnetic Film workflow version 0.2.0

2021-08-10 13:44:18 [34572 | REPORT]: [55184|FleurCreateMagneticWorkChain|start]: INFO: EOS workchain will be submitted
2021-08-10 13:44:18 [34573 | REPORT]: [55184|FleurCreateMagneticWorkChain|run_eos]: INFO: submit EOS WorkChain
2021-08-10 13:44:18 [34574 | REPORT]: [55184|FleurCreateMagneticWorkChain|on_except]: Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/plumpy/process_states.py", line 230, in execute
    result = self.run_fn(*self.args, **self.kwargs)
  File "/opt/aiida-core/aiida/engine/processes/workchains/workchain.py", line 214, in _do_step
    finished, stepper_result = self._stepper.step()
  File "/usr/local/lib/python3.8/dist-packages/plumpy/workchains.py", line 299, in step
    finished, result = self._child_stepper.step()
  File "/usr/local/lib/python3.8/dist-packages/plumpy/workchains.py", line 432, in step
    finished, retval = self._child_stepper.step()
  File "/usr/local/lib/python3.8/dist-packages/plumpy/workchains.py", line 299, in step
    finished, result = self._child_stepper.step()
  File "/usr/local/lib/python3.8/dist-packages/plumpy/workchains.py", line 250, in step
    return True, self._fn(self._workchain)
  File "/opt/aiida-fleur/aiida_fleur/workflows/create_magnetic_film.py", line 124, in run_eos
    inputs, error = self.prepare_eos()
  File "/opt/aiida-fleur/aiida_fleur/workflows/create_magnetic_film.py", line 111, in prepare_eos
    inputs.structure = create_substrate_bulk(Dict(dict=self.ctx.wf_dict))
  File "/opt/aiida-core/aiida/engine/processes/functions.py", line 179, in decorated_function
    result, _ = run_get_node(*args, **kwargs)
  File "/opt/aiida-core/aiida/engine/processes/functions.py", line 131, in run_get_node
    process = process_class(inputs=inputs, runner=runner)
  File "/usr/local/lib/python3.8/dist-packages/plumpy/base/state_machine.py", line 193, in __call__
    inst.transition_to(inst.create_initial_state())
  File "/usr/local/lib/python3.8/dist-packages/plumpy/base/state_machine.py", line 335, in transition_to
    self.transition_failed(initial_state_label, label, *sys.exc_info()[1:])
  File "/usr/local/lib/python3.8/dist-packages/plumpy/base/state_machine.py", line 351, in transition_failed
    raise exception.with_traceback(trace)
  File "/usr/local/lib/python3.8/dist-packages/plumpy/base/state_machine.py", line 320, in transition_to
    self._enter_next_state(new_state)
  File "/usr/local/lib/python3.8/dist-packages/plumpy/base/state_machine.py", line 382, in _enter_next_state
    self._fire_state_event(StateEventHook.ENTERING_STATE, next_state)
  File "/usr/local/lib/python3.8/dist-packages/plumpy/base/state_machine.py", line 299, in _fire_state_event
    callback(self, hook, state)
  File "/usr/local/lib/python3.8/dist-packages/plumpy/processes.py", line 324, in <lambda>
    lambda _s, _h, state: self.on_entering(cast(process_states.State, state)),
  File "/opt/aiida-core/aiida/engine/processes/process.py", line 380, in on_entering
    super().on_entering(state)
  File "/usr/local/lib/python3.8/dist-packages/plumpy/processes.py", line 669, in on_entering
    call_with_super_check(self.on_create)
  File "/usr/local/lib/python3.8/dist-packages/plumpy/base/utils.py", line 29, in call_with_super_check
    wrapped(*args, **kwargs)
  File "/opt/aiida-core/aiida/engine/processes/process.py", line 376, in on_create
    self._pid = self._create_and_setup_db_record()  # pylint: disable=attribute-defined-outside-init
  File "/opt/aiida-core/aiida/engine/processes/process.py", line 563, in _create_and_setup_db_record
    self._setup_db_record()
  File "/opt/aiida-core/aiida/engine/processes/functions.py", line 362, in _setup_db_record
    super()._setup_db_record()
  File "/opt/aiida-core/aiida/engine/processes/process.py", line 672, in _setup_db_record
    self._setup_inputs()
  File "/opt/aiida-core/aiida/engine/processes/process.py", line 709, in _setup_inputs
    self.node.add_incoming(node, LinkType.INPUT_CALC, name)
  File "/opt/aiida-core/aiida/orm/nodes/node.py", line 802, in add_incoming
    self.validate_incoming(source, link_type, link_label)
  File "/opt/aiida-core/aiida/orm/nodes/process/process.py", line 472, in validate_incoming
    super().validate_incoming(source, link_type, link_label)
  File "/opt/aiida-core/aiida/orm/utils/mixins.py", line 139, in validate_incoming
    super().validate_incoming(source, link_type=link_type, link_label=link_label)
  File "/opt/aiida-core/aiida/orm/nodes/node.py", line 834, in validate_incoming
    if builder.count() > 0:
  File "/opt/aiida-core/aiida/orm/querybuilder.py", line 2193, in count
    return self._impl.count(query)
  File "/opt/aiida-core/aiida/orm/implementation/querybuilder.py", line 290, in count
    return query.count()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/query.py", line 3803, in count
    return self.from_self(col).scalar()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/query.py", line 3523, in scalar
    ret = self.one()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/query.py", line 3490, in one
    ret = self.one_or_none()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/query.py", line 3459, in one_or_none
    ret = list(self)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
    return self._execute_and_instances(context)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/query.py", line 3560, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1115, in _execute_clauseelement
    compiled_sql = elem.compile(
  File "<string>", line 1, in <lambda>
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/elements.py", line 481, in compile
    return self._compiler(dialect, bind=bind, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/elements.py", line 487, in _compiler
    return dialect.statement_compiler(dialect, self, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 592, in __init__
    Compiled.__init__(self, dialect, statement, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 322, in __init__
    self.string = self.process(self.statement, **compile_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 352, in process
    return obj._compiler_dispatch(self, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 2201, in visit_select
    text = self._compose_select_body(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 2292, in _compose_select_body
    [
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 2293, in <listcomp>
    f._compiler_dispatch(self, asfrom=True, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 1801, in visit_alias
    ret = alias.original._compiler_dispatch(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 2201, in visit_select
    text = self._compose_select_body(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 2292, in _compose_select_body
    [
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 2293, in <listcomp>
    f._compiler_dispatch(self, asfrom=True, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 2435, in visit_join
    join.left._compiler_dispatch(self, asfrom=True, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 2437, in visit_join
    + join.right._compiler_dispatch(self, asfrom=True, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 1727, in visit_cte
    self.visit_cte(pre_alias_cte, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 1758, in visit_cte
    cte.original._compiler_dispatch(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 1216, in visit_compound_select
    text = (" " + keyword + " ").join(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 1218, in <genexpr>
    c._compiler_dispatch(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 2201, in visit_select
    text = self._compose_select_body(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 2301, in _compose_select_body
    t = select._whereclause._compiler_dispatch(self, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 1040, in visit_clauselist
    text = sep.join(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 1040, in <genexpr>
    text = sep.join(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 1043, in <genexpr>
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 1040, in visit_clauselist
    text = sep.join(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 1040, in <genexpr>
    text = sep.join(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 1043, in <genexpr>
    c._compiler_dispatch(self, **kw) for c in clauselist.clauses
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 1354, in visit_binary
    return disp(binary, operator_, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 1471, in visit_like_op_binary
    binary.left._compiler_dispatch(self, **kw),
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch
    return meth(self, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/compiler.py", line 1075, in visit_cast
    cast.clause._compiler_dispatch(self, **kwargs),
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/annotation.py", line 79, in _compiler_dispatch
    return self.__element.__class__._compiler_dispatch(self, visitor, **kw)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/visitors.py", line 89, in _compiler_dispatch
    meth = getter(visitor)
RecursionError: maximum recursion depth exceeded while calling a Python object

FleurCreateMagneticWorkChain<55184> Excepted [1:if_(eos_needed)]

This and similar tracebacks occur at various points of the work chain.

chrisjsewell commented 3 years ago

Thanks, I think this error may be different from the one this issue was opened for. For this one, at least for initial debugging, I would consider adding to https://github.com/aiidateam/aiida-core/blob/4174e5de3adbeec785290a02a0fc78d4597e42e0/aiida/orm/nodes/node.py#L455-L456

something like:

            try:
                count = builder.count()
            except Exception as exc:
                raise ValueError(f'the link ({source} -> {self}) would result in an erroneous query') from exc
            if count > 0:
                raise ValueError(f'the link you are attempting to create ({source} -> {self}) would generate a cycle in the graph')

at least then we could see what nodes are trying to be linked
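
To see how such a wrapper behaves in isolation, here is a minimal, self-contained sketch. `check_no_cycle`, `count_fn` and `runaway` are hypothetical stand-ins (for the check in `validate_incoming`, for `builder.count()`, and for a query that overflows the stack); nothing here touches AiiDA. It relies on `RecursionError` being a subclass of `Exception` (via `RuntimeError`), so the `except Exception` clause does catch it and the original error is kept as `__cause__`.

```python
def check_no_cycle(count_fn, source, target):
    """Hypothetical stand-in for the cycle check in ``validate_incoming``."""
    try:
        count = count_fn()  # stand-in for ``builder.count()``
    except Exception as exc:
        # RecursionError derives from RuntimeError, so it is caught here too.
        raise ValueError(f'the link ({source} -> {target}) would result in an erroneous query') from exc
    if count > 0:
        raise ValueError(
            f'the link you are attempting to create ({source} -> {target}) would generate a cycle in the graph'
        )


def runaway():
    """Recurse until the interpreter raises RecursionError."""
    return runaway()


try:
    check_no_cycle(runaway, 'source', 'target')
except ValueError as err:
    # The chained cause tells us which low-level error the query hit.
    print(type(err.__cause__).__name__, '->', err)
```

Because of `from exc`, the full traceback of the underlying error is still printed, followed by "The above exception was the direct cause of the following exception".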

unkcpz commented 2 years ago

I got a chance to reproduce this issue and added the exception trace as @chrisjsewell suggested. Here are the traces I got. It is basically lots of kpoints KeyErrors followed by the RecursionError described in this issue:

Traceback (most recent call last):                                                                                                                                                                                                         
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/plumpy/utils.py", line 128, in __getattr__                                                                                                                    
    return self[attr]                                                                                                                                                                                                                      
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/plumpy/utils.py", line 85, in __getitem__                                                                                                                     
    return self._dict[key]                                                                                                                                                                                                                 
KeyError: 'kpoints'                                                                                                                                                                                                                        

During handling of the above exception, another exception occurred:                                                                                                                                                                        

Traceback (most recent call last):                                                                                                                                                                                                         
  File "/home/jyu/Projects/WP-SSSP/aiida-quantumespresso/aiida_quantumespresso/workflows/pw/base.py", line 273, in validate_kpoints                                                                                                        
    kpoints = self.inputs.kpoints                                                                                                                                                                                                          
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/plumpy/utils.py", line 131, in __getattr__                                                                                                                    
    raise AttributeError(errmsg)                                                                                                                                                                                                           
AttributeError: 'AttributesFrozendict' object has no attribute 'kpoints'   

During handling of the above exception, another exception occurred: 

Traceback (most recent call last):                                                                                                                                                                                                         
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/aiida/orm/nodes/node.py", line 835, in validate_incoming                                                                                                      
    count = builder.count()                                                                                                                                                                                                                
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/aiida/orm/querybuilder.py", line 2193, in count                                                                                                               
    return self._impl.count(query)                                                                                                                                                                                                         
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/aiida/orm/implementation/querybuilder.py", line 290, in count                                                                                                 
    return query.count()                                                                                                                                                                                                                   
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3803, in count                
...

  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 96, in _compiler_dispatch                                                                                                   
    return meth(self, **kw)                                                                                                                                                                                                                
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/sqlalchemy/sql/compiler.py", line 1075, in visit_cast                                                                                                         
    cast.clause._compiler_dispatch(self, **kwargs),                                                                                                                                                                                        
RecursionError: maximum recursion depth exceeded in comparison                                                                                                                                                                             

The above exception was the direct cause of the following exception:                                                                                                                                                                       

Traceback (most recent call last):                                                                                                                                                                                                         
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/plumpy/process_states.py", line 231, in execute                                                                                                               
    result = self.run_fn(*self.args, **self.kwargs)                                                                                                                                                                                        
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/aiida/engine/processes/workchains/workchain.py", line 214, in _do_step                                                                                        
    finished, stepper_result = self._stepper.step()                                                                                                                                                                                        
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/plumpy/workchains.py", line 299, in step                                                                                                                      
    finished, result = self._child_stepper.step()                                                                                                                                                                                          
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/plumpy/workchains.py", line 532, in step                                                                                                                      
    finished, result = self._child_stepper.step()                                                                                                                                                                                          
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/plumpy/workchains.py", line 299, in step                                                                                                                      
    finished, result = self._child_stepper.step()                                                                                                                                                                                          
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/plumpy/workchains.py", line 250, in step                                                                                                                      
    return True, self._fn(self._workchain)                                                              

...

  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/aiida/engine/processes/process.py", line 709, in _setup_inputs
    self.node.add_incoming(node, LinkType.INPUT_CALC, name)
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/aiida/orm/nodes/node.py", line 802, in add_incoming
    self.validate_incoming(source, link_type, link_label)
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/aiida/orm/nodes/process/process.py", line 472, in validate_incoming
    super().validate_incoming(source, link_type, link_label)
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/aiida/orm/utils/mixins.py", line 139, in validate_incoming
    super().validate_incoming(source, link_type=link_type, link_label=link_label)
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/aiida/orm/nodes/node.py", line 837, in validate_incoming
    raise ValueError(f'the link ({source} -> {self}) would result in an erroneous query') from exc
ValueError: the link (Remote code 'pw-6.8' on eiger-hq, pk: 1, uuid: dc38cb79-defb-4fc9-a859-58e44aabefe7 -> uuid: 799e0295-a8b7-4c81-98b1-5511c87d0c21 (unstored) (aiida.calculations:quantumespresso.pw)) would result in an erroneous query  

unkcpz commented 2 years ago

Please let me know if you have any idea on how to further debug this @chrisjsewell

sphuber commented 2 years ago

Can the people that reported a case of this please indicate whether they were using the Django or SqlAlchemy backend? I wonder if this has something to do with the SqlAlchemy engine being created for the QueryBuilder in the Django backend and not being reset correctly. So I wonder if this mostly occurs for Django backends, and not the SqlAlchemy ones. That would be a great hint.

unkcpz commented 2 years ago

I was using the Django backend in all cases where this issue happened.

giovannipizzi commented 2 years ago

Also Django for me

mbercx commented 2 years ago

Same: Django backend

sphuber commented 2 years ago

Got another hit of this, but on develop (commit 09765ecbd8280b629da5a804bdaa6353fc0a3179) with an SqlAlchemy backend (the only one remaining). Just running roughly 300 processes in parallel. Will try to see if it happens again when lowering the number of concurrent processes.

Traceback (most recent call last):
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/kiwipy/rmq/tasks.py", line 166, in _on_task
    result = await result
  File "/usr/lib/python3.9/asyncio/futures.py", line 284, in __await__
    yield self  # This tells Task to wait for completion.
  File "/usr/lib/python3.9/asyncio/tasks.py", line 328, in __wakeup
    future.result()
  File "/usr/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/kiwipy/rmq/threadcomms.py", line 284, in done
    result = kiwi_future.result()
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/kiwipy/futures.py", line 54, in capture_exceptions
    yield
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/communications.py", line 48, in on_done
    result = plum_future.result()
  File "/usr/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/kiwipy/futures.py", line 54, in capture_exceptions
    yield
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/futures.py", line 73, in run_task
    res = await coro()
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/process_comms.py", line 539, in __call__
    return await self._continue(communicator, **task.get(TASK_ARGS, {}))
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/manage/external/rmq.py", line 208, in _continue
    result = await super()._continue(communicator, pid, nowait, tag)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/process_comms.py", line 615, in _continue
    return proc.future().result()
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/process_states.py", line 231, in execute
    result = self.run_fn(*self.args, **self.kwargs)
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/engine/processes/workchains/workchain.py", line 252, in _do_step
    finished, stepper_result = self._stepper.step()
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/workchains.py", line 299, in step
    finished, result = self._child_stepper.step()
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/workchains.py", line 250, in step
    return True, self._fn(self._workchain)
  File "/home/sph/code/aiida/env/dev/aiida-codtools/aiida_codtools/workflows/cif_clean.py", line 103, in run_select_calculation
    calculation = self.submit(CifSelectCalculation, **inputs)
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/engine/processes/process.py", line 509, in submit
    return self.runner.submit(process, *args, **kwargs)
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/engine/runners.py", line 183, in submit
    process_inited = self.instantiate_process(process, *args, **inputs)
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/engine/runners.py", line 169, in instantiate_process
    return instantiate_process(self, process, *args, **inputs)
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/engine/utils.py", line 65, in instantiate_process
    process = process_class(runner=runner, inputs=inputs)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 193, in __call__
    inst.transition_to(inst.create_initial_state())
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 335, in transition_to
    self.transition_failed(initial_state_label, label, *sys.exc_info()[1:])
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 351, in transition_failed
    raise exception.with_traceback(trace)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 320, in transition_to
    self._enter_next_state(new_state)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 382, in _enter_next_state
    self._fire_state_event(StateEventHook.ENTERING_STATE, next_state)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 299, in _fire_state_event
    callback(self, hook, state)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/processes.py", line 324, in <lambda>
    lambda _s, _h, state: self.on_entering(cast(process_states.State, state)),
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/engine/processes/process.py", line 391, in on_entering
    super().on_entering(state)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/processes.py", line 669, in on_entering
    call_with_super_check(self.on_create)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/plumpy/base/utils.py", line 29, in call_with_super_check
    wrapped(*args, **kwargs)
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/engine/processes/process.py", line 387, in on_create
    self._pid = self._create_and_setup_db_record()  # pylint: disable=attribute-defined-outside-init
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/engine/processes/process.py", line 577, in _create_and_setup_db_record
    self.node.store_all()
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/orm/nodes/node.py", line 678, in store_all
    return self.store(with_transaction)
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/orm/nodes/node.py", line 715, in store
    self._store(with_transaction=with_transaction, clean=True)
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/orm/nodes/node.py", line 746, in _store
    self._backend_entity.set_extra(self._HASH_EXTRA_KEY, self.get_hash())
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/orm/nodes/node.py", line 811, in get_hash
    return self._get_hash(ignore_errors=ignore_errors, **kwargs)
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/orm/nodes/node.py", line 822, in _get_hash
    return make_hash(self._get_objects_to_hash(), **kwargs)
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/orm/nodes/process/calculation/calcjob.py", line 134, in _get_objects_to_hash
    for entry in self.get_incoming(link_type=(LinkType.INPUT_CALC, LinkType.INPUT_WORK))
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/orm/nodes/node.py", line 604, in get_incoming
    link_triples = self.get_stored_link_triples(
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/orm/nodes/node.py", line 579, in get_stored_link_triples
    return [LinkTriple(entry[0], LinkType(entry[1]), entry[2]) for entry in builder.all()]
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/orm/querybuilder.py", line 1071, in all
    matches = list(self.iterall(batch_size=batch_size))
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/orm/querybuilder.py", line 1033, in iterall
    for item in self._impl.iterall(self.as_dict(), batch_size):
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/storage/psql_dos/orm/querybuilder/main.py", line 172, in iterall
    with self.use_query(data) as query:
  File "/usr/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/storage/psql_dos/orm/querybuilder/main.py", line 204, in use_query
    query = self._update_query(data)
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/storage/psql_dos/orm/querybuilder/main.py", line 226, in _update_query
    self._build()
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/storage/psql_dos/orm/querybuilder/main.py", line 292, in _build
    result = join_func(
  File "/home/sph/code/aiida/env/dev/aiida-core/aiida/storage/psql_dos/orm/querybuilder/joiner.py", line 349, in _join_node_inputs
    ).join(entity_to_join, aliased_edge.input_id == entity_to_join.id, isouter=isouterjoin)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/operators.py", line 360, in __eq__
    return self.operate(eq, other)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/orm/attributes.py", line 317, in operate
    return op(self.comparator, *other, **kwargs)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/operators.py", line 360, in __eq__
    return self.operate(eq, other)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/orm/properties.py", line 431, in operate
    return op(self.__clause_element__(), *other, **kwargs)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/annotation.py", line 221, in __eq__
    return self.__element.__class__.__eq__(self, other)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/operators.py", line 360, in __eq__
    return self.operate(eq, other)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 861, in operate
    return op(self.comparator, *other, **kwargs)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/operators.py", line 360, in __eq__
    return self.operate(eq, other)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/type_api.py", line 76, in operate
    return o[0](self.expr, op, *(other + o[1:]), **kwargs)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/default_comparator.py", line 101, in _boolean_compare
    obj = coercions.expect(
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/coercions.py", line 172, in expect
    element = element.__clause_element__()
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/orm/attributes.py", line 259, in __clause_element__
    return self.expression
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 1113, in __get__
    obj.__dict__[self.__name__] = result = self.fget(obj)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/orm/attributes.py", line 235, in expression
    ce = self.comparator.__clause_element__()
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 1227, in oneshot
    result = fn(*args, **kw)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/orm/properties.py", line 393, in _memoized_method___clause_element__
    return self.adapter(self.prop.columns[0], self.prop.key)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/orm/util.py", line 813, in _adapt_element
    self._adapter.traverse(elem)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/util.py", line 998, in traverse
    return self.columns[obj]
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/util/_collections.py", line 762, in __missing__
    self[key] = val = self.creator(self.weakself(), key)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/util.py", line 1024, in _locate_col
    c = vis.replace(col, _include_singleton_constants=True)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/util.py", line 908, in replace
    return self._corresponding_column(col, True)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/util.py", line 833, in _corresponding_column
    newcol = self.selectable.corresponding_column(
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/selectable.py", line 226, in corresponding_column
    return self.exported_columns.corresponding_column(
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/selectable.py", line 718, in exported_columns
    return self.columns
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 1113, in __get__
    obj.__dict__[self.__name__] = result = self.fget(obj)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/selectable.py", line 737, in columns
    self._populate_column_collection()
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/selectable.py", line 1643, in _populate_column_collection
    self.element._generate_fromclause_column_proxies(self)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/selectable.py", line 694, in _generate_fromclause_column_proxies
    fromclause._columns._populate_separate_keys(
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/base.py", line 1293, in _populate_separate_keys
    cols = list(iter_)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/selectable.py", line 695, in <genexpr>
    col._make_proxy(fromclause) for col in self.c
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 1969, in _make_proxy
    c = self._constructor(
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 1670, in __init__
    self._init_items(*args)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 135, in _init_items
    spwd(self, **kw)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/base.py", line 1046, in _set_parent_with_dispatch
    self._set_parent(parent, **kw)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 2445, in _set_parent
    self.parent._on_table_attach(self._set_table)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 1879, in _on_table_attach
    event.listen(self, "after_parent_attach", fn)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/event/api.py", line 115, in listen
    _event_key(target, identifier, fn).listen(*args, **kw)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/event/api.py", line 25, in _event_key
    tgt = evt_cls._accept_with(target)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/event/base.py", line 245, in _accept_with
    if hasattr(target, "dispatch"):
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/event/base.py", line 321, in __get__
    disp = self.dispatch._for_instance(obj)
  File "/home/sph/.virtualenvs/aiida_dev/lib/python3.9/site-packages/sqlalchemy/event/base.py", line 127, in _for_instance
    return self._for_class(instance_cls)
RecursionError: maximum recursion depth exceeded

sphuber commented 2 years ago

Note that when running with 30 processes in parallel (single daemon worker) I no longer experienced any problems. Even after having run roughly 45k processes. So I have the feeling the problem appears only when under heavy load. Not sure if it is necessary to have multiple daemon workers, or simply have many processes active in one worker. Will try to run many processes in parallel with just a single worker.
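
Until the cause is found, one possible workaround that follows from this observation is to cap how many processes are active at once. The sketch below is not an AiiDA API: `submit` and `count_active` are hypothetical callables you would have to supply yourself (for example, wrapping `aiida.engine.submit` and a query for non-terminated processes), and the default of 30 is simply the number that ran cleanly above.

```python
import time


def submit_throttled(items, submit, count_active, max_active=30, poll_interval=10.0, sleep=time.sleep):
    """Submit ``items`` one by one, waiting while too many processes are active.

    ``submit`` and ``count_active`` are hypothetical, user-supplied callables.
    ``sleep`` is injectable so the function can be exercised without waiting.
    """
    results = []
    for item in items:
        # Block until there is room for one more active process.
        while count_active() >= max_active:
            sleep(poll_interval)
        results.append(submit(item))
    return results


# Demo with fake callables: every "sleep" lets one running process finish.
active = []
peak = 0


def fake_submit(item):
    global peak
    active.append(item)
    peak = max(peak, len(active))
    return item


def fake_sleep(_seconds):
    if active:
        active.pop(0)


submitted = submit_throttled(range(100), fake_submit, lambda: len(active), max_active=30, sleep=fake_sleep)
print(len(submitted), 'submitted, peak active:', peak)
```

Whether throttling actually avoids the error in every setup is an open question; it only reflects the single data point reported here.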

sphuber commented 2 years ago

Can confirm that I reproduced the problem with just a single daemon worker under heavy load. So it doesn't seem to require multiple active workers.
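
One way to test the load hypothesis would be to log how deep the call stack already is when a process step begins, and compare that with the interpreter limit. A minimal sketch using only the standard library follows; `stack_depth`, `headroom` and `nested` are hypothetical helper names, and where to call them inside the daemon is left open.

```python
import sys


def stack_depth() -> int:
    """Count the frames on the current call stack, including this one."""
    depth = 0
    frame = sys._getframe()
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth


def headroom() -> int:
    """Approximate number of frames left before the recursion limit is hit."""
    return sys.getrecursionlimit() - stack_depth()


def nested(levels: int) -> int:
    """Call itself ``levels`` times, then report the stack depth at the bottom."""
    if levels == 0:
        return stack_depth()
    return nested(levels - 1)


print('recursion limit:', sys.getrecursionlimit())
print('headroom here:', headroom())
print('extra frames from 50 nested calls:', nested(50) - nested(0))
```

If the logged depth grows with the number of active processes, that would support the idea that the stack is already nearly exhausted before the SQLAlchemy call that finally raises.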

kjappelbaum commented 2 years ago

> Can confirm that I reproduced the problem with just a single daemon worker under heavy load. So it doesn't seem to require multiple active workers.

I also just ran into this (aiida-core==1.6.8, sqlalchemy==1.3.24, kiwipy==0.7.5, Python 3.9.12 on an Ubuntu 20 server) and, at least in my case, it seemed to be correlated with also running out of slots.

mbercx commented 1 year ago

Just ran into this again for Python 3.8.10, aiida-core==2.2.1, SQLAlchemy==1.4.35, kiwipy==0.7.7, plumpy==0.21.3. Same situation as described above: I submitted around 200 work chains in rapid succession with only a single active daemon worker.
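
Since the reports in this thread span several versions, it may help to collect them the same way each time. A small sketch using the standard library's `importlib.metadata` (available from Python 3.8); the package list and the `installed_versions` helper name are assumptions chosen to match the versions quoted above.

```python
import sys
from importlib import metadata

# Distribution names as quoted in this thread; adjust as needed.
PACKAGES = ('aiida-core', 'plumpy', 'kiwipy', 'SQLAlchemy')


def installed_versions(packages=PACKAGES):
    """Return a mapping of package name to installed version string."""
    versions = {'python': sys.version.split()[0]}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = 'not installed'
    return versions


for name, version in installed_versions().items():
    print(f'{name}=={version}')
```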

unkcpz commented 1 year ago

Ran into this again and the traceback path is slightly different from what was already reported.

Full traceback:

```shell
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/manage/external/rmq/launcher.py", line 90, in _continue
    result = await super()._continue(communicator, pid, nowait, tag)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/plumpy/process_comms.py", line 604, in _continue
    proc = cast('Process', saved_state.unbundle(self._load_context))
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/plumpy/persistence.py", line 58, in unbundle
    return Savable.load(self, load_context)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/plumpy/persistence.py", line 450, in load
    return load_cls.recreate_from(saved_state, load_context)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/plumpy/processes.py", line 243, in recreate_from
    process = cast(Process, super().recreate_from(saved_state, load_context))
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/plumpy/persistence.py", line 475, in recreate_from
    call_with_super_check(obj.load_instance_state, saved_state, load_context)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/plumpy/base/utils.py", line 29, in call_with_super_check
    wrapped(*args, **kwargs)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/engine/processes/workchains/workchain.py", line 166, in load_instance_state
    super().load_instance_state(saved_state, load_context)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/engine/processes/process.py", line 311, in load_instance_state
    super().load_instance_state(saved_state, load_context)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/plumpy/processes.py", line 634, in load_instance_state
    decoded = self.decode_input_args(saved_state[BundleKeys.INPUTS_RAW])
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/engine/processes/process.py", line 645, in decode_input_args
    return serialize.deserialize_unsafe(encoded)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/orm/utils/serialize.py", line 229, in deserialize_unsafe
    return yaml.load(serialized, Loader=AiiDALoader)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/yaml/__init__.py", line 81, in load
    return loader.get_single_data()
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/yaml/constructor.py", line 51, in get_single_data
    return self.construct_document(node)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/yaml/constructor.py", line 55, in construct_document
    data = self.construct_object(node)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/yaml/constructor.py", line 100, in construct_object
    data = constructor(self, node)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/orm/utils/serialize.py", line 130, in mapping_constructor
    yaml_node = loader.construct_mapping(mapping, deep=True)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/yaml/constructor.py", line 218, in construct_mapping
    return super().construct_mapping(node, deep=deep)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/yaml/constructor.py", line 143, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/yaml/constructor.py", line 107, in construct_object
    for dummy in generator:
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/yaml/constructor.py", line 413, in construct_yaml_map
    value = self.construct_mapping(node)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/yaml/constructor.py", line 218, in construct_mapping
    return super().construct_mapping(node, deep=deep)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/yaml/constructor.py", line 143, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/yaml/constructor.py", line 100, in construct_object
    data = constructor(self, node)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/orm/utils/serialize.py", line 86, in node_constructor
    return orm.load_node(uuid=yaml_node)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/orm/utils/loaders.py", line 206, in load_node
    return load_entity(
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/orm/utils/loaders.py", line 88, in load_entity
    return entity_loader.load_entity(
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/orm/utils/loaders.py", line 407, in load_entity
    entity = builder.one()[0]
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/orm/querybuilder.py", line 1107, in one
    res = self.all()
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/orm/querybuilder.py", line 1088, in all
    matches = list(self.iterall(batch_size=batch_size))
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/orm/querybuilder.py", line 1050, in iterall
    for item in self._impl.iterall(self.as_dict(), batch_size):
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/storage/psql_dos/orm/querybuilder/main.py", line 165, in iterall
    with self.query_session(data) as build:
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/storage/psql_dos/orm/querybuilder/main.py", line 233, in query_session
    query = self.get_query(data)
  File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/storage/psql_dos/orm/querybuilder/main.py", line 225, in get_query
    self._query_cache = self._build(data)
  File 
"/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/aiida/storage/psql_dos/orm/querybuilder/main.py", line 281, in _build query = query.add_entity(projection) File "", line 2, in add_entity File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/sql/base.py", line 110, in _generative x = fn(self, *args, **kw) File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 1124, in add_entity coercions.expect( File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/sql/coercions.py", line 188, in expect resolved = insp.__clause_element__() File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 1134, in oneshot result = fn(self, *args, **kw) File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/orm/util.py", line 758, in __clause_element__ return self.selectable._annotate( File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/sql/annotation.py", line 104, in _annotate return Annotated(self, values) File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/sql/selectable.py", line 6949, in __init__ element.c File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 1113, in __get__ obj.__dict__[self.__name__] = result = self.fget(obj) File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/sql/selectable.py", line 737, in columns self._populate_column_collection() File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/sql/selectable.py", line 1643, in _populate_column_collection self.element._generate_fromclause_column_proxies(self) File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/sql/selectable.py", line 694, in _generate_fromclause_column_proxies fromclause._columns._populate_separate_keys( File 
"/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/sql/base.py", line 1294, in _populate_separate_keys cols = list(iter_) File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/sql/selectable.py", line 695, in col._make_proxy(fromclause) for col in self.c File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 2086, in _make_proxy c = self._constructor( File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 1767, in __init__ self._init_items(*args) File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 144, in _init_items spwd(self, **kw) File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/sql/base.py", line 1047, in _set_parent_with_dispatch self._set_parent(parent, **kw) File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 2577, in _set_parent self.parent._on_table_attach(self._set_table) File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 1987, in _on_table_attach event.listen(self, "after_parent_attach", fn) File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/event/api.py", line 115, in listen _event_key(target, identifier, fn).listen(*args, **kw) File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/event/api.py", line 25, in _event_key tgt = evt_cls._accept_with(target) File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/event/base.py", line 245, in _accept_with if hasattr(target, "dispatch"): File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/event/base.py", line 321, in __get__ disp = self.dispatch._for_instance(obj) File "/home/jyu/micromamba/envs/aiida-opsp/lib/python3.9/site-packages/sqlalchemy/event/base.py", line 
127, in _for_instance return self._for_class(instance_cls) RecursionError: maximum recursion depth exceeded ```

However, all of the exceptions share a common pattern: they point to aiida/orm/querybuilder.py. I also found that the same issue can occur when the SQL query is not built properly, see here.

As suggested by @giovannipizzi, I updated https://github.com/aiidateam/aiida-integration-tests/tree/update-for-maxmium-recursion-issue and used the aiida-core branch https://github.com/aiidateam/aiida-core/compare/main...max-recusion-limit, which sets a very low recursion limit in the daemon (via sys.setrecursionlimit(100)). I played around with this value and found that the critical value for the workchain to run successfully is 122. I didn't test it, but I guess this value is related to how complex the workchain is, e.g. how many sub-workchains or nested workchains there are, or how many return values end up in the provenance graph. The "maximum recursion depth reached" error can be hit by running a test workchain with:

aiida-sleep workchain -nw 10 -nc 10 -t 10 -p 10000 -o 10000 -a 10000 --submit

But running a calcjob works fine:

aiida-sleep calc -n 1 -t 10 -p 10000 -o 10000 -a 10000 --submit

The code path in the traceback is different, but again it passes from aiida/orm/querybuilder.py into sqlalchemy/sql/, where the error is raised. I think the next action, before starting to look at the querybuilder, is to increase the recursion limit and let the issue manifest with a more complex nested workchain.
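Rather than probing for the critical value by lowering the recursion limit step by step, one could measure how deep a given call actually goes. The following is only a minimal sketch using the standard-library sys.setprofile hook; the helper name peak_stack_depth and the countdown example are made up for illustration and are not part of aiida-core or plumpy.

```python
import sys


def peak_stack_depth(func, *args, **kwargs):
    """Return the deepest number of Python frames stacked on top of the caller
    while ``func`` runs (hypothetical helper, for illustration only)."""
    depth = 0
    peak = 0

    def profiler(frame, event, arg):
        nonlocal depth, peak
        if event == 'call':  # a Python function is being entered
            depth += 1
            peak = max(peak, depth)
        elif event == 'return':  # also emitted when a frame exits via an exception
            depth -= 1

    previous = sys.getprofile()  # keep any profiler that is already installed
    sys.setprofile(profiler)
    try:
        func(*args, **kwargs)
    finally:
        sys.setprofile(previous)
    return peak


def countdown(n):
    """Toy workload: one extra frame per level."""
    if n > 0:
        countdown(n - 1)


print(peak_stack_depth(countdown, 10))
```

The reported number only counts Python-level frames added by the measured call, so it has to be added to the depth of the caller to compare it against the recursion limit.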

chrisjsewell commented 1 year ago

See #6044 for the proposed "fix"

Looking at all the various posted exceptions above, they all seem to be slightly different, and I don't see an obvious (fixable) issue, e.g. any instance of infinite recursion. Just (somewhat natural) complexity in the call stack, which would probably only really be fixable by doing fairly major redesigns to make the call stack "shallower" in various places, e.g. within

which set in daemon a very low recursion limit by sys.setrecursionlimit(100)

I don't feel that setting such a low recursion limit (compared to the default 1000) is replicating the production issues; it's just creating a lot more "artificial failure points". For example, if you set sys.setrecursionlimit(100) in aiida/__init__.py you'll even get recursion errors for simply trying to call verdi.

If you were to set it higher than default e.g. sys.setrecursionlimit(3000) (as is proposed for #6044) and you were still getting recursion exceptions, or even stack overflow exception, then that's when we would really need to think about structural changes to the code
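As a rough illustration of raising the limit in a controlled way, here is a minimal sketch of a context manager that only ever increases the limit and restores the old value afterwards. The name raised_recursion_limit is hypothetical, and this is not necessarily how #6044 implements it.

```python
import sys
from contextlib import contextmanager


@contextmanager
def raised_recursion_limit(minimum=3000):
    """Temporarily make sure the recursion limit is at least ``minimum``
    (hypothetical helper, for illustration only)."""
    previous = sys.getrecursionlimit()
    # Never lower the limit: keep whichever value is larger.
    sys.setrecursionlimit(max(previous, minimum))
    try:
        yield
    finally:
        # Restore the original limit once the guarded block is done.
        sys.setrecursionlimit(previous)


with raised_recursion_limit(3000):
    print(sys.getrecursionlimit())
```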

chrisjsewell commented 1 year ago

FYI, I was also trying to see if increasing the number of async tasks running would have an effect on the recursion limit for any one task. But I don't believe this to be the case, for example:

import asyncio
import sys

sys.setrecursionlimit(70)

number_of_tasks = 10
recursion_depth = 100

async def recursive_task(task_name, counter):
    if counter <= 0:
        return

    print(f"Task {task_name} - Counter: {counter}")

    # Simulating some asynchronous work
    await asyncio.sleep(.1)

    # Recursive call
    await recursive_task(task_name, counter - 1)

async def main():
    tasks = [
        asyncio.create_task(recursive_task(str(i), recursion_depth))
        for i in range(number_of_tasks)
    ]

    # Wait for all tasks to complete
    await asyncio.gather(*tasks)

asyncio.run(main())

will hit the RecursionError at the same recursion depth, irrespective of how many tasks are started.

So I'm really not sure why putting more load on a daemon worker (i.e. having more async tasks running) would make it more likely for the recursion limit to be exceeded by a single task
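The same point can be checked more directly by recording the stack depth inside each task instead of waiting for a RecursionError. This is only a sketch with made-up helper names (stack_depth, nested, measure); it assumes plain asyncio without any patching of the event loop.

```python
import asyncio
import sys


def stack_depth():
    """Count the Python frames currently on the call stack."""
    depth, frame = 0, sys._getframe()
    while frame is not None:
        depth, frame = depth + 1, frame.f_back
    return depth


async def nested(levels, depths):
    """Await ``levels`` nested coroutines, then record the depth at the bottom."""
    if levels > 0:
        await nested(levels - 1, depths)
        return
    await asyncio.sleep(0)  # yield to the loop so the other tasks get a turn
    depths.append(stack_depth())


async def measure(number_of_tasks, levels=20):
    """Run the tasks concurrently and return the deepest stack any of them saw."""
    depths = []
    await asyncio.gather(*(nested(levels, depths) for _ in range(number_of_tasks)))
    return max(depths)


# With an unpatched loop both numbers should be the same.
print(asyncio.run(measure(1)), asyncio.run(measure(50)))
```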

unkcpz commented 1 year ago

@chrisjsewell thanks a lot for checking this!

If you were to set it higher than default e.g. sys.setrecursionlimit(3000) (as is proposed for https://github.com/aiidateam/aiida-core/pull/6044) and you were still getting recursion exceptions, or even stack overflow exception, then that's when we would really need to think about structural changes to the code

I did change it to 1500 and still hit the issue. I proposed setting this for the daemon long ago, and I remember @sphuber was not very satisfied with the solution.

Besides, for most of the production issues reported in this thread, you can find a common pattern: this weird kpoints-related exception

Traceback (most recent call last):
  File "/home/jyu/Projects/WP-SSSP/aiida-quantumespresso/aiida_quantumespresso/workflows/pw/base.py", line 273, in validate_kpoints
    kpoints = self.inputs.kpoints
  File "/home/jyu/miniconda3/envs/aiida-sssp-dev/lib/python3.9/site-packages/plumpy/utils.py", line 131, in __getattr__
    raise AttributeError(errmsg)
AttributeError: 'AttributesFrozendict' object has no attribute 'kpoints'

I guess this is the KpointsData from the inputs, which can be a large array and may cause the problem when it is being stored in the DB.

chrisjsewell commented 1 year ago

I did it changed to 1500 and still hit the issue

But what issue? What was the traceback? What I don't understand at present is that none of the tracebacks shown seem even close to 1000 stack frames long and, as mentioned above, I have yet to find a correlation between the number of tasks running and the number of stack frames allowed per function call.

Making big changes, even if possible, just to reduce a few stack frames, wouldn't appear to solve the problem. That would only be warranted where there is a truly infinite recursion point, or at least a recursion that scales exponentially.

proposed to set this for daemon long ago and I remembered @sphuber is not very satisfied with the solution.

IPython terminals already set it to 3000, therefore there should be no issue with setting it to this

sphuber commented 1 year ago

I have gone over the various reported stack traces once again, and I have a hypothesis that maybe it is not actually related to sqlalchemy, but may have to do with plumpy. Most of the stack traces contain process_class(inputs=inputs, runner=runner). This calls inst.transition_to(inst.create_initial_state()) of the plumpy.base.StateMachine to create the initial state of the state machine (which underlies the Process instance). This immediately goes to self.transition_failed(initial_state_label, label, *sys.exc_info()[1:]), meaning that an exception occurred when going to the initial state. The idea is that the process will now transition to the excepted state, but doing so involves another state transition, which hits the same signal catchers. In this case, it will again try to run on_create, and I have an inkling that here it may hit the same exception once again so that it starts all over, eventually hitting the recursion error.

If this is true, a potential solution is to prevent an exception in a state transition from triggering this infinite loop. I remember addressing a similar problem in plumpy some time ago, but in that case it was when an exception was hit in the state transition to the excepted state, instead of the created state. There the solution was also to have an escape hatch. Maybe here the same problem is at hand and we need to cut the cycle in state transitions.
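To make the "escape hatch" idea concrete, here is a minimal sketch of a toy state machine. It is not plumpy's StateMachine: the class names, the _handling_failure flag and the on_enter hook are all hypothetical and only illustrate how a guard can cut the cycle when the transition to the excepted state fails as well.

```python
class ToyStateMachine:
    """Toy state machine, loosely modelled on the description above (not plumpy code)."""

    def __init__(self):
        self.state = None
        self.entered = []  # every state we attempted to enter, for inspection
        self._handling_failure = False

    def on_enter(self, state):
        """Hook that runs when entering a state; subclasses may raise here."""
        self.entered.append(state)

    def transition_to(self, state):
        try:
            self.on_enter(state)
            self.state = state
        except Exception as exception:
            self.transition_failed(state, exception)

    def transition_failed(self, state, exception):
        if self._handling_failure:
            # Escape hatch: we are already handling a failed transition, so do not
            # start yet another transition; force the final state instead.
            self.state = 'excepted'
            return
        self._handling_failure = True
        try:
            self.transition_to('excepted')
        finally:
            self._handling_failure = False


class AlwaysFails(ToyStateMachine):
    """Worst case: entering any state raises, including the excepted state."""

    def on_enter(self, state):
        super().on_enter(state)
        raise RuntimeError(f'cannot enter {state}')


machine = AlwaysFails()
machine.transition_to('created')
print(machine.state, machine.entered)
```

Without the guard, the failed transition to the excepted state would call transition_failed again and again until the recursion limit is hit.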

This is just a hunch at the moment, but I will try to find some time to do some testing and check the hypothesis. I note that the first 2 stack traces reported in this thread do not have the same trace, and so there might still be another problem, but they were also older and might have had older versions of plumpy. Difficult to say.

sphuber commented 1 year ago

IPython terminals already set it to 3000, therefore there should be no issue with setting it to this

The problem is not that an increase in the limit itself could cause issues, but rather that it is not really addressing the problem that is causing this exception, but merely hiding it. Of course, if we can safely run with the increased number and that makes these occurrences less frequent, it might be a viable temporary patch while we find a real solution for the problem.

chrisjsewell commented 1 year ago

it will again try to run on_create and I have an inkling that here it may hit the same exception once again and it starts all over

But the question is then, why does on_create not show up multiple times in the stack trace? Or is there another way to identify if a recursion "loop" has been entered, as opposed to just a particularly deep function call chain?
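One way to tell a genuine recursion loop from a merely deep call chain is to count how often each function appears in the traceback. The sketch below uses only the standard library; the helper name most_repeated_frames and the runaway example are made up for illustration.

```python
import traceback
from collections import Counter


def most_repeated_frames(exception, top=3):
    """Return the ``top`` most frequent (filename, function name) pairs in the
    traceback of ``exception`` together with their counts."""
    frames = traceback.extract_tb(exception.__traceback__)
    counts = Counter((frame.filename, frame.name) for frame in frames)
    return counts.most_common(top)


def runaway(n=0):
    """Toy example of a true recursion loop."""
    return runaway(n + 1)


try:
    runaway()
except RecursionError as exc:
    # A real loop shows one function dominating the traceback, whereas a deep
    # but finite chain has every function appearing only a handful of times.
    for (filename, name), count in most_repeated_frames(exc):
        print(f'{name}: {count} frames')
```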

dev-zero commented 1 year ago

@sphuber this could explain why I saw a reduction in the above cases as soon as I improved SSH connection stability by transitioning from SSH proxy command to the jump host variant.

sphuber commented 1 year ago

@sphuber this could explain why I saw a reduction of above cases as soon as I improved SSH connection stability by transitioning from SSH proxy command to the jump host variant.

Why? I don't think the SSH connection is used when initializing a new process, is it? It is only used for CalcJobs and there it is only used in the upload, submit etc tasks, which are only called after the process has been fully initialized. Or do you mean something else?

dev-zero commented 1 year ago

It has been a long time since I looked at this, but my guess at the time was that it was due to the higher chance of hitting a transition to the exception state, which again fails for some reason, triggering another transition.

sphuber commented 1 year ago

But the question is then, why does on_create not show up multiple times in the stack trace? Or is there another way to identify if a recursion "loop" has been entered, as opposed to just a particularly deep function call chain?

Actually, I now think the problem isn't actually with this part of the code. We see the exceptions being thrown from all sorts of different parts of the code. We have at least 3 or 4 different variants, so I don't think the problem is specific to those parts of the code, but there is a generic problem.

As for why we don't see a recursion loop in the stacktrace: you don't need an actual loop calling the same functions for a RecursionError to be thrown. You simply need to exceed the stack size for it to be thrown even if each frame on the stack comes from a different function. Take the following example:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import click

def f1():
    print('f1')
    f2()

def f2():
    print('f2')
    f3()

def f3():
    print('f3')
    f4()

def f4():
    print('f4 end of the line')

@click.command()
@click.argument('recursion_limit', default=12)
def main(recursion_limit):
    import sys
    sys.setrecursionlimit(recursion_limit)
    print(sys._current_frames())
    f1()

if __name__ == '__main__':
    main()

If you invoke it with the default recursion limit of 12, it will throw a RecursionError. Note that we don't have any loops though but different functions calling each other. Note that I need a limit of at least 12, because the invocation of the click commands that are done before I get to sys.setrecursionlimit already add roughly 10 frames to the stack.

So, conclusion: you don't need a "typical" recursion loop to hit a RecursionError, you simply need a stack trace that is deep enough, i.e., more frames on the call stack than is allowed by the setrecursionlimit.

Now comes the question of why we hit this when launching many processes. My gut feeling now is that it has something to do with the event loop. The daemon worker is designed to be able to run multiple processes asynchronously, by having each process run as an asynchronous task in the event loop. Individually, the call stacks for each Process are not deep enough to cause problems, but probably, "somewhere", the stack frames of a process do not get fully cleared before the interpreter switches to another event on the event loop, which starts adding frames to the stack. It seems logical, given that the daemon worker is just a single Python interpreter, that it has a single stack, which is being "shared" by all processes that it manages.

I will start looking into asyncio and see if people have run into this problem in other contexts and whether we are using it incorrectly. It seems that this should be a possible use case. Not sure if we just have exceptionally deep stacks per process and many events on the event loop at the same time, or whether we have a bug in the code that causes certain frames not to be popped off the stack in time.

chrisjsewell commented 1 year ago

As for why we don't see a recursion loop in the stacktrace: you don't need an actual loop calling the same functions for a RecursionError to be thrown. You simply need to exceed the stack size for it to be thrown even if each frame on the stack comes from a different function.

yes of course I know this 😜 I think you missed my point here. I'm saying that if "it will again try to run on_create" was the cause, then surely you would see this being called multiple times in the stack trace.

Infinite recursion loops are obviously bugs that definitely need fixing. But for functions with large, but finite, recursion depth, it is open to debate how critical they are to "fix", or indeed whether they need fixing at all.

It seems logical given that the daemon worker is just a single Python interpreter, that it has a single stack, which is being "shared" by all processes that it manages.

Are you not repeating what I already said above here? https://github.com/aiidateam/aiida-core/issues/4876#issuecomment-1572993014

i.e. as far as I can see - from googling, and asking ChatGPT and running example code - the recursion limit is not cumulative, it's independent for each (asyncio) task

sphuber commented 1 year ago

Infinite recursion loops are obviously bugs that definitely need fixing.

Sure, but that was my point, in that here I don't think we are dealing with an actual recursion problem, but just deep call stacks.

i.e. as far as I can see - from googling, and asking ChatGPT and running example code - the recursion limit is not cumulative, it's independent for each (asyncio) task

But then how does that explain that we only see the problem when we run a certain number of processes? It seems pretty certain that below a certain number of concurrent processes, the exception never shows up. This suggests that at least some of the frames are shared or not being cleared properly and accumulate.

Are you not repeating what I already said above here? https://github.com/aiidateam/aiida-core/issues/4876#issuecomment-1572993014

In a way yes, but I am wondering if there is a way for frames to not properly get cleared from the stack, so that they accumulate. So even though your test is independent of the number of tasks being executed, that may be because the actual code that they execute doesn't contain any of the "problematic code" that we have in AiiDA, and so it is not exactly simulating our problem case. What this "problematic code" is though, I have no idea. But maybe it has to do with exception handling, where we don't "fully" handle it and so those exception handling frames keep lingering on the stack and start to accumulate.

chrisjsewell commented 1 year ago

But then how does that explain that we only see the problem when we run a certain number of processes.

Oh exactly, that is what I can't understand 😅 It certainly feels like there is some correlation, but currently I can't reproduce such "shared frames" in minimal examples

chrisjsewell commented 1 year ago

[image]

sphuber commented 1 year ago

Sure, but what the hell does ChatGPT know? It will tell all sorts of nonsense:

[image: Screenshot 2023-06-02 203036]

sphuber commented 1 year ago

I think I may have a lead. Thinking about why the asyncio tasks seem to be sharing stacks even though they shouldn't, I remembered that plumpy actually hacks asyncio. By default, event loops are not reentrant. This posed a problem for process functions when @muhrin developed them. To get around this limitation, plumpy uses nest_asyncio to patch the event loop and make it reentrant. Without this patch process functions cannot run and will hit the exception:

RuntimeError: This event loop is already running
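For reference, the non-reentrant behaviour of an unpatched loop can be reproduced with the standard library alone. This is just a minimal sketch; inner and outer are made-up names and nothing here is taken from plumpy.

```python
import asyncio


async def inner():
    return 42


async def outer():
    loop = asyncio.get_running_loop()
    coroutine = inner()
    try:
        # Try to block on a coroutine from code that the loop is already running.
        return loop.run_until_complete(coroutine)
    except RuntimeError as exception:
        coroutine.close()  # avoid a "coroutine was never awaited" warning
        return str(exception)


message = asyncio.run(outer())
print(message)
```

A process function that has to return its result synchronously while the daemon's loop is running is in the position of outer here, which is why the loop needs to be made reentrant for it to work.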

In the examples of myself, @mbercx and @unkcpz, which are all running PwBaseWorkChains, the create_kpoints_from_density calcfunction is called. If I pass explicit kpoints in the input instead, this function is not called, and then I no longer see any problems. I think this very strongly hints at the nest_asyncio needed for process functions being the culprit. It seems feasible that the stack is abused by making the loop reentrant and that frames are not properly popped off the stack after a process function finishes.

I think this is a promising avenue to look into further, although I am not quite sure exactly how to debug the frames that the process function execution adds. Ideally, if we find out this is the problem, we should really find a way to refactor process functions such that they no longer need nest_asyncio. It is a nasty hack and I think it could cause other problems, if not now, maybe in the future.
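The suspected mechanism can be illustrated with a toy model. To be clear, the following is not nest_asyncio's or plumpy's actual code: ToyReentrantLoop, stack_depth and nested_depth_growth are hypothetical names, and the model only assumes that a synchronous wait inside a callback keeps running whatever else is queued on the same loop.

```python
import sys
from collections import deque


def stack_depth():
    """Count the Python frames currently on the call stack."""
    depth, frame = 0, sys._getframe()
    while frame is not None:
        depth, frame = depth + 1, frame.f_back
    return depth


class ToyReentrantLoop:
    """Toy callback loop whose run_until() may be called from inside a callback."""

    def __init__(self):
        self.ready = deque()

    def call_soon(self, callback):
        self.ready.append(callback)

    def run_until(self, is_done):
        # Runs whatever is queued, including other processes, on top of the
        # frames of whoever called run_until().
        while self.ready and not is_done():
            self.ready.popleft()()


def nested_depth_growth(number_of_processes):
    """Return how many extra frames the deepest process saw compared to the first."""
    loop, depths = ToyReentrantLoop(), []

    def process():
        result = []
        # Stand-in for a process function whose result arrives via the loop.
        loop.call_soon(lambda: result.append('done'))
        depths.append(stack_depth())
        # Synchronous wait that re-enters the loop from inside a callback.
        loop.run_until(lambda: bool(result))

    for _ in range(number_of_processes):
        loop.call_soon(process)
    loop.run_until(lambda: False)  # drain the queue
    return max(depths) - min(depths)


for count in (1, 10, 100):
    print(count, nested_depth_growth(count))
```

In this model each waiting process stays on the stack while the next one starts on top of it, so the depth grows with the number of queued processes. That would match the observation that the error only shows up above a certain load, but whether the real reentrant loop behaves this way still needs to be verified.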

sphuber commented 1 year ago

I tried running the PwBaseWorkChain with the calcfunction enabled, but with the nest_asyncio disabled. I had to make a quick hack for it to run without the nested loop, and now I no longer see the RecursionError, independent of how many workchains I launch. I think it is almost certain that the reentrant loop is the cause of the problem.