flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0
5.47k stars 584 forks

[BUG] Flytekit: 'FlyteBranchNode' object has no attribute 'interface' #3628

Open tomasz-sodzawiczny opened 1 year ago

tomasz-sodzawiczny commented 1 year ago

Describe the bug

FlyteRemote.sync_execution() (and other methods that rely on it, e.g. FlyteRemote.wait() and FlyteRemote.execute() with wait=True) occasionally fails with:

AttributeError: 'FlyteBranchNode' object has no attribute 'interface'

The offending line is remote.py#L1644 (I pasted the full stack trace under "Screenshots" below).

The error only happens while the task is running (and is in some specific state). It seems to happen for us mostly when the tasks inside the conditional take some time to schedule (e.g. tasks with resource requirements that trigger cluster scale-up).

This pretty much renders wait() unusable for us; we had to write our own wait wrapper that has a try..except around sync_execution() to handle this case.
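A minimal sketch of what such a wrapper could look like, not the exact code used here. The function name `wait_ignoring_branch_error` and its parameters are hypothetical. It assumes `remote` exposes `sync_execution(execution, sync_nodes=...)` as in the stack trace below, and that the returned execution object has an `is_done` attribute; check both against your flytekit version.

```python
import time


def wait_ignoring_branch_error(remote, execution, poll_interval=5.0,
                               timeout=3600.0, sleep=time.sleep):
    """Poll an execution until it is done, tolerating the branch-node error.

    Hypothetical helper: `remote` is expected to behave like FlyteRemote,
    and the synced execution is assumed to expose `is_done`.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            execution = remote.sync_execution(execution, sync_nodes=True)
            if execution.is_done:
                return execution
        except AttributeError as exc:
            # Swallow only the specific error from this report;
            # any other AttributeError is re-raised unchanged.
            if "FlyteBranchNode" not in str(exc):
                raise
        if time.monotonic() >= deadline:
            raise TimeoutError("execution did not finish before the timeout")
        sleep(poll_interval)
```

The wrapper matches on the error message rather than catching every AttributeError, so unrelated bugs still surface. The trade-off is that it would stop matching if the message wording changed.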

Expected behavior

It should not raise? ;)

Additional context to reproduce

No response

Screenshots

Stack trace:

AttributeError                            Traceback (most recent call last)
Cell In[9], line 3
----> 3 flyte.sync_execution(ex, sync_nodes=True)

File ~/Library/Caches/pypoetry/virtualenvs/platform-QGRyQAM2-py3.10/lib/python3.10/site-packages/flytekit/remote/remote.py:1503, in FlyteRemote.sync_execution(self, execution, entity_definition, sync_nodes)
   1501     node_execs = {}
   1502     for n in underlying_node_executions:
-> 1503         node_execs[n.id.node_id] = self.sync_node_execution(n, node_mapping)  # noqa
   1504     execution._node_executions = node_execs
   1505 return self._assign_inputs_and_outputs(execution, execution_data, node_interface)

File ~/Library/Caches/pypoetry/virtualenvs/platform-QGRyQAM2-py3.10/lib/python3.10/site-packages/flytekit/remote/remote.py:1645, in FlyteRemote.sync_node_execution(self, execution, node_mapping)
   1639 # This is the plain ol' task execution case
   1640 else:
   1641     execution._task_executions = [
   1642         self.sync_task_execution(FlyteTaskExecution.promote_from_model(t))
   1643         for t in iterate_task_executions(self.client, execution.id)
   1644     ]
-> 1645     execution._interface = execution._node.flyte_entity.interface
   1647 self._assign_inputs_and_outputs(
   1648     execution,
   1649     node_execution_get_data_response,
   1650     execution.interface,
   1651 )
   1653 return execution

AttributeError: 'FlyteBranchNode' object has no attribute 'interface'


github-actions[bot] commented 8 months ago

Hello 👋, this issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will engage on it to decide if it is still applicable. Thank you for your contribution and understanding! 🙏