golemfactory / yagna-triage

Repository for issues which we don't yet know about enough to assign to proper repo

BatchTimeoutError while running the SSH example #194

Closed ederenn closed 2 years ago

ederenn commented 2 years ago

Name: blue
yagna version: yagna 0.10.0-rc15 (160bc5a1 2022-03-14 build #206)
OS+lang+version (if applicable): mac, python 3.9, yapapi 0.9.0-alpha.1

[<SshService starting on jiuzhang.t [ 0x02a71376b982cb3752e99252625aa4cc33b7ed3d ] @ 192.168.0.8>, <SshService terminated on 2rec-ubuntu.t [ 0x1b6505b1edba89ead6767305fff17a6eff09416c ] @ 192.168.0.2>]
[2022-03-15T13:26:29.372+0100 WARNING yapapi.services.service_runner] Unhandled exception in service
Traceback (most recent call last):
  File "/Users/blue/yapapi/examples/utils/__init__.py", line 85, in run_golem_example
    loop.run_until_complete(task)
  File "/usr/local/Cellar/python@3.9/3.9.7/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 629, in run_until_complete
    self.run_forever()
  File "/usr/local/Cellar/python@3.9/3.9.7/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 596, in run_forever
    self._run_once()
  File "/usr/local/Cellar/python@3.9/3.9.7/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 1854, in _run_once
    event_list = self._selector.select(timeout)
  File "/usr/local/Cellar/python@3.9/3.9.7/Frameworks/Python.framework/Versions/3.9/lib/python3.9/selectors.py", line 562, in select
    kev_list = self._selector.control(None, max_ev, timeout)
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/blue/.envs/yagna-python-tutorial/lib/python3.9/site-packages/yapapi/services/service_runner.py", line 256, in _run_instance
    batch = batch_task.result()
  File "/Users/blue/yapapi/examples/ssh/ssh.py", line 60, in start
    yield script
  File "/Users/blue/.envs/yagna-python-tutorial/lib/python3.9/site-packages/yapapi/services/service_runner.py", line 272, in _run_instance
    fut_result = yield batch
  File "/Users/blue/.envs/yagna-python-tutorial/lib/python3.9/site-packages/yapapi/engine.py", line 721, in process_batches
    results = await get_batch_results()
  File "/Users/blue/.envs/yagna-python-tutorial/lib/python3.9/site-packages/yapapi/engine.py", line 701, in get_batch_results
    async for event_class, event_kwargs in remote:
  File "/Users/blue/.envs/yagna-python-tutorial/lib/python3.9/site-packages/yapapi/rest/activity.py", line 263, in __aiter__
    raise BatchTimeoutError()
yapapi.rest.activity.BatchTimeoutError

ssh-yapapi-2022-03-15_13.25.04.log yagna_rCURRENT (4).log

mfranciszkiewicz commented 2 years ago

The issue is being closed, to be re-opened when necessary.

I have not encountered the issue again since the devnet providers were restarted.

Before then, the investigation showed that the providers were operating under degraded performance conditions. The prevailing issue was long database access times, which delayed each action taken by the provider agent (e.g. activity creation). Since the provider restart, further investigation has been hampered, and it led to examination of the following potential problems and symptoms:

Currently, provider nodes are behaving correctly.
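
For anyone hitting this later: the mechanism described above (every provider-side action slowed by database access until the whole batch exceeds its timeout) can be illustrated with a minimal sketch. This is not yapapi's actual implementation. `provider_action`, `run_batch`, `batch_times_out` and the local `BatchTimeoutError` class are hypothetical stand-ins, and the delays and timeouts are made-up numbers chosen only to show the effect.

```python
import asyncio


class BatchTimeoutError(Exception):
    """Local stand-in for yapapi.rest.activity.BatchTimeoutError (assumption)."""


async def provider_action(db_delay: float) -> str:
    # Hypothetical provider-side step (e.g. activity creation);
    # db_delay models the time spent waiting on the database.
    await asyncio.sleep(db_delay)
    return "ok"


async def run_batch(n_actions: int, db_delay: float, timeout: float) -> list:
    # Run the actions sequentially and give up once the batch timeout elapses.
    async def batch() -> list:
        return [await provider_action(db_delay) for _ in range(n_actions)]

    try:
        return await asyncio.wait_for(batch(), timeout=timeout)
    except asyncio.TimeoutError:
        raise BatchTimeoutError() from None


def batch_times_out(n_actions: int, db_delay: float, timeout: float) -> bool:
    # Convenience wrapper: True if the batch hit the timeout, False otherwise.
    try:
        asyncio.run(run_batch(n_actions, db_delay, timeout))
        return False
    except BatchTimeoutError:
        return True
```

With fast database access the batch finishes well inside the timeout; once each action is delayed enough, the same batch raises the timeout error even though nothing is wrong on the requestor side.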