ederenn closed 2 years ago
The issue is being closed, to be re-opened when necessary.
I have not encountered the issue since the devnet providers were restarted.
Before then, the investigation showed that the providers were operating under degraded performance conditions. The prevailing issue was long database access times, which delayed each action taken by the provider agent (e.g. activity creation). The provider restart hampered further investigation, which led to examining the following potential problems and symptoms:
exhausted hardware resources
There are multiple daemon and agent binaries running on each devnet machine. Resource utilization graphs in munin showed that the usage during the testing session wasn't out of the ordinary
dominant CPU usage by the hybrid net providers
Hybrid devnet providers were suspected of consuming the most CPU power for challenge validation in P2P communication; requesting tasks simultaneously on the beta and hybrid devnets proved this suspicion false
hitting the open file descriptor limit
This investigation led to a fix in the erc20 driver, where the HTTP client connection pool is now re-used by all web3 calls: https://github.com/golemfactory/yagna/pull/1892 . The fix mainly targets requestor nodes and will not impact provider nodes as much
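For anyone who wants to rule out the open file descriptor limit on their own node, here is a minimal sketch of how the check could look. It is not taken from yagna or yapapi; the function names `fd_usage` and `near_fd_limit` and the 0.8 threshold are hypothetical, and it only inspects the process it runs in (Linux via `/proc/self/fd`, macOS via `/dev/fd`).

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit) for the current process.

    The count includes the descriptor that os.listdir() opens temporarily
    to read the directory, so it may be one higher than the steady state.
    """
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Linux exposes per-process descriptors under /proc/self/fd;
    # macOS exposes them under /dev/fd.
    fd_dir = "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "/dev/fd"
    open_fds = len(os.listdir(fd_dir))
    return open_fds, soft_limit


def near_fd_limit(threshold=0.8):
    """Return True if open descriptors reach `threshold` of the soft limit.

    `threshold` is an arbitrary, hypothetical cut-off chosen for illustration.
    """
    open_fds, soft_limit = fd_usage()
    if soft_limit == resource.RLIM_INFINITY:
        return False  # no limit configured, so it cannot be hit
    return open_fds >= threshold * soft_limit


if __name__ == "__main__":
    used, limit = fd_usage()
    print(f"open file descriptors: {used}, soft limit: {limit}")
    print(f"close to the limit: {near_fd_limit()}")
```

A client that opens a new HTTP connection for every call would show this count growing steadily under load, while a shared connection pool should keep it roughly flat.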
Currently, provider nodes are behaving correctly.
Name: blue
yagna version: yagna 0.10.0-rc15 (160bc5a1 2022-03-14 build #206)
OS+lang+version (if applicable): mac, Python 3.9.7, yapapi 0.9.0-alpha.1
yagna_rCURRENT (5).log ssh-yapapi-2022-03-15_13.06.28.log