Open cryptobench opened 11 months ago
Yagna doesn't have a mechanism to limit the number of connections. Connections are closed after some period of time when unused, but here you are trying to establish connections with the whole network in a short period of time.
The first thing I would try is to close the connections made during pinging after each chunk is processed:
yagna net disconnect {node_id}
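A rough sketch of what that could look like, assuming the ping script is written in Python and already has a list of node IDs. Only the yagna net disconnect command comes from this thread; the names probe_in_chunks, chunked, disconnect_node and the probe callable are hypothetical, and probe stands in for whatever ping or find call the script already makes.

```python
import subprocess
from typing import Callable, Dict, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def disconnect_node(node_id: str) -> None:
    """Close the connection to one node with the command quoted above."""
    # check=False: a failed disconnect should not abort the whole run.
    subprocess.run(["yagna", "net", "disconnect", node_id], check=False)


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def probe_in_chunks(
    node_ids: Iterable[str],
    chunk_size: int,
    probe: Callable[[str], T],  # hypothetical: wraps the script's own ping/find call
    disconnect: Callable[[str], None] = disconnect_node,
) -> Dict[str, T]:
    """Probe nodes chunk by chunk, disconnecting from each chunk when it is done."""
    results: Dict[str, T] = {}
    for chunk in chunked(list(node_ids), chunk_size):
        try:
            for node_id in chunk:
                results[node_id] = probe(node_id)
        finally:
            # Runs even if a probe raised, so connections do not pile up.
            for node_id in chunk:
                disconnect(node_id)
    return results
```

With this shape, at most chunk_size connections opened by the script should be live at once, instead of one per node in the network. Whether that actually stops the memory growth is something to verify with htop after the change.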
Overview
During the setup of initial data collection for the reputation system, we noticed that on the first day our EC2 instances, which were collecting the data, completely froze up.
Investigation
I started investigating the issue, which I initially assumed might be due to the while loops that kept our collection scripts running continuously (which was the goal). So I modified the script to run only once and had systemd run it every 30 seconds, using a timer that waits for the previous run to finish before starting another one.
However, this did not solve the issue. Upon further investigation using htop, I can see that the memory used by yagna is increasing over time. Upon launch of Yagna, the EC2 instance was using a total of 300 MB of memory. Now, after almost 2 hours, we are close to using 2 GB of memory. It's slowly increasing over time.
My suspicion is that the yagna net ping or yagna net find command is causing this issue.
Scripts in Use
Uptime checker
https://github.com/golemfactory/reputation-auditor/tree/main/uptime
Here we collect offers from the network to check if a node is offline/online. If we previously received an offer from a node and it didn't send one in the past 30 seconds, then we use yagna net find to confirm if the node is offline or online.
Ping checker
https://github.com/golemfactory/reputation-auditor/tree/main/ping-checker
Here we simply acquire a list of online nodes from the stats page and use yagna net ping to check the latency between the nodes and us.
Setting Up the Data Collecting
Each script includes a README, and at the bottom of it is the systemd config that's used. It assumes that you're running the script on an Ubuntu EC2 instance.
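For readers who don't want to open the READMEs, here is a minimal sketch of the "run once, wait for the previous run to finish, then start again 30 seconds later" behaviour described under Investigation. It is not copied from the repository: the unit names, description and ExecStart path are placeholders, and the real config may differ.

```ini
# /etc/systemd/system/ping-checker.service  (placeholder name)
[Unit]
Description=Run the ping checker once

[Service]
# oneshot: the unit stays "activating" until the script exits,
# so the timer below cannot start a second overlapping run.
Type=oneshot
# Placeholder path; point this at the actual script.
ExecStart=/usr/bin/python3 /path/to/ping-checker/script.py

# /etc/systemd/system/ping-checker.timer  (placeholder name)
[Unit]
Description=Re-run the ping checker 30 seconds after the previous run ends

[Timer]
# First run shortly after the timer itself is started.
OnActiveSec=10s
# Subsequent runs: 30 seconds after the service last became inactive.
OnUnitInactiveSec=30s
# The default accuracy window is 1 minute, which would blur a 30 s interval.
AccuracySec=1s

[Install]
WantedBy=timers.target
```

OnUnitInactiveSec is measured from the moment the service finishes, which is what gives the "wait for the previous run" behaviour. A calendar-based schedule would not provide the same 30-second gap after each run.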