Most of the hurt seems to actually be in the `<dictcomp>` in `network.py`. We could optimize this.
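For context, the pattern the profiler is pointing at looks roughly like this (hypothetical data, not the actual IPv8 code): the comprehension re-filters every known address on each walk step, so every call costs O(number of known peers):

```python
# Toy address book: candidate addresses and the ones already verified.
all_addresses = {("1.2.3.4", 8000): "intro-a", ("5.6.7.8", 8000): "intro-b"}
verified = {("1.2.3.4", 8000)}

def get_walkable_addresses():
    # The <dictcomp> frame in the profile corresponds to a comprehension
    # of this shape, rebuilt from scratch on every walk step.
    walkable = {address: introducer
                for address, introducer in all_addresses.items()
                if address not in verified}
    return list(walkable)
```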
I managed to reproduce this one in Gumby, using an overly aggressive walker (a walk interval of 0.05 sec, i.e. 20 walks/sec). In this experiment with 500 nodes, 71.22% of the time is spent in the `get_walkable_addresses` method and another 23.78% in `b64encode`.
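If that `b64encode` time comes from re-encoding the same raw public keys into dictionary keys over and over (an assumption on my part; the names below are hypothetical), memoizing the encoding should remove most of it:

```python
from base64 import b64encode

_b64_cache = {}

def cached_key(public_key_bytes):
    # Encode each raw public key once; subsequent lookups hit the cache.
    key = _b64_cache.get(public_key_bytes)
    if key is None:
        key = _b64_cache[public_key_bytes] = b64encode(public_key_bytes)
    return key
```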
According to the performance graphs, the cost of peer discovery clearly increases over time, which probably correlates with the number of entries in the `services_per_peer` dictionary.
This should give us a baseline for optimising the `network.py` file.
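If the growing `services_per_peer` dictionary is indeed the driver, pruning it when peers disappear would bound the cost. A minimal sketch (both names here are hypothetical, not existing IPv8 API):

```python
def prune_services(services_per_peer, live_peer_keys):
    # Drop entries for peers we no longer track, so the dictionary stays
    # proportional to the live peer count instead of growing monotonically.
    for key in list(services_per_peer):
        if key not in live_peer_keys:
            del services_per_peer[key]
```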
I think we can just strip NetworkX out of `network.py`; nobody is using the actual graph anyway. Also, that saves a dependency.
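If the graph is only used to remember who introduced whom (my reading, not verified against the code), a plain dict of sets would cover it without the dependency. A sketch:

```python
from collections import defaultdict

# introducer address -> set of addresses it introduced
introductions = defaultdict(set)

def add_introduction(introducer, introduced):
    introductions[introducer].add(introduced)

def forget_address(address):
    # Remove the address both as an introducer and as an introduced peer.
    introductions.pop(address, None)
    for introduced in introductions.values():
        introduced.discard(address)
```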
To address issue #78, I enabled Yappi for the TrustChain crawler and monitored CPU usage over a period of a few hours.
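For reproducibility, this is roughly how such a measurement is wired up (a reconstruction; `crawl` below is a stand-in for the actual crawler run):

```python
import yappi

def crawl():
    # Stand-in for letting the TrustChain crawler run for a few hours.
    sum(i * i for i in range(10 ** 6))

yappi.set_clock_type("cpu")  # profile CPU time rather than wall-clock time
yappi.start()
crawl()
yappi.stop()
yappi.get_func_stats().sort("tsub").print_all()
```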
The `get_walkable_addresses` method takes a significant amount of processing power (38.5%!). A breakdown shows that almost all of this time is spent in a `<dictcomp>` frame. This makes walking through the network expensive in terms of CPU usage. @qstokkink, is there any easy way to make this more efficient?
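One possible direction (a sketch, not IPv8's actual implementation; it assumes an address stays walkable until its peer is verified): maintain the walkable set incrementally on peer discovery/verification instead of re-deriving it with a comprehension on every call, so `get_walkable_addresses` becomes a cheap copy:

```python
class AddressBook:
    """Keeps the walkable set up to date as peers come and go."""

    def __init__(self):
        self.walkable = set()

    def address_discovered(self, address):
        self.walkable.add(address)

    def address_verified(self, address):
        # Verified peers no longer need to be walked to.
        self.walkable.discard(address)

    def get_walkable_addresses(self):
        return list(self.walkable)
```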