Closed wwestgarth closed 2 years ago
Playing around with some quicker/multi-threaded sorting packages, it turns out that the sorting is only part of the slowness: the calls to `proto.Marshal()` and then to `vgcrypto.Hash()` are also taking a fair chunk of time.
For `banking.seen` we have a speed-up in sorting times:

| `sort.Slice()` | `sorty.SortSlice()` |
| --- | --- |
| 0.576144895 | 0.120757528 |

but the marshalling of the sorted array, and the hashing of that byte-string to be returned by `GetHash()`, both take ~0.1s, so even with the speed-up roughly 2/3 of the time is spent outside of the sorting.
For `evtforward.all`:

| `sort.Slice()` | `sorty.SortSlice()` |
| --- | --- |
| 0.149291711 | 0.04777019 |

but again we have similar time spent of ~0.1s on both the marshalling and the hashing. For both banking and evtforward there are around 350,000 "things" being sorted and serialised.
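To see how the time splits across the three steps above, each one can be timed separately. This is only a rough sketch: `timeStages` and `stageTimings` are hypothetical names, and `encoding/binary` and SHA-256 are stand-ins for `proto.Marshal()` and `vgcrypto.Hash()`, so the absolute numbers will not match the real engines.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"sort"
	"time"
)

// stageTimings records how long each step of producing a snapshot hash took.
type stageTimings struct {
	Sort, Serialise, Hash time.Duration
}

// timeStages sorts keys in place, serialises them, and hashes the bytes,
// timing each step on its own. The serialisation and hash used here are
// stand-ins (assumptions), not the real proto.Marshal()/vgcrypto.Hash().
func timeStages(keys []uint64) ([]byte, stageTimings) {
	var t stageTimings

	start := time.Now()
	sort.Slice(keys, func(i, j int) bool { return keys[i] < keys[j] })
	t.Sort = time.Since(start)

	start = time.Now()
	buf := make([]byte, 8*len(keys))
	for i, k := range keys {
		binary.BigEndian.PutUint64(buf[8*i:], k)
	}
	t.Serialise = time.Since(start)

	start = time.Now()
	sum := sha256.Sum256(buf)
	t.Hash = time.Since(start)

	return sum[:], t
}

func main() {
	// Roughly the number of items mentioned above, in reverse order so the
	// sort has real work to do.
	keys := make([]uint64, 350000)
	for i := range keys {
		keys[i] = uint64(len(keys) - i)
	}
	_, t := timeStages(keys)
	fmt.Printf("sort=%v serialise=%v hash=%v\n", t.Sort, t.Serialise, t.Hash)
}
```

Swapping the `sort.Slice` call for another sorting package only changes the first timing, which is the point being made: the other two stay the same.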
So switching to a different sorting algorithm does help, but not as much as we thought. To shave off any more time we'll have to go down the route of spawning a goroutine for each snapshot provider at the start of taking a snapshot, and reading the results when they're done, so that all the marshalling/hashing is done concurrently.
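A minimal sketch of that idea, under assumptions: the `provider` type, its `hash` method and `hashAll` are hypothetical and not the real snapshot engine API, and SHA-256 over a joined string stands in for `proto.Marshal()` plus `vgcrypto.Hash()`.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
	"strings"
	"sync"
)

// provider is a hypothetical stand-in for a snapshot provider.
type provider struct {
	namespace string
	items     []string
}

// hash sorts a copy of the items, serialises them, and hashes the result.
// The serialisation and hash function are placeholders (assumptions).
func (p provider) hash() []byte {
	sorted := append([]string(nil), p.items...)
	sort.Strings(sorted)
	sum := sha256.Sum256([]byte(strings.Join(sorted, "\n")))
	return sum[:]
}

// hashAll starts one goroutine per provider and waits for all of them.
// Each goroutine writes only to its own slot, so no mutex is needed and
// the output order matches the input order regardless of which finishes first.
func hashAll(providers []provider) [][]byte {
	results := make([][]byte, len(providers))
	var wg sync.WaitGroup
	for i, p := range providers {
		wg.Add(1)
		go func(i int, p provider) {
			defer wg.Done()
			results[i] = p.hash()
		}(i, p)
	}
	wg.Wait()
	return results
}

func main() {
	providers := []provider{
		{namespace: "banking", items: []string{"c", "a", "b"}},
		{namespace: "evtforward", items: []string{"z", "y"}},
	}
	for i, h := range hashAll(providers) {
		fmt.Printf("%s %x\n", providers[i].namespace, h)
	}
}
```

Collecting results by index rather than over a channel keeps the combined output deterministic, which presumably matters if the per-provider hashes are later combined into a single snapshot hash.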
Alright, yeah, concurrency seems like the only way to go. We didn't do it earlier just to keep things simple, but we should go for it now.
Closed by #5334 #5335 #5344 #5347 #5350 #5357
Spike Overview
In order to understand the issues with the performance of snapshots
We will investigate the issues seen in known environments
So that we have a plan for how to improve the performance, and possibly implement any quick wins
Acceptance Criteria
How do we know when this spike is ready to either drop or move into technical tasks:
Additional Details (optional)
Environments where performance issues have been seen:
Known potential slow areas:
Possible methods to investigate:
Some data from a recent investigation:
related: https://github.com/vegaprotocol/vega/issues/4243
It may be worth considering alternative sorting algorithms for the engines that are known to have large data-sets to sort. https://github.com/twotwotwo/sorts and https://github.com/jfcg/sorty are examples and have been briefly investigated:
![image](https://user-images.githubusercontent.com/1857660/166420868-aa9cfd1b-1f1f-484e-a294-d1fdaed3105b.png)