cosmos / cosmos-sdk

:chains: A Framework for Building High Value Public Blockchains :sparkles:
https://cosmos.network/
Apache License 2.0

Change Tally query to be state machine based #10353

Open mattverse opened 2 years ago

mattverse commented 2 years ago

Summary

The current Tally query iterates over every vote on a governance proposal, which causes latency problems for nodes and services that expose public API endpoints. On the Cosmos Hub, a single Tally query currently has an average response time of ~35.14 seconds, which causes additional problems whenever such a query is made. This could possibly be solved by using the state machine to record the state of delegated voting power (possibly with the additional use of staking hooks and other hooks) instead of iterating over all votes each time a Tally query is made.
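To make the idea concrete, here is a minimal sketch of a tally that is kept up to date in state instead of being recomputed per query. Everything in it is an assumption for illustration: `RunningTally`, `Vote`, and `OnPowerChanged` are hypothetical names rather than SDK APIs, voting power is a plain `int64`, and validator-inherited votes are ignored.

```go
package main

// Option is a simplified vote option (hypothetical, not the SDK's type).
type Option int

const (
	Yes Option = iota
	No
	Abstain
	numOptions
)

// RunningTally keeps per-option totals current so a query never iterates votes.
type RunningTally struct {
	totals [numOptions]int64
	votes  map[string]Option // voter -> chosen option
	power  map[string]int64  // voter -> current voting power
}

func NewRunningTally() *RunningTally {
	return &RunningTally{votes: map[string]Option{}, power: map[string]int64{}}
}

// Vote adds the voter's current power to the chosen option.
// Re-votes are ignored to keep this sketch short.
func (t *RunningTally) Vote(voter string, opt Option) {
	if _, voted := t.votes[voter]; voted {
		return
	}
	t.votes[voter] = opt
	t.totals[opt] += t.power[voter]
}

// OnPowerChanged is what a staking hook could call (assumed wiring) when a
// voter's power changes; it shifts the running total by the difference.
func (t *RunningTally) OnPowerChanged(voter string, newPower int64) {
	old := t.power[voter]
	t.power[voter] = newPower
	if opt, voted := t.votes[voter]; voted {
		t.totals[opt] += newPower - old
	}
}

// Total is the constant-time read a Tally query would perform.
func (t *RunningTally) Total(opt Option) int64 {
	return t.totals[opt]
}
```

With this shape, the cost moves from the query path to the vote and delegation paths, which each do a constant amount of work per event.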

Problem Definition

Latency for services running LCD nodes can be improved.

Proposal


ValarDragon commented 2 years ago

Fully agreed this should be done, with planning for how it will extend to more liquid democracy setups.

We should also benchmark what's going on in the tally function; I suspect that will be a more near-term route to a 10x improvement. My strong suspicion is that most of the overhead can be removed by fixing some code design patterns that become evident when profiling the relevant code. I suspect that the main culprits are:

Ranked by my understanding of overhead.

alexanderbez commented 5 months ago

Thanks for creating this issue @mattverse! We discussed this at decent length today. My summary, based on our convo and this issue, is that there are a few main ways we can tackle this, not necessarily mutually exclusive:

  1. Charge an explicit additional gas fee. This ties into the general theme of "multi-dimensional" fees, which we should explore regardless. Note, however, this does NOT address the iteration bottleneck. In fact, you'll probably see less vote participation, which can be sort of a con.
  2. Introduce additional constraints to voting, e.g. https://github.com/cosmos/cosmos-sdk/pull/18186. This has the same pros/cons as (1).
  3. Refactor tallying completely. I'll describe my proposal for this below at a high level:

Refactor Tallying Proposal

My proposal is to refactor tallying completely, to get rid of tally iteration altogether or, at the very least, limit it drastically. One way to achieve this would be to have "lazy" or cumulative tallying, i.e. tally as votes come in. Now, there are two things we want to consider when doing this. First, it doesn't seem we can avoid relying on total voting power (VP) when accumulating votes; does this introduce any game theory attacks? Second, we need to ensure that we can handle changed votes and custom tally functions.

You could achieve this using one or possibly more collections, s.t. you tally based on (proposalID, vote/weight, voter, VP).
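A rough sketch of what tallying on (proposalID, vote/weight, voter, VP) could look like, with plain Go maps standing in for collections. All names here (`Tallies`, `CastVote`, `voteKey`) are hypothetical, weights are assumed to be expressed in basis points, and the VP is assumed to be supplied by the caller at vote time. The stored VP is what lets a changed vote subtract exactly what the earlier vote added.

```go
package main

import "errors"

const bpsDenominator = 10000 // weights are basis points (assumption)

// voteKey mirrors a (proposalID, voter) composite key.
type voteKey struct {
	ProposalID uint64
	Voter      string
}

// storedVote remembers the weights and the VP that were applied to the totals.
type storedVote struct {
	Weights map[string]int64 // option -> weight in basis points
	Power   int64            // voting power used when the vote was applied
}

type Tallies struct {
	votes  map[voteKey]storedVote
	totals map[uint64]map[string]int64 // proposalID -> option -> VP * bps
}

func NewTallies() *Tallies {
	return &Tallies{
		votes:  map[voteKey]storedVote{},
		totals: map[uint64]map[string]int64{},
	}
}

// CastVote accumulates a (possibly weighted) vote. If the voter already voted
// on this proposal, the previous contribution is removed first.
func (t *Tallies) CastVote(proposalID uint64, voter string, weights map[string]int64, power int64) error {
	var sum int64
	for _, w := range weights {
		if w < 0 {
			return errors.New("weights must be non-negative")
		}
		sum += w
	}
	if sum != bpsDenominator {
		return errors.New("weights must sum to 10000 basis points")
	}

	totals := t.totals[proposalID]
	if totals == nil {
		totals = map[string]int64{}
		t.totals[proposalID] = totals
	}

	key := voteKey{ProposalID: proposalID, Voter: voter}
	if prev, ok := t.votes[key]; ok {
		for opt, w := range prev.Weights {
			totals[opt] -= w * prev.Power
		}
	}

	stored := storedVote{Weights: map[string]int64{}, Power: power}
	for opt, w := range weights {
		stored.Weights[opt] = w
		totals[opt] += w * power
	}
	t.votes[key] = stored
	return nil
}

// Total returns the accumulated VP for an option without iterating votes.
func (t *Tallies) Total(proposalID uint64, option string) int64 {
	return t.totals[proposalID][option] / bpsDenominator
}
```

This sketch does not answer the open questions above: it snapshots VP at vote time rather than tracking later changes, and a custom tally function would still need a way to consume or replace the accumulated totals.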

Note, you could even introduce (1) and (3) together.