decred / politeia

ISC License

legacypoliteia: Add import command. #1632

Closed lukebp closed 1 year ago

lukebp commented 2 years ago

This adds an import command to the legacypoliteia tool. The import command takes the JSON data that the convert command saved to disk and imports it into the tstore backend.

Using these two commands (convert and import), the legacy git backend proposals can now be parsed from the politeiad git repo, converted to types supported by the tstore backend, and imported into the tstore backend.

Execution of the import command takes ~7.5 hrs to complete when importing the 115 mainnet Politeia proposals from the git repo.

The performance bottleneck for this command is the trillian log server (tlog server). ~50 leaves/sec can be appended onto a tlog tree without causing any issues. This means that importing 10,000 proposal votes will take ~200 seconds (3 minutes, 20 seconds). The vast majority of the execution time of this command is spent importing proposal votes.
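The time estimate above is plain division. A minimal sketch of the arithmetic, where `estimate_seconds` is a hypothetical helper name and 50 leaves/sec is the rate quoted above:

```shell
# Hypothetical helper: estimate how long it takes to append a number of
# leaves at a given sustained append rate (integer arithmetic).
estimate_seconds() {
    leaves=$1
    rate=$2  # leaves per second
    echo $(( leaves / rate ))
}

secs=$(estimate_seconds 10000 50)
echo "${secs}s = $(( secs / 60 ))m $(( secs % 60 ))s"
# prints "200s = 3m 20s"
```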

The command is relatively lightweight. Its memory footprint should stay under 100 MiB and CPU usage should be minimal.

Data Changes

One of the main issues with importing legacy proposal data into the tstore backend is that the client and server signatures of the data are broken. For this reason, all signatures have been removed from the imported data.

The overwriteProposalFields function in legacypoliteia/proposal.go documents all changes that are being made to the legacy data and the reason for each change.

Some legacy data that resulted from bugs in the legacy politeia backend is omitted from the import into the tstore backend. This includes duplicate comments and duplicate proposal votes.
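As an illustration only, duplicate votes could be filtered along these lines. This is not the tool's actual logic: the one-JSON-object-per-line layout, the use of the "ticket" field as the duplicate key, and the `dedupe_votes` name are all assumptions made for the sketch.

```shell
# Assumption: each input line is one JSON object with a "ticket" field,
# and a vote counts as a duplicate when its ticket was already seen.
# Lines without a "ticket" field are dropped by this sketch.
dedupe_votes() {
    awk '
        match($0, /"ticket":"[^"]*"/) {
            key = substr($0, RSTART, RLENGTH)
            if (!(key in seen)) { seen[key] = 1; print }
        }
    '
}

# Made-up sample input: the third line repeats the first ticket.
printf '%s\n' \
    '{"ticket":"aa","votebit":"2"}' \
    '{"ticket":"bb","votebit":"1"}' \
    '{"ticket":"aa","votebit":"2"}' | dedupe_votes
```

Only the first occurrence of each ticket is kept, so the sample prints the first two lines.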

The original data, signatures, and timestamps for the legacy proposals can be found in the git repo.

Trillian Log Server Issue

One issue that was repeatedly encountered was a bug with the tlog server when appending cast vote leaves onto a tlog tree. A small number of requests will intermittently hang. tlog will not throw any errors or give any indication that anything is wrong. tlog will sometimes require a hard restart. Upon restart of tlog, the leaves that were being appended during the bad request are usually present on the tlog tree.

This command was written to mitigate this issue as much as possible, but tlog may require a hard restart during import. The command will get stuck in a retry loop until the tlog server is reset.
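The retry behavior described above can be sketched generically as follows. The `retry` and `flaky_append` names are hypothetical and this is not the import command's implementation; in particular, a request that hangs rather than fails would also need a deadline, which is left out here.

```shell
# Hypothetical helper: run a command repeatedly until it succeeds,
# sleeping between attempts. With max_attempts=0 it retries forever,
# which mirrors being stuck until the tlog server is reset.
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    until "$@"; do
        if [ "$max_attempts" -gt 0 ] && [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$(( attempt + 1 ))
        sleep "$delay"
    done
}

# Demo: a stand-in for a flaky append that succeeds on the third call.
counter=$(mktemp)
echo 0 > "$counter"
flaky_append() {
    n=$(( $(cat "$counter") + 1 ))
    echo "$n" > "$counter"
    [ "$n" -ge 3 ]
}
retry 5 0 flaky_append && echo "appended after $(cat "$counter") attempts"
rm -f "$counter"
```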

Running Locally

See the example below for testing this tool locally. politeiad and politeiawww do not need to be running during the import. The trillian log server, trillian log signer, and MySQL must be running in order to import the data.

Once the data has been imported, politeiad must be reset using the --fsck flag so that the caches are rebuilt to include the newly imported data.

Note, this is mainnet data and will be imported into the mainnet databases. Make sure you are running politeiad and politeiawww against mainnet, not testnet, when you turn them on.

# Clone the git repo for the mainnet politeia data

$ git clone https://github.com/decred-proposals/mainnet.git 

# Convert the legacy git backend data to tstore backend types.
# The following directory will be created during the execution
# of this command and the output will be saved to it as JSON
# formatted data.
#
# Output directory: ./legacy-politeia-data

$ legacypoliteia convert ./mainnet

# Import the converted data into the tstore backend.
#
# The --stubusers flag is used when testing locally. This
# will create stubs in the user database for all of the user
# IDs that are found in the legacy proposal data. This will
# allow you to run politeiawww without any 'user not found'
# errors when retrieving legacy data.

$ legacypoliteia import ./legacy-politeia-data --stubusers

lukebp commented 1 year ago

Two unexpected issues came up that resulted in this PR taking more time than originally anticipated.

  1. The tlog server issue that the commit message talks about.
  2. The bug that is documented in #1664.

Investigating and mitigating the tlog server issue added a substantial amount of time to the PR.

lukebp commented 1 year ago

I noticed that the proposal vote numbers being reported by the imported legacy proposals differ from the proposal vote numbers reported on the proposals-archive site. The difference is usually +/- 0.1% and it appears to impact just about every proposal.

The proposal vote numbers reported by the imported legacy proposals are correct. They are parsed directly from the git repos and can be verified by running a couple of CLI commands against the legacy git repo.

Let's use this proposal as an example: https://proposals-archive.decred.org/proposals/76eba5a

The proposal votes for this proposal can be found in the legacy git repo in the ballot.journal file:

mainnet/76eba5ac3ffbedc0d5d5f679a5f47693782bebaf30f66e741a70a37d4fdcef15/1/plugins/decred/ballot.journal

Manually counting the total number of votes.

ballot.journal  comments.journal
$ grep -o "votebit" ballot.journal | wc -l
11361

Manually counting the number of yes votes.

$ grep -o "votebit\":\"2" ballot.journal | wc -l
10994

Manually counting the number of no votes.

$ grep -o "votebit\":\"1" ballot.journal | wc -l
367

These are the same numbers that are reported by the imported legacy proposals, which are correct, but not what is reflected on the proposals-archive site. I don't know why the numbers on the proposals-archive site are off.
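The three grep counts above can be gathered in one step, with a check that the yes and no counts add up to the total (10994 + 367 = 11361 for this proposal). A minimal sketch; `tally_votes` is a hypothetical helper and the sample file contents are made up for the demo:

```shell
# Tally votes in a ballot.journal-style file using the same grep
# patterns as the commands above.
tally_votes() {
    file=$1
    total=$(grep -o 'votebit' "$file" | wc -l)
    yes=$(grep -o 'votebit":"2' "$file" | wc -l)
    no=$(grep -o 'votebit":"1' "$file" | wc -l)
    # Warn if the yes and no counts do not account for every vote.
    if [ $(( yes + no )) -ne $(( total )) ]; then
        echo "warning: yes + no does not equal total" >&2
    fi
    echo "total=$(( total )) yes=$(( yes )) no=$(( no ))"
}

# Demo on a made-up three-vote sample file.
sample=$(mktemp)
printf '%s\n' \
    '{"ticket":"aa","votebit":"2"}' \
    '{"ticket":"bb","votebit":"2"}' \
    '{"ticket":"cc","votebit":"1"}' > "$sample"
tally_votes "$sample"
# prints "total=3 yes=2 no=1"
rm -f "$sample"
```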

The proposals-archive site serves the vote numbers from a caching layer using a very old version of politeia. My guess is that there is an issue somewhere in the caching layer, most likely one that affects the vote numbers when the cache is rebuilt. When the git backend was in production, we would regularly compare the vote numbers reported on the site against those in the git repos, so I don't think this issue ever impacted the production site.

Out of curiosity, I checked the Decred Journal to see if the vote numbers were reported for this proposal. The exact vote numbers were not, just that the proposal passed with 97% approval. https://www.publish0x.com/decredjournal/decred-journal-april-2021-xrydoqd
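The counts taken from the git repo are consistent with that rounded figure: 10994 yes votes out of 11361 total is roughly 96.8%. A quick check, where `approval_pct` is a hypothetical helper name:

```shell
# Hypothetical helper: approval percentage to one decimal place.
approval_pct() {
    yes=$1
    total=$2
    awk -v y="$yes" -v t="$total" 'BEGIN { printf "%.1f\n", y * 100 / t }'
}

approval_pct 10994 11361
# prints 96.8
```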

xaur commented 1 year ago

Thanks for checking vote count reporting in our published posts! In DJ we keep it short and often just state the approval/rejection+turnout percentages, and only add vote counts if they stand out.

However, Politeia Digest publishes vote results in greater detail and in many instances it included Yes/No vote counts. Its raw Markdown is stored here and can be grepped or even web searched for e.g. "figures".

0.1% error is not a big deal I guess, but if there is any greater error to report we can mention it in the next PD.

cc @RichardRed0x

lukebp commented 1 year ago

I spot checked a couple proposals.

Design domain budgets: 03.21 - 12.21
Politeia Digest reported 214 more yes votes than were recorded in the git repo. The no votes were accurate.
https://github.com/RichardRed0x/politeia-digest/blob/master/issue-042.md

Decred Journal 2021
Politeia Digest reported 143 more yes votes than were recorded in the git repo. The no votes were accurate.
https://github.com/RichardRed0x/politeia-digest/blob/master/issue-041.md

Decred Bug Bounty Program: Phase 3
No difference in vote counts.
https://github.com/RichardRed0x/politeia-digest/blob/master/issue-033.md

TinyDecred Budget
No difference in vote counts.
https://github.com/RichardRed0x/politeia-digest/blob/master/issue-026.md

Older proposals appear to be accurate. Newer proposals appear to have a +/- 0.1% discrepancy between the reported vote counts and the vote counts in the git repo. Not sure what the underlying issue was. My guess would be a concurrency bug in the caching layer that would get hit ~0.1% of the time. All of that code was deprecated and removed from the codebase a long time ago though.