siv-org / siv

Secure Internet Voting protocol
https://siv.org

Unlocking Votes is timing out #66

Open dsernst opened 2 years ago

dsernst commented 2 years ago

While making a demo video of election https://siv.org/admin/1645223145915/voters, we simulated an election w/:

Attempting to use the Unlock 176 Votes button was failing every time.

The window would show an alert() with the message [object Object].

Update: Added clearer error msg for Timeouts, and notify admin: https://github.com/dsernst/siv/commit/421933f228b1bcf8eb7f7bd242e62909bcf11045
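An alert() showing `[object Object]` usually means a non-string value (such as an error response object) was stringified by default. As a minimal sketch, not the code from the linked commit, a helper along these lines would produce a readable message instead. `describeError` is a hypothetical name.

```typescript
// Hypothetical helper: turn any thrown or returned value into a readable message.
function describeError(err: unknown): string {
  if (typeof err === 'string') return err
  if (err instanceof Error) return err.message
  try {
    // JSON.stringify can return undefined at runtime (e.g. for `undefined` input)
    const json: string | undefined = JSON.stringify(err)
    return json === undefined ? String(err) : json
  } catch {
    return String(err) // e.g. circular structures that JSON.stringify rejects
  }
}

// Default stringification is what produces the unhelpful alert text:
const defaultText = String({ error: 'timeout' }) // '[object Object]'
const readableText = describeError({ error: 'timeout' }) // '{"error":"timeout"}'
```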


Tracking down the error in Vercel > Functions > Error Logs showed that the api/${election_id}/admin/unlock endpoint was timing out at the 10s mark.

So I added some profiling code (https://github.com/dsernst/siv/commit/2205849fcd94af2578d91934b77bc62ec4018be5) to this endpoint to see what was taking so long:

It looks like ~90% of the time is being spent generating & uploading the shuffle proofs.
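For illustration, per-stage profiling of this kind can be sketched as a small timer that records the elapsed time between marks. This is an assumption about the general approach, not the code from the linked commit; `createProfiler`, `mark`, and `report` are hypothetical names.

```typescript
// Hypothetical stage timer; names are illustrative, not from the SIV codebase.
type Stage = { label: string; ms: number }

function createProfiler(now: () => number = Date.now) {
  const stages: Stage[] = []
  let last = now()
  return {
    stages,
    // Record how much time has passed since the previous mark (or since creation).
    mark(label: string): void {
      const t = now()
      stages.push({ label, ms: t - last })
      last = t
    },
    // Right-align labels and durations so the stages line up in a log.
    report(): string {
      const width = Math.max(0, ...stages.map((s) => s.label.length))
      return stages.map((s) => `${s.label.padStart(width)} ${String(s.ms).padStart(6)}ms`).join('\n')
    },
  }
}
```

Calling `mark('check jwt')` after each step and logging `report()` at the end shows which stage dominates the request.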

dsernst commented 2 years ago

A temp fix, for recording this video, is to disable the shuffle proof code...

But we definitely need a better solution for real world elections.

The ~10s or so that it's taking isn't too bad... the problem is just that it's exceeding Vercel's serverless functions timeouts, thus making it impossible to unlock an election. My estimate is that given current profiling, and that the shuffle proof code scales linearly with the number of votes, any election with > 300 ciphertexts (e.g. 50 people voting on 6 things each) is in danger of running into this issue.

dsernst commented 2 years ago

Temporarily disabling the shuffle proofs to be able to record this demo video... https://github.com/dsernst/siv/commit/424bb37bdd8bc557158eb4d7e6f2c30db03bd11d

dsernst commented 2 years ago

Reverted the hotfix that was disabling the shuffle proofs: https://github.com/dsernst/siv/commit/f3cc6b15e2e4fd0c31ec78145d1c370db1380e22

This will unblock Verifying Observers proof confirmations for small elections.

This remains an open issue for elections w/ 300+ ciphertexts.

dsernst commented 2 years ago

I think a good solution is to outsource all this longer-running cryptography to a dedicated server. We can set up a Heroku box to handle it, that can automatically go to sleep whenever it's not needed (the vast majority of the time), so we're not paying for unused hardware.

dsernst commented 2 years ago

Or fly.io servers — can run 3 free instances all month long

dsernst commented 2 years ago

Here's that WIP branch: https://github.com/dsernst/siv/tree/demo-vid-script

dsernst commented 1 year ago

Utah sample election timing out. Manually running it locally: unlocked 176 votes in 29364ms.

It has 5 items on the ballot. So that's 176 * 5 = 880 total votes being unlocked.

dsernst commented 1 year ago

Firebase functions are another option? They have a 1 hour limit, 16 GB memory, and can run npm libraries.

https://firebase.google.com/docs/functions/quotas

dsernst commented 1 year ago

https://github.com/dsernst/siv/commit/416ba4897522353ea71469b2c2def9d5cad85836 now skips generating shuffle proofs if there are no other verifying observers.
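The condition described here can be sketched roughly as follows. This is an assumption about the shape of the check, not the code from the commit; `Observer` and `shouldGenerateShuffleProofs` are hypothetical names.

```typescript
// Hypothetical names; only the condition itself comes from the comment above.
type Observer = { email: string }

// Assumption: shuffle proofs are only useful when someone other than the admin
// will verify them, so the expensive step is skipped when the admin is alone.
function shouldGenerateShuffleProofs(observers: Observer[], adminEmail: string): boolean {
  return observers.some((o) => o.email !== adminEmail)
}
```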

Unlocked 255 votes in 7485ms.

Old: 176 * 5 = 880 vote items, over 29364ms, or 29364 / 880 = 33.37ms/item

New: 255 * 5 = 1,275 vote items, over 7485ms, or 7485 / 1275 = 5.87ms/item

5.68x faster
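The per-item arithmetic behind that comparison, written out as a small check (the function name is just for illustration):

```typescript
// ms per vote item = total ms / (votes * ballot items per vote)
function msPerItem(votes: number, itemsPerVote: number, totalMs: number): number {
  return totalMs / (votes * itemsPerVote)
}

const before = msPerItem(176, 5, 29364) // with shuffle proofs
const after = msPerItem(255, 5, 7485) // shuffle proofs skipped
const speedup = before / after

console.log(before.toFixed(2), after.toFixed(2), speedup.toFixed(2))
```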

dsernst commented 1 year ago

Initial stress tests (using new npx ts-node db-data/2023-04-22-simulate-rand-votes.ts script):

  1. 🔑 Unlocked 4 votes with 4 columns (16 ciphertexts) in 3652ms. (228.25 ms/ciphertext)
  2. 🔑 Unlocked 108 votes with 4 columns (432 ciphertexts) in 3613ms. (8.36 ms/ciphertext)
  3. 🔑 Unlocked 5104 votes with 4 columns (20416 ciphertexts) in 64464ms. (3.16 ms/ciphertext)

dsernst commented 1 year ago

Now with parallelized decryption, per column:

On my computer:

  1. 🔑 Unlocked 100 votes with 4 columns (400 ciphertexts) in 3,583ms. (8.96 ms/ciphertext)
  2. 🔑 Unlocked 5104 votes with 4 columns (20416 ciphertexts) in 77,584ms. (3.80 ms/ciphertext)

Test 2 is not faster, as we were hoping it would be. BUT! In retrospect that makes some sense: nothing is actually running in parallel, since it's all on my laptop, in what is still just a single next dev server process.
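A rough sketch of what per-column parallelization could look like, under the assumption that each column is handed to some async `decryptColumn` function (hypothetical, supplied by the caller). How the real endpoint dispatches the work is not shown in this thread. The sketch also illustrates the point about the laptop result: `Promise.all` starts every column at once, but CPU-bound work inside one Node process still runs one piece at a time, so a speedup only appears if `decryptColumn` runs the work somewhere else (another function invocation, a worker thread, another machine).

```typescript
type Ciphertext = string // placeholder type for illustration
type Vote = Record<string, Ciphertext> // column name -> ciphertext

// Regroup votes by column so each column can be handed off independently.
function splitByColumn(votes: Vote[]): Record<string, Ciphertext[]> {
  const columns: Record<string, Ciphertext[]> = {}
  for (const vote of votes) {
    for (const column of Object.keys(vote)) {
      if (!columns[column]) columns[column] = []
      columns[column].push(vote[column])
    }
  }
  return columns
}

// `decryptColumn` is a hypothetical async function supplied by the caller.
async function decryptAllColumns(
  votes: Vote[],
  decryptColumn: (column: string, ciphertexts: Ciphertext[]) => Promise<string[]>,
): Promise<Record<string, string[]>> {
  const columns = splitByColumn(votes)
  const names = Object.keys(columns)
  // All columns are started at once; they only overlap if the work runs outside this thread.
  const results = await Promise.all(names.map((name) => decryptColumn(name, columns[name])))
  const decrypted: Record<string, string[]> = {}
  names.forEach((name, i) => {
    decrypted[name] = results[i]
  })
  return decrypted
}
```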

dsernst commented 1 year ago

Still on my local machine, but with more precise profiling:

                   init     0ms
              check jwt  2467ms
             preload db     0ms
       election exists?  2505ms
          election data     2ms
load votes, filter esig  8656ms
     remove auth tokens    15ms
                  split     8ms
            fastShuffle  5416ms
       decrypt parallel 62387ms
        store decrypted  4865ms

🔑 Unlocked 5104 votes with 4 columns (20416 ciphertexts) in 87,000ms. (4.26 ms/ciphertext)

dsernst commented 1 year ago

First attempt on Vercel, unlocking failed, but it did print this out first:

                   init     0ms
              check jwt   519ms
             preload db     3ms
       election exists?  1321ms
          election data     2ms
load votes, filter esig   977ms
     remove auth tokens    40ms
                  split    18ms
            fastShuffle 17148ms

Best guess is it's failing because of the 60 second timeout.

dsernst commented 1 year ago

Success on Vercel with 2000 votes, parallelized:

                   init     0ms
              check jwt   128ms
             preload db     2ms
       election exists?   280ms
          election data     0ms
load votes, filter esig   735ms
     remove auth tokens    21ms
                  split    14ms
            fastShuffle  6906ms
       decrypt parallel 24118ms
        store decrypted  1540ms

🔑 Unlocked 2000 votes with 4 columns (8000 ciphertexts) in 33,839ms. (4.23 ms/ciphertext)

dsernst commented 1 year ago

Non parallelized code, deployed on Vercel:

Parallelized code, deployed on Vercel:

                   init     0ms
              check jwt    97ms
             preload db     2ms
       election exists?   277ms
          election data     1ms
load votes, filter esig    21ms
     remove auth tokens     1ms
                  split     0ms
            fastShuffle   385ms
       decrypt parallel  1407ms
        store decrypted   174ms

🔑 Unlocked 100 votes with 4 columns (400 ciphertexts) in 2,460ms. (6.15 ms/ciphertext)

So parallelized was:

dsernst commented 1 year ago

On Vercel: 3000 votes x 4 columns, parallelized:

                   init     0ms
              check jwt    81ms
             preload db     0ms
       election exists?   315ms
          election data     1ms
load votes, filter esig   663ms
     remove auth tokens    15ms
                  split     3ms
            fastShuffle 10004ms
       decrypt parallel 36045ms
        store decrypted  2198ms

🔑 Unlocked 3000 votes with 4 columns (12000 ciphertexts) in 49,397ms. (4.12 ms/ciphertext)

dsernst commented 1 year ago

Ok, deployed parallelization code to main branch. Reran 3000 x 4 test to be sure:

                   init     0ms
              check jwt   378ms
             preload db     1ms
       election exists?   759ms
          election data     1ms
load votes, filter esig   641ms
     remove auth tokens     8ms
                  split     3ms
            fastShuffle 10149ms
       decrypt parallel 35300ms
        store decrypted  2056ms

🔑 Unlocked 3000 votes with 4 columns (12000 ciphertexts) in 49,398ms. (4.12 ms/ciphertext)

dsernst commented 1 year ago

Vercel parallelized, 20 votes x 6 cols:

                   init     0ms
              check jwt    79ms
             preload db     1ms
       election exists?   282ms
          election data     1ms
load votes, filter esig     0ms
     remove auth tokens     0ms
                  split     0ms
            fastShuffle   150ms
       decrypt parallel  1170ms
        store decrypted   117ms

🔑 Unlocked 20 votes with 6 columns (120 ciphertexts) in 1,894ms. (15.78 ms/ciphertext)

Vercel parallelized, 1000 votes x 6 cols:

                   init     0ms
              check jwt    80ms
             preload db     1ms
       election exists?   271ms
          election data     0ms
load votes, filter esig   753ms
     remove auth tokens    34ms
                  split     2ms
            fastShuffle  5208ms
       decrypt parallel 13934ms
        store decrypted   937ms

🔑 Unlocked 1000 votes with 6 columns (6000 ciphertexts) in 21,317ms. (3.55 ms/ciphertext)

Vercel parallelized, 3000 votes x 6 cols:

                   init     0ms
              check jwt   292ms
             preload db     1ms
       election exists?   543ms
          election data     0ms
load votes, filter esig  1064ms
     remove auth tokens    33ms
                  split     5ms
            fastShuffle 14973ms
       decrypt parallel 40585ms

Timed out at 60s, after decrypting but before reporting that the decrypted votes were stored. On refresh, though, all 3k votes had indeed been successfully unlocked.
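As a rough extrapolation from the numbers in this thread, and assuming cost stays linear at the ~4.12 ms/ciphertext measured in the 3000 x 4 runs (ignoring fixed overhead), the ceiling under a 60s limit can be estimated like this. The function name is illustrative.

```typescript
// Largest number of ciphertexts that fits in the time budget, assuming linear cost.
function maxCiphertexts(timeoutMs: number, msPerCiphertext: number): number {
  return Math.floor(timeoutMs / msPerCiphertext)
}

const ceiling = maxCiphertexts(60000, 4.12)
const attempted = 3000 * 6 // the 3000 votes x 6 cols run above

console.log({ ceiling, attempted, fits: attempted <= ceiling })
```

Under that assumption, 18,000 ciphertexts is past the estimated ceiling, which is consistent with the timeout seen on the 3000 x 6 run.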

dsernst commented 1 year ago

Notes on "pre-decrypting"

If (an election's only keyholder is admin@siv)
  && (admin@siv is already storing the decryption key in the db)

  then:
    admin@siv can "pre-decrypt" votes as they come in, with no adverse privacy implications
        not publishing them anywhere, just keeping them in a private part of the db

    then when election_admin hits "Unlock" btn, all the decryption is already done, greatly speeding things up
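
The notes above could be sketched in code roughly as follows. Every name here (`Election`, `canPreDecrypt`, `onVoteReceived`, the `decrypt` callback, the private store) is hypothetical and only mirrors the conditions in the notes; it is not the SIV implementation.

```typescript
// All names here are hypothetical; they only mirror the conditions in the notes above.
type Election = {
  keyholders: string[] // who holds the decryption key (or a share of it)
  storedPrivateKey?: string // set only if admin@siv already keeps the key in the db
}

const SIV_ADMIN = 'admin@siv' // placeholder, written as in the notes

// Both conditions from the notes must hold before pre-decrypting.
function canPreDecrypt(election: Election): boolean {
  const soleKeyholderIsSiv = election.keyholders.length === 1 && election.keyholders[0] === SIV_ADMIN
  return soleKeyholderIsSiv && election.storedPrivateKey !== undefined
}

// Called as each encrypted vote arrives. `decrypt` is supplied by the caller.
// Results go only into a private store, never anywhere public.
function onVoteReceived(
  election: Election,
  voteId: string,
  ciphertext: string,
  decrypt: (key: string, ciphertext: string) => string,
  privateStore: Map<string, string>,
): void {
  if (!canPreDecrypt(election) || election.storedPrivateKey === undefined) return
  privateStore.set(voteId, decrypt(election.storedPrivateKey, ciphertext))
}
```

When the election admin later hits "Unlock", the endpoint would only need to read from the private store rather than decrypt everything within the request.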