A BTreeSet seems likely to be a fairly efficient structure for performing comparisons between TCN sets. The ordering allows us to look in the right place for potential matches, and the B-Tree structure balances minimizing the amount of work performed with cache efficiency.
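As a minimal sketch of this matching strategy (the function name and the use of plain byte arrays in place of real TCN types are illustrative, not the PR's actual code):

```rust
use std::collections::BTreeSet;

/// Count how many candidate TCNs appear in the observed set.
/// Each `contains` call is an O(log n) descent through B-tree nodes,
/// which keeps lookups cache-friendly compared to a pointer-heavy tree.
fn count_matches(observed: &BTreeSet<[u8; 16]>, candidates: &[[u8; 16]]) -> usize {
    candidates.iter().filter(|c| observed.contains(*c)).count()
}

fn main() {
    // Hypothetical 16-byte TCNs, modeled as fixed byte arrays.
    let observed: BTreeSet<[u8; 16]> = [[1u8; 16], [2u8; 16], [3u8; 16]].into_iter().collect();
    let candidates = [[2u8; 16], [4u8; 16], [3u8; 16]];
    assert_eq!(count_matches(&observed, &candidates), 2);
}
```

For larger batches, `BTreeSet::intersection` on two sorted sets is an alternative, but point lookups suffice when only one side is kept as a set.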
This PR adds a test that compares matching observed TCNs against reported ones. The test essentially runs a lightweight simulation where one user randomly encounters `n` other users in each TCN broadcast window according to the Bernoulli distribution (i.e., the user has probability `p` of observing any TCN). These TCNs are saved as the expected observations, then mixed in with other TCNs that will not be included in reports. Finally, the user expands `n` reports to the TCNs they reveal, performs matching, and checks that the matched TCNs are the expected ones.
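The encounter step can be sketched as follows. This is an assumption-laden miniature, not the test's actual code: it uses a small dependency-free LCG in place of a real RNG, and `simulate_window` is a hypothetical name.

```rust
/// Advance a small LCG (PCG multiplier/increment) and return a value in [0, 1).
/// A real test would use the `rand` crate instead.
fn lcg(state: &mut u64) -> f64 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    (*state >> 33) as f64 / (1u64 << 31) as f64
}

/// One broadcast window: each of `n` other users' TCNs (here just indices)
/// is observed independently with probability `p` (Bernoulli trials).
fn simulate_window(n: usize, p: f64, seed: u64) -> Vec<usize> {
    let mut rng = seed;
    (0..n).filter(|_| lcg(&mut rng) < p).collect()
}

fn main() {
    let observed = simulate_window(10_000, 0.1, 42);
    // The count is binomially distributed with mean n * p = 1000.
    assert!(observed.len() > 500 && observed.len() < 1500);
}
```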
The test revealed a bug in generating reports that include `tcn_1`: the previous code inserted `tck_1` rather than `tck_0`.
The test also reports basic timing information:
```
~/c/c/TCN (match-btreeset|✔) $ cargo test --release match -- --nocapture
[...snipped...]
running 1 test
Expanding candidates
Comparing 950000 candidates against 60941 observations
Took 1.780357333s (expansion) + 18.400804ms (comparison) = 1.798758137s (total)
test match_btreeset ... ok
```
Here we were able to generate and match close to a million TCNs on a single thread on a laptop in under two seconds. High-end mobile devices have comparable performance; low-end devices will be slower, but probably within an order of magnitude. This timing also includes verification of all of the report signatures. Signature verification can be made significantly faster using batch verification, or dropped entirely by relying on a trusted server.
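The comparison phase alone can be reproduced in miniature with the standard library's timers. This sketch uses plain `u64` keys as hypothetical stand-ins for real 16-byte TCNs, so the absolute numbers are not comparable to the test's; it only shows the shape of the measurement.

```rust
use std::collections::BTreeSet;
use std::time::Instant;

/// Count candidates present in the observed set (the comparison phase).
fn count_matches(observed: &BTreeSet<u64>, candidates: &[u64]) -> usize {
    candidates.iter().filter(|c| observed.contains(c)).count()
}

fn main() {
    // Hypothetical data sized like the test run: ~60k observations, 950k candidates.
    let observed: BTreeSet<u64> = (0..60_000u64).map(|i| i * 16 + 1).collect();
    let candidates: Vec<u64> = (0..950_000u64).collect();

    let start = Instant::now();
    let matched = count_matches(&observed, &candidates);
    let comparison = start.elapsed();

    println!(
        "Compared {} candidates against {} observations in {:?}",
        candidates.len(),
        observed.len(),
        comparison
    );
    // Observations i*16 + 1 below 950_000 are also candidates: i up to 59_374.
    assert_eq!(matched, 59_375);
}
```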