PrivateStorageio / ZKAPAuthorizer

a Tahoe-LAFS storage-system plugin which authorizes storage operations based on privacy-respecting tokens

Implement double-use prevention in the server-side part #41

Open exarkun opened 4 years ago

exarkun commented 4 years ago

The intent is for a pass to be usable at most once. The server-side part should check for double-use as part of its validation and deny operations for which an already-used pass is being presented.

exarkun commented 4 years ago

The simplest solution would be to insert ZKAPs into a database and check for membership in the database to determine spent status. However, this database would quickly explode, I expect.
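A minimal sketch of the database approach, assuming SQLite as the store. The table name `spent`, the column name `zkap`, and the helper names are hypothetical and are not part of ZKAPAuthorizer. A primary-key constraint makes "check for membership, then insert" a single atomic step:

```python
import sqlite3


def open_spent_db(path=":memory:"):
    """Open (or create) the hypothetical spent-pass database."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS spent (zkap BLOB PRIMARY KEY)")
    return conn


def try_spend(conn, zkap: bytes) -> bool:
    """Record zkap as spent.

    Returns True if the pass was unspent, False if it was already present.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO spent (zkap) VALUES (?)", (zkap,))
    except sqlite3.IntegrityError:
        # The primary-key constraint fired: this pass was seen before.
        return False
    return True
```

Every spent pass stays in the table forever, which is the growth concern raised above.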

A slightly more complex solution would be to use something like a Bloom filter and check for membership in that. This addresses the size concern to a large degree ("More generally, fewer than 10 bits per element are required for a 1% false positive probability, independent of the size or number of elements in the set." -Wikipedia). Possibly the probabilistic nature of Bloom filters can be accounted for by having the server accept some ratio of "maybe spent" tokens instead of rejecting every request that includes any of these.
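A rough sketch of that idea, not a proposed implementation. The sizing uses the standard formulas m = -n ln(p) / (ln 2)^2 bits and k = (m/n) ln 2 hash functions, which give about 9.6 bits per element at p = 1%. The class, the `validate` helper, and the 5% threshold are all assumptions made for illustration:

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, capacity: int, error_rate: float):
        # m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hash functions
        self.num_bits = math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Double hashing: derive k bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def validate(passes, spent: BloomFilter, max_maybe_spent_ratio: float = 0.05) -> bool:
    """Accept a request if at most the given ratio of its passes look spent.

    The threshold is a hypothetical policy knob, not a recommended value.
    """
    if len(set(passes)) != len(passes):
        return False  # the same pass presented twice in one request
    maybe_spent = sum(1 for p in passes if p in spent)
    if maybe_spent > max_maybe_spent_ratio * len(passes):
        return False
    for p in passes:
        spent.add(p)
    return True
```

Note the trade-off in this policy: any request that stays under the threshold is accepted, so a genuinely double-spent pass can slip through alongside the false positives.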

exarkun commented 3 years ago

A couple references I am clearing out of my browser state: https://medium.com/datadriveninvestor/bloom-filter-a-simple-but-interesting-data-structure-37fd53b11606 https://github.com/jaybaird/python-bloomfilter

tomprince commented 3 years ago

Presumably we'd need a shared back-end (either shared database or a new service), between all storage nodes, to ensure that a token isn't spent against multiple nodes. (Alternatively, we could size/price the ZKAPs to allow each token to be spent for each node; however, I suspect that would entail additional complexities that we should avoid.)

It isn't clear to me that a Bloom filter is an appropriate data structure. I wouldn't think that false positives on having spent a token are something that is acceptable to users.

tomprince commented 3 years ago

> It isn't clear to me that a Bloom filter is an appropriate data structure. I wouldn't think that false positives on having spent a token are something that is acceptable to users.

That is, accepting possibly-already spent tokens is a trade-off that it would be reasonable for us to make. I don't think asking users to allow us to reject possibly-unspent tokens is a reasonable trade-off for users.

exarkun commented 3 years ago

> Alternatively, we could size/price the ZKAPs to allow each token to be spent for each node; however, I suspect that would entail additional complexities that we should avoid.

One such complexity is the fact that the number of nodes cannot be considered fixed and it is not obvious to me how you would adjust the price of ZKAPs in the face of this.

> It isn't clear to me that a Bloom filter is an appropriate data structure. I wouldn't think that false positives on having spent a token are something that is acceptable to users.
>
> That is, accepting possibly-already spent tokens is a trade-off that it would be reasonable for us to make. I don't think asking users to allow us to reject possibly-unspent tokens is a reasonable trade-off for users.

One mitigation we've considered for this is to adjust the price of ZKAPs so that statistically the user is still receiving the storage they expect to have paid for. That is, if the Bloom filter is constructed to have an N% false positive rate then increase the number of ZKAPs issued at the chosen price point by N%. It only provides a statistical mitigation, but considering the number of ZKAPs that a client will spend this seems likely to approach a real mitigation with high probability. At the very least, I think it would leave this part of the system much less deficient than other parts of the system - for example, the part which charges the same for 1 B × month of storage as it does for 1000000 B × months of storage.
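To make the arithmetic concrete, here is a small sketch with hypothetical function names. It assumes each pass is independently rejected with probability equal to the false positive rate p. Under that assumption the exact expected-value compensation is a factor of 1/(1 - p), which for small p is very close to the "increase by N%" rule described above:

```python
import math


def expected_usable(issued: int, false_positive_rate: float) -> float:
    """Expected number of passes that survive false-positive rejection."""
    return issued * (1.0 - false_positive_rate)


def passes_to_issue(paid_for: int, false_positive_rate: float) -> int:
    """Passes to issue so that the expected usable count is at least paid_for."""
    return math.ceil(paid_for / (1.0 - false_positive_rate))


# With p = 1% and 1000 passes paid for:
#   "increase by N%" issues 1010, of which about 999.9 are expected to be usable;
#   the 1/(1 - p) rule issues 1011.
print(expected_usable(1010, 0.01))
print(passes_to_issue(1000, 0.01))
```

The difference between the two rules is about one pass per hundred thousand issued at p = 1%, so the simpler "add N%" rule looks adequate at low false positive rates.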

meejah commented 3 years ago

At risk of muddying the waters, the "more context" for the APIs in Tahoe behind ZKAPs generally falls under a bundle of ideas and discussion usually branded "Economics Plugins". This particular API fell out of that, but it's not the only one discussed, and the discussion included wider concepts like Tahoe clients "bidding" to storage-providers etc.

Zooko has always been very insistent that previous iterations of this sort of work which gave users "too much" insight into and control over these micro-transactions were a mistake (Mojo Nation, e.g.). This led to his likening it to a jukebox or similar; you put some coins in and it does stuff. The absolute amount of money is small. If it doesn't do enough stuff per coin, you stop stuffing coins in. So he has been extremely insistent on wanting a UX/UI that hides most of this and just periodically asks the user for money; if they aren't happy with the service they got already, they leave (and that's fine).

Obviously, Zooko isn't making decisions here, but I do think he has good instincts based on decades of trying to make related systems work.

So I think what I'm saying is: it seems totally consistent with the above to use "statistics tricks" to make the overall experience good for the user. (I believe jean-paul has previously suggested a statistical redemption strategy too.) Users shouldn't even see stuff like "I had to spend one extra ZKAP on that one lease renewal because it showed as expired" etc.; they just see whatever UI we reveal (that is, the current "bar graph" thing that estimates how much storage-time is left).

(Practically-speaking: a simple database could be used "for now" and if it just had every spent ZKAP in it, everything could be inserted into a Bloom filter at any point in the future and the database deleted...)
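A sketch of what that later migration might look like. It assumes a SQLite table `spent` with a BLOB column `zkap`, fixed filter parameters, and a simple salted-SHA-256 hashing scheme; all of these are hypothetical choices for illustration. Because a Bloom filter has no false negatives, every pass recorded in the database still reads as spent after the migration:

```python
import hashlib
import sqlite3

NUM_BITS = 1 << 16  # hypothetical filter size; would be chosen from the row count
NUM_HASHES = 7      # roughly optimal for a 1% false positive target


def bit_positions(zkap: bytes):
    """Yield the bit positions for a pass, one per salted SHA-256 hash."""
    for i in range(NUM_HASHES):
        digest = hashlib.sha256(bytes([i]) + zkap).digest()
        yield int.from_bytes(digest[:8], "big") % NUM_BITS


def migrate(conn: sqlite3.Connection) -> bytearray:
    """Build a Bloom filter bit array from every row of the spent table."""
    bits = bytearray(NUM_BITS // 8)
    for (zkap,) in conn.execute("SELECT zkap FROM spent"):
        for pos in bit_positions(zkap):
            bits[pos // 8] |= 1 << (pos % 8)
    return bits


def maybe_spent(bits: bytearray, zkap: bytes) -> bool:
    """True if the pass is possibly spent; False means definitely unspent."""
    return all(bits[pos // 8] & (1 << (pos % 8)) for pos in bit_positions(zkap))
```

The migration is one-way: the original passes cannot be recovered from the filter, so the database should only be deleted once the filter has been persisted.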