Sybil attack to get user share revenue

dimitri-xyz commented 8 years ago

The current version of the documentation is (in my view) not yet sufficiently clear. I had a very hard time making sense of it. So, this is a preliminary assessment. A more formal outline would be useful.

This is a design bug. A malicious adversary can make 1000 "fake" ad-replacement personas that only browse the web once a month and see only 1 impression each. Through the currently proposed revenue share model and those personas, the adversary would get 1000 times the revenue of the average user.
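A minimal sketch of the arithmetic, assuming the pool is split equally among personas regardless of impressions. The pool size, population sizes, and impression counts are hypothetical numbers chosen for illustration; they are not taken from the specification.

```python
def per_user_payouts(pool, impressions_by_persona):
    """Split `pool` equally among personas, ignoring impression counts."""
    share = pool / len(impressions_by_persona)
    return {persona: share for persona in impressions_by_persona}

# Hypothetical population: 9,000 honest users with 300 impressions each,
# plus 1,000 attacker-controlled personas with 1 impression each.
honest = {f"user{i}": 300 for i in range(9000)}
sybils = {f"sybil{i}": 1 for i in range(1000)}

payouts = per_user_payouts(10_000.0, {**honest, **sybils})

honest_each = payouts["user0"]                       # one honest user's payout
attacker_total = sum(payouts[p] for p in sybils)     # attacker's combined payout
attacker_impression_share = sum(sybils.values()) / (
    sum(honest.values()) + sum(sybils.values())
)

print(honest_each, attacker_total, attacker_impression_share)
```

Under these assumptions the attacker collects 1000 times what one honest user receives while accounting for well under 0.1% of the impressions.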

This attack shows that dividing the user revenue on a "per user" basis is subject to manipulation. I understand that the underlying motivation for this model is that:

In order to enhance privacy, the payment to each ad-replacement persona is calculated independently of the actual impressions served to each persona -- Brave Software does not keep track of which personas are served which impressions.

However, I believe the principle can still be achieved by aggregation that is not subject to this attack. One could consider a "per impression" rather than "per user" revenue share model where the number of impressions is bundled in multiples of 128 (for example) to still allow for anonymity.
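A rough sketch of what a bundled "per impression" split could look like. It assumes counts are rounded down to a multiple of the bundle size; the proposal above does not specify a rounding rule, and the function name and inputs are hypothetical.

```python
BUNDLE = 128  # bundle size used as the example in the comment above

def per_impression_payouts(pool, impressions_by_persona, bundle=BUNDLE):
    """Split `pool` in proportion to impressions, counted in whole bundles.

    Assumption: each persona's count is rounded DOWN to a multiple of
    `bundle`, so only coarse counts are ever used in the calculation.
    """
    bundled = {p: (n // bundle) * bundle for p, n in impressions_by_persona.items()}
    total = sum(bundled.values())
    if total == 0:
        return {p: 0.0 for p in bundled}
    return {p: pool * b / total for p, b in bundled.items()}

# Hypothetical example: one honest persona and one single-impression persona.
payouts = per_impression_payouts(1000.0, {"honest": 300, "sybil": 1})
print(payouts)
```

With rounding down, a persona that sees a single impression earns nothing, and an attacker's return scales with the impressions actually generated instead of with the number of personas. A rule that rounds up would not have this property.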

burdges commented 8 years ago

Just a related aside :

Anonize seemingly does not provide post-quantum anonymity, while a simple blind signature scheme does. That's certainly no show stopper, but it should give you pause when you notice it touches such a huge dataset as "browsing history". Impressions might let you use blind signatures, which reduces complexity anyways.

It's subtle though : If you needed say Brands' blind signatures to prevent double spending asynchronously, then Anonize is probably better. It's far more likely that Brands' double spending protection would deanonymize users on the spot, maybe even in an attackable way, than that someone will build a quantum computer in 20 years and read everyone's old browsing histories from Anonize logs.

I suppose blind signatures require infrastructure you cannot realistically deploy initially, but maybe worth keeping in mind longer-term.

mrose17 commented 8 years ago

@dimitri-xyz - i agree with your meta-comment regarding the documentation: it is somewhat unclear. but i'm not sure what a "more formal outline" would be, could you sketch that out? you may want to submit a new issue just on the clarity issue...

thanks for the comment. i think it is a "non-issue" though, and here's why: in order for a user to take money out of the system, they have to have a verified bitcoin wallet, and the ledger allows only one user to make use of a particular phone number and/or a particular email address. the text is missing the word "unique", sorry!

@burdges - keep in mind that it's a browsing summary not a browsing history, i have updated the current version of the specification to make this clearer (i hope).

@abhvious - could you comment on @burdges note on post-quantum anonymity? i'm way out of my depth on that!

@dimitri-xyz & @burdges - please keep those comments coming!

my thanks!

/mtr

dimitri-xyz commented 8 years ago

@mrose17 - I will create a separate issue on the documentation clarity or make a pull request with a few suggestions then. It sounds appropriate to split them up.

On the Sybil attack, it seems your argument hinges on the assumption that email addresses and/or phone numbers are expensive to get. I am not sure about phone numbers, but it is very cheap to generate a large number of email addresses.

Any username at my domains (e.g. dimitriexample.com) is forwarded to a single large junk inbox (that I never really check). This means that I can generate email addresses on the fly simply by generating random strings and concatenating the suffix '@dimitriexample.com'. This should show that requiring multiple email addresses will not prevent this attack.
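A small sketch of how cheap this is, assuming a catch-all forwarding rule on the domain. The function name and the local-part length are hypothetical.

```python
import secrets

def throwaway_address(domain="dimitriexample.com", length=12):
    """Return a random address at a catch-all domain.

    `length` is the number of hex characters in the local part; it is
    expected to be even because token_hex works in whole bytes.
    """
    local = secrets.token_hex(length // 2)
    return f"{local}@{domain}"

# A thousand distinct-looking addresses, all landing in the same inbox.
addresses = {throwaway_address() for _ in range(1000)}
print(len(addresses), next(iter(addresses)))
```

Each address costs nothing beyond the domain itself, so a uniqueness check on email alone adds essentially no cost per persona.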

If you plan to require every user to be reachable by a distinct phone number and the cost of phone numbers is high to the attacker (the cost of the phone numbers must be higher than the expected Brave reward, otherwise it may still be profitable), then you may thwart this attack. But I am worried this solution might just be a temporary hack and that requiring a phone number might exclude a significant number of real users.
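The profitability condition above can be written down directly. This is only a sketch: the function name is hypothetical and the dollar amounts in the example are made up.

```python
def attack_profit(personas, reward_per_persona, cost_per_phone_number):
    """Net gain from running `personas` Sybil identities, one number each."""
    return personas * (reward_per_persona - cost_per_phone_number)

# Hypothetical figures: the attack pays only while a phone number costs
# less than the reward a single persona is expected to collect.
profitable = attack_profit(1000, reward_per_persona=5.0, cost_per_phone_number=2.0)
unprofitable = attack_profit(1000, reward_per_persona=5.0, cost_per_phone_number=8.0)
print(profitable, unprofitable)
```

The sign of the result depends only on whether the per-number cost exceeds the per-persona reward, which is why the number of personas does not help the defender.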

I hope this helps! Thanks for the initiative :-)

burdges commented 8 years ago

There is one Sybil attack that goes roughly as follows :

1. Create millions of personas using crap email addresses or VoIP providers.
2. Browse in ad-replacement to build up tiny balances.
3. Switch them into ad-free mode and browse only your own sites.

All ad networks face click fraud so this is nothing unusual and gets priced into ad costs. If anything, Brave seemingly makes click fraud slightly more complex and detectable.

Just spec out the plausible Sybil attacks. And build bots to detect weird Sybil-ish behavior like this once real money starts changing hands.
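One possible shape for such a detector, as a sketch only. It assumes the ledger can observe (persona, publisher) contribution pairs, which may not hold given the privacy goals discussed in this thread; the function name and both thresholds are hypothetical.

```python
from collections import defaultdict

def suspicious_publishers(contributions, min_personas=100, min_exclusive_ratio=0.9):
    """Flag publishers funded mostly by personas that fund nobody else.

    `contributions` is an iterable of (persona_id, publisher_id) pairs.
    """
    publishers_by_persona = defaultdict(set)
    for persona, publisher in contributions:
        publishers_by_persona[persona].add(publisher)

    personas_by_publisher = defaultdict(set)
    for persona, publishers in publishers_by_persona.items():
        for publisher in publishers:
            personas_by_publisher[publisher].add(persona)

    flagged = []
    for publisher, personas in personas_by_publisher.items():
        if len(personas) < min_personas:
            continue  # too few contributors to say anything
        exclusive = sum(1 for p in personas if len(publishers_by_persona[p]) == 1)
        if exclusive / len(personas) >= min_exclusive_ratio:
            flagged.append(publisher)
    return sorted(flagged)

# Hypothetical data: 200 Sybil personas fund one site, 200 ordinary
# personas each fund two different sites.
data = [(f"sybil{i}", "attacker.example") for i in range(200)]
for i in range(200):
    data.append((f"user{i}", "news.example"))
    data.append((f"user{i}", "blog.example"))

flagged = suspicious_publishers(data)
print(flagged)
```

A heuristic like this would also flag a small site with a loyal single-site audience, so it is better suited to triggering review than to blocking payouts.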

If it ever gets really messy, then do not let personas withdraw or convert from ad-replacement funds to ad-free funds, but only let ad-replacement personas contribute their funds to charities, rights organizations like the EFF, etc.

Afaik, there is no reason to require that personas have an email, phone, etc. either, just deal with it like everyone else deals with click fraud.

mrose17 commented 8 years ago

@dimitri-xyz & @burdges - the tension is that there actually is a cost (not a lot, but not insignificant) to having "control" of a phone number that does SMS, and it is effectively a requirement of AML/KYC (in the US, at least). email addresses can be amortized to the extent that they are actually free, but not phone numbers.

the cost varies, we won't know for sure until we try it, etc., but it is currently believed that it will not be cost effective for a "bad guy" to participate. (famous last words, obviously... but expect that these assumptions will be fine-tuned and watched very, very carefully).

the current thinking is that if you don't verify, you can "plow" the amounts back to the publishers of your choice, but that if you want to take funds out, you have to verify. that's what i think the spec says. if i am wrong on that, please let me know!

best,

/mtr

dimitri-xyz commented 8 years ago

@mrose17 Doesn't that then give the attacker the final steps he needs? Rather than pulling the money out (thus being subject to KYC and requiring access to 1000 phone numbers), he makes a single fake publisher profile and then plows all the money into this single publisher. Now he does not need the phone numbers and can get the money. Voilà!

mrose17 commented 8 years ago

except that publishers have a stricter verification process than others... viz.,

potentially large amounts are transferred to publishers -- so verification is more extensive, depending on the size and frequency of payments, e.g., similar to the verification spectrum seen for DV, OV, and EV certificates.

dimitri-xyz commented 8 years ago

I don't want to belabor the point. So, I'll just make a couple more comments:

The attacker's publisher is "legit" (like any small blogger), and the extra verification costs would have to be very significant to be higher than the 1000x multiplier.
I do believe we can deal with this situation if it arises. At the same time, I still think that a "per user" revenue share model is more likely to see this attack than a "per impression" model.

mrose17 commented 8 years ago

hi. those are fair points. i agree that an impression-based model is more accurate (and has better auditing capabilities), but the concern is that achieving privacy is much more difficult. an earlier version of the specification was based on impressions, but it requires too many levels of "double-blinding" to be satisfying with respect to being both accurate/auditable and private. i suspect that we'll have to revisit this later this year...

brave / ledger

Sybil attack to get user share revenue #1