Open Thunder33345 opened 5 years ago
My problem with POW is still the same: I think it's not possible to use it to block spam attacks and at the same time don't kill the user experience: Someone with a decent GPU can do 100x more calculation as a mobile phone can do, so if we request 100 minutes of calulation from a mobile user (I don't think anyone would wait that long to use the network), then a good GPU can still do it in 1 minute.
yeah POW has some scaling issues we cant have something that's relational to the device power not sure what else but it's an idea especially for me having to keep asking admins for increase in size of zerotalk maybe something like doing a POW that takes very long to unlock 1GB while yes it wont be instant but again it's not meant to block people from joining but blocking them so they cant do it all in one go
We could adopt a memory-bounded hashing function to provide such guarantee of equalities between users.
The idea is that both ASIC and GPU have very little RAM on-board, smaller than that of the Level2 cache of many processors (about 2MB), so that picking an hashing function that has to use a couple of megabytes of RAM to compute will have similar computing times no matter with which devices it is computed.
I proposed a concept of decentralized CAPTCHA for fighting spam. You can read about this idea here: https://grez911.github.io/captcha.html Happy to listen your thoughts.
@grez911 Just. Awesome. I will try to make proof of concept with your concept if you allow it.
@grez911 that's a really interesting idea i like how it needs 0 trusted node, one problem though how do we ensure everyone post "real chaptas" regarding step 4.5(solving captcha), is there anything enforcing users to go back to step 2? since we will be assuming they are attackers, they can just edit the code that forces user to go back to POW
also since the captcha isnt "related" to the process anyone could bulk fetch and precompute the chapta in time, all they do is try to use OCR and see if that hash matches as "hidden data" isnt really hidden from users who tries hard enough to attempt to view them
i do like the format and the infographics also maybe we can make a zerotalk thread or new issues for this? to not clutter here
Let's assume that we don't calculate current_post_id - 1 - random_number
, but we take the whole hash and take the remainder of dividing the hash by current_post_id
. This means that the probability of choosing each a post is equal.
Say, if there are 65536 captchas (the least possible count), and the attacker wants to post the first message, he will have to solve a captcha.
Then, when he wants to post the second message, he can either get a random captcha (and he will have to solve it), or he will do heavy PoW just to make sure that random captcha number points to the 1st spam message (because he knows the origin of its hash). When he makes more messages, it will be easier to post the next one, though.
Let's calculate the probability. If the attacker has made n messages, and he wants to make another one. There is a total of 65536 + n messages. He will get other's message with probability of 65536 / (65536 + n). Now, let's pretend that he chooses m different hashes (i.e. m different captchas). The probablity that he won't have his own captcha is (65536 / (65536 + n)) ^ m, where ^ stands for exponentiation. We want this probability to be small (because we want to make the process automated) -- let's take, say, 0.1% = 0.001. Now solve the equation (65536 / (65536 + n)) ^ m = 0.001 <=> m = log(65536 / (65536 + n))(0.001) (where log(a)(b) means logarithm of base a) <=> m = ln(0.001)/ln(65536 / (65536 + n)) = -7/ln(65536 / (65536 + n)).
Now, let's calculate how much PoW you would have to do to generate, say, 100 spam messages. After integrating this function from 1 to 100 (I tried summing up m(1..100) with Python and got similar results -- so I'll use integrating as a faster function) we get about 2 million PoW operations.
To make hashing slower, let's run 1000 iterations of SHA512 instead of one iteration of SHA256, as recommended by the document.
According to StackOverflow, you can calculate about 100 million SHA-512 hashes per second. Since we're running 1000 iterations, this is 100 000 PoW operations per second. This means that posting 100 spam messages will take you 20 seconds (WolframAlpha: (integrate -7/ln(65536 / (65536 + x)) from 1 to 100) / 100000
).
This may sounds like good result, but it isn't. If we want to generate 1000 spam messages, that becomes 32 seconds, not 200 seconds! (FYI: 1 000 000 messages = 88 seconds).
Uh-oh.
Let's run 100 000 iterations maybe? This means you get 1000 PoW ops/sec. This leads to half an hour to generate 100 messages, and 2.5 hours for 1 million messages.
The "1 million messages" case was calculated assuming that you can take captcha from, say, the very first post (i.e. there's no "hash from last 65536 posts" condition). If we assume that we have that condition, things get worse. (integrate -7/ln(1 - min(1, x / 65536)) from 1 to 1000000) / 1000
gives us 1.3 hours to make 1 million spam posts.
Er... What about 100 000 000 iterations? That gives us 1 PoW operation per second. So 23 days to post 100 messages... And 1 second to post with captcha! Seems like a good solution to me.
So, what I recommend is:
Run 100 000 000 iterations of SHA-512 instead of one iteration of SHA-256 and you'll be fine.
that's interesting i never know that's a possible way to attack, trying to get your own chaptas
i mean it as in, what if we assume everyone in the image board is dishonest and post fake unsolvable chaptas? or like people posting plain easily readable captcha
Some other considerations and questions about the idea, that by the way is really interesting since it seems to add far greater inequality (1:1000000) between real people and spammer by using something that humans can do and machines almost can't.
The CAPTCHAs should be lenghty enought and have salt so that a simple bruteforce is unviable (without salt and CAPTCHAs length <= 10 it should be possible to solve all of them [since they are unsalted] in about a week with a GPU)
Actually... I just found a bug in my calculations. It becomes 30 seconds to post with captcha on my computer (and it's rather fast). So maybe it would be a better solution to run 10 000 000 iterations of SHA-512. So this makes it 3 seconds to post on my machine, and about 2 days for attacker. Still good, though.
to 1 seems not ideal: as unless we have some sort of centralization that force users to post in certain difficulty that wont do it
what if the solution is hash(solution + publicly calculated value) but then to verify it we would have a problem as we need human to manually enter the solution assuming some 1 way thing that can proof it's correct without giving the solution away is a good way
Here's zkSNARKs implementation for JavaScript: https://github.com/iden3/snarkjs.
@imachug I didn't know there were zksnarks for Javascript already available! In this way one could have trusted parties (external sites) specialized in captchas and with some fixed amount of them (like some millions).
Why bother
Right now we have 0 defenses against spamming, maybe this could be useful Also this is NOT meant to exclude the possible damage of forcing POW as some users are mobile based and will struggle and lack the capability to legitimately post while the spammer on a mid tier rig can easily blast through POW This is mostly a barrier of sort, not meant to be the only line of defense against spamming but to be used in combination with other tools..
(FYI: i used the word POW signing to refer to computing a hash that matches requirement) Possible examples:
POW each publish
This is a simple system where each time you want to publish the data also need to be POW backed to publish, actually it dosent stop spammer, they could choose to spam everything in one go.
POW for file size
There should be a "POW sign" this to unlock XYZ more space EDIT: also there should be ways to have diffrent POW requirements, between certs providers and anonymous certs #1812 Example in contents.json for users:
In order to even do your first post you need a 4 zero hash check(explained by the 0 byte). Each time your data size reaches a new "milestone" you also need to create POW with matching regex. The *1gb means each time you reach a new multiples of 1 gb you also must sign it, say you reached 3 gb, then you must sign it 3 times.
for example in the user's content.json
the hash match function would be something like hash(auth address+size+site address+data) size and site address may be optionally excluded by owner's will, so that a hash can be signed once and used at elsewhere, and allow users to precompute huge hashes only based on their auth address
Other ways?
Only other ways i can think of is signing for every individual "entity" which depends on application, like zerome and zerotalk each have their own individual entity It would sure be nice to be able to make each entity to require signing but i cant see a way without huge complication in the json structure of each file Yes it's possible to do it front ended rather then at the network level, sure it hides the post but still dosent negate the fact people can spam and waste other's disk space