Tribler / tribler

Privacy enhanced BitTorrent client with P2P content discovery
https://www.tribler.org
GNU General Public License v3.0

Question regarding the security of the "micro-economy" of bandwidth tokens. #5430

Closed: ibudisteanu closed this issue 4 years ago

ibudisteanu commented 4 years ago

Hi! I have one big question regarding the security of the proposed micro-economy of bandwidth tokens, namely about a trust system that cannot detect forged bandwidth. I spent an hour looking into the many solutions proposed by the project's researchers over the years, so there is certainly content I have missed or not understood properly. Private torrent trackers are well known to be vulnerable to fake ratios (clients reporting fake uploads to trackers). Because there are many proposals and versions of Tribler's micro-economy, I might mix some things up. The last thing I remember reading in the papers (and wiki) is that Tribler is seriously considering the TrustChain solution with digital signatures.

The concept is great (blockchain-based), but here is my question: a "hacker" creates two accounts (public/private key pairs), fake account A and fake account B. He then digitally signs that 1) A sent 500 GB to B, and 2) B received 500 GB from A. These two certificates are later validated and stored in the TrustChain. How does your system detect this fraud? I doubt anyone can envision a system that solves this without a great amount of fraud. You can rely on the fact that the hacker's accounts A and B are new accounts without much reputation or trust from others. You can only use a voting-based system like Steemit Power, but it is well known that even Steemit is full of "botnets" that just publish and vote on each other's content. From my understanding so far, the token economy is fake, exactly like the ratio on private trackers: just a number that anyone can fake.

Is there a blockchain explorer for the TrustChain?

Thanks. If I have mixed things up, I would be more than happy to review your concept/model for building this micro-economy of bandwidth tokens.

devos50 commented 4 years ago

Great question!

In general, it is non-trivial to detect forged bandwidth. TrustChain is designed as a simple and lightweight accounting system, where fraud is eventually detected if users exchange records with each other. Our distributed ledger is not capable of verifying whether bandwidth has actually been transferred between two users, e.g., by leveraging some kind of "proof-of-bandwidth". TorPath solves this issue by delegating circuit creation to a centralized server; however, one of our design principles is to avoid any dependency on centralized servers. I also see some potential in Proof-of-Space-like solutions, where one proves shared knowledge (e.g., two users are challenged and have to prove that they both possess some torrent). How such a system would work in an anonymous setting like Tribler is still an open research question.
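
To make "fraud is eventually detected if users exchange records" concrete, here is a minimal sketch of one kind of detection: spotting a fork, i.e. two different blocks claiming the same position in one identity's chain. This is not TrustChain's actual code; the tuple layout `(public_key, seq, block_hash)` and the function name `find_forks` are assumptions made for illustration.

```python
from collections import defaultdict

def find_forks(records):
    """Return every chain position that has more than one distinct block.

    Each record is a hypothetical (public_key, seq, block_hash) tuple.
    """
    seen = defaultdict(set)
    for public_key, seq, block_hash in records:
        seen[(public_key, seq)].add(block_hash)
    return {pos: hashes for pos, hashes in seen.items() if len(hashes) > 1}

# Records this peer already holds, plus records received from a neighbour.
mine = [("A", 1, "h1"), ("A", 2, "h2")]
theirs = [("A", 2, "h2x"), ("B", 1, "g1")]

# Identity A has two different blocks at sequence number 2: evidence of a fork.
forks = find_forks(mine + theirs)
print(forks)
```

Note what this does not catch: two colluding identities that sign a fake transfer to each other produce chains that are internally consistent, so no fork ever appears. That case has to be handled by the reputation layer.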

The main idea of accounting bandwidth transfers is to eventually detect free-riders and refuse services to them (e.g., exit node or relay node traffic). Our envisioned solution is to leverage a reputation algorithm that determines the trustworthiness of individuals in the network. These algorithms are capable of detecting disproportionately large (fake) contributions and adjusting the ranking of users accordingly. Now, reputation is a multi-faceted and complex metric, and there has been much research on building Sybil-resistant algorithms. Our goal is to build a practical reputation mechanism, ready for deployment in Tribler and efficient enough for adoption. We have already made some great progress towards this goal; also see the discussion in this issue. Our prior work presents algorithms based on maximum-flow (accurate but slow) and personalized PageRank (less accurate but fast).
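
As an illustration only of why a maximum-flow reputation limits the fake-contribution attack, here is a minimal sketch. It is not Tribler's implementation: the peer names, the `work` graph, and the choice to read reputation as the maximum flow from a peer to the evaluator are all assumptions made for this example.

```python
from collections import defaultdict, deque

def max_flow(edges, source, sink):
    """Edmonds-Karp maximum flow. `edges` maps (u, v) -> capacity."""
    if source == sink:
        return 0
    residual = defaultdict(lambda: defaultdict(int))
    for (u, v), capacity in edges.items():
        residual[u][v] += capacity
        residual[v][u] += 0  # make sure the reverse edge exists
    flow = 0
    while True:
        # Breadth-first search for the shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, capacity in residual[u].items():
                if capacity > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

# (uploader, downloader) -> GB claimed in signed records (hypothetical data).
work = {
    ("bob", "alice"): 10,         # real upload that alice observed herself
    ("sybil_b", "bob"): 1,        # small real upload into the honest network
    ("sybil_a", "sybil_b"): 500,  # fake transfer between two Sybil accounts
    ("sybil_b", "sybil_a"): 500,
}

# From alice's point of view, the 500 GB claim is capped by the 1 GB edge
# that connects the Sybil pair to peers she has actually dealt with.
print(max_flow(work, "sybil_a", "alice"))
print(max_flow(work, "bob", "alice"))
```

The design point is that records between two accounts nobody else has interacted with add capacity only inside the Sybil region; the flow that reaches the evaluator is bounded by real work done for honest peers.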

Another concern is the white-washing attack, where a peer simply deletes their key pair and re-joins the system under a new identity. Much research on peer-to-peer technology solves this issue through a stranger policy that places less trust in users that are new to the system. An adequate stranger policy should incentivize users to improve their current profile instead of creating new ones.

A TrustChain block explorer can be found here. At the time of writing, TrustChain has over 120 million records and I'm currently in the process of analyzing this data. Together with data collected from Tribler, it shows that Tribler is effective at preventing free-riding behaviour. Still, there are many open questions and targeted attacks. For example, one of our researchers is working full-time on a state-of-the-art accounting mechanism with far better fraud detection times than TrustChain. You can track the progress of this system on this issue.

> Thanks. If I mixed up things, I would be more than happy reviewing your concept/model for building this micro-economy of bandwidth tokens.

We can definitely talk more about this! All input is welcome. I am currently working towards a scientific publication that outlines the whole process of bandwidth accounting, trust computations and eventually refusal of services to free-riders 👍

synctext commented 4 years ago

> fake A account and fake B account. He then signs digitally that: 1) A sent to B 500 GB and 2) B signs that he received 500 GB from A. These two certificates are then later validated and stored in the TrustChain. How does your system detect this fraud? I doubt anyone can envision a system that solves this without having a great amount of fraud.

Very nice question. Basically, this is impossible to solve. New accounts start with zero reputation and no trust at all in them. After a few weeks and numerous interactions, they slowly grow into a neighbourhood or community of peers and gain some trust. For 20 years scientists have speculated about such systems and Big Tech has built them. However, nobody has ever realised a generic approach.

> Because there are many proposals and versions of the Tribler's micro economy, I might mix some things up.

Yeah :-) We've tried lots of things in the past 13 years and have the money to continue till September 2025. We hope to show that this stuff works one day; irrefutably with a million people or so.

> You can only use a voting based system like the (Steemit Power) but it is well known that even Steemit is full of "botnets" that are just publishing and voting each other content's. From my understanding so far, the token economy is fake exactly like the ratio on private trackers.

Yes, the problem is as hard to eradicate as fake bookkeeping at Enron, Bernie Madoff, and Lehman Brothers. I believe digital botnets and corporate scandals represent the same fundamental underlying problem: what is the truth?

ibudisteanu commented 4 years ago

Hi! Thanks for the answer! Very kind of you to take the time to answer my security concern. I really appreciate it.

Here is the thing. Although I am an expert in blockchain and file-sharing protocols, it took me literally one hour after discovering your project and skimming through your papers to find out that your trust system can be defrauded by anyone creating fake bandwidth tokens from thin air. Your base assumption is that although there might be some bad apples in the system (the people who will have free and unlimited tokens), most people will behave correctly. If the project grows and gains popularity, there will be free software tools available on the internet to create unlimited free bandwidth tokens with the click of a button.

Moreover, a more serious security implication arises because your system allows users to transfer tokens to each other. This implication does not exist in private trackers that "track" users' upload ratio. Because the tokens can be transferred from one person to another, a hacker can create a web faucet offering free and unlimited bandwidth tokens that anyone can claim with the click of a button.

Taking into consideration the above two security implications, your bandwidth tokens can never have REAL VALUE (be worth something in the physical or digital world), simply because anyone can create an unlimited number of tokens out of thin air and exchange them for any asset that is worth more than zero. I even described a hacker who can develop a free faucet to share unlimited tokens.

A third security concern arising from creating free bandwidth tokens out of thin air is spamming your TrustChain with an unlimited number of transactions, flooding your proposing nodes. The attacker can create an unlimited number of transactions, generating a new account for every single transaction.

You said that better trust is assured by your own onion routing protocol (btw, congrats on implementing your own onion routing). Can't the user simply disable onion routing by choosing to download directly, peer to peer, between the hacker's account A and the hacker's account B?

Even if you implemented TorPath and required all users to use onion nodes to improve trust in token creation, the security concern about token creation would still exist. According to the paper: "Colluding clients and attackers need to control all four components of a circuit to mine TorCoins fraudulently. Even if an adversary controls up to half the network, only (1/2)^4 = 1/16 of assigned circuits will be fully colluding." But what if the hacker chooses only 1 hop in the circuit and asks the Tor master to keep building circuits until the attacker's computer is connected to its own onion relay? Once the attacker is connected to its own relay, the attacker can create unlimited tokens from thin air.

LE: I can't access the TrustChain explorer. Can you post the link again? I am more than curious to check it out!

LE2: Although TorPath is centralized, I love it! Thank you so much for pointing it out! I really love the concept! Did you implement TorPath?
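
The arithmetic behind the quoted 1/16 figure, and behind the retry concern, can be sketched as follows. The assumptions are labeled here and in the comments: circuit components are assigned independently, the attacker controls a fixed fraction of the network, and the attacker may request new circuits without cost or limit. Whether TorPath actually permits such retries is not established in this thread.

```python
def fully_colluding_probability(fraction_controlled, components):
    """Chance that every independently assigned component belongs to the attacker."""
    return fraction_controlled ** components

def expected_attempts(success_probability):
    """Mean number of tries until the first success (geometric distribution)."""
    return 1 / success_probability

# Assumption: the attacker controls half the network, as in the quoted passage.
p_four = fully_colluding_probability(0.5, 4)  # all four components collude
p_one = fully_colluding_probability(0.5, 1)   # only one component needs to collude

print(p_four, expected_attempts(p_four))
print(p_one, expected_attempts(p_one))
```

Under these assumptions, a per-circuit probability of 1/16 still means a fully colluding circuit turns up after 16 attempts on average, and fewer required components make it correspondingly cheaper.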

synctext commented 4 years ago

Very helpful! Currently thinking of creating such a "continuous attacker" and relentless 1 Gbps spammer on our production network. We have some scripts for that, but should finally just start to do chaos engineering. Correct, tokens will never have real value unless people trust the reputation system, ecosystem, and (self-)governance layer, and life-long strong identities are real, hopefully self-sovereign ones. Addressing identity fraud would partly address the Sybil attack, I think.

ibudisteanu commented 4 years ago

@synctext Nobody can trust a token to hold value when anyone can create a great number of them, free of charge, in just a few clicks...

synctext commented 4 years ago

> @synctext Nobody can trust a token with value that in just a few clicks you create a great number of them free of charge...

Fully agree. It's hard. We published a solution for this problem in 2010. We can switch away from BitTorrent and move to full encryption with random witness selection. If we add a challenge/response extension to this 2010 idea, the transfer actually has to happen: the witness needs proof that both parties have the (huge) block stored. Do you see that working security-wise?
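
A minimal sketch of what such a challenge/response check could look like, under stated assumptions: the witness sends a fresh random nonce, each party answers with a hash over the nonce and the block, and the witness compares the two answers. The function names, the use of SHA-256, and the message layout are illustrative choices, not the 2010 design or any deployed protocol.

```python
import hashlib
import os

def respond(block: bytes, nonce: bytes) -> str:
    """Answer a challenge; requires holding the full block when the nonce arrives."""
    return hashlib.sha256(nonce + block).hexdigest()

def witness_check(sender_response: str, receiver_response: str) -> bool:
    """The witness accepts only if both parties give the same answer."""
    return sender_response == receiver_response

block = os.urandom(1024)  # stand-in for the (huge) transferred block
nonce = os.urandom(16)    # fresh per challenge, so answers cannot be precomputed

# Both parties really hold the block.
honest = witness_check(respond(block, nonce), respond(block, nonce))

# The receiver never got the block and answers from an empty buffer.
cheater = witness_check(respond(block, nonce), respond(b"", nonce))

print(honest, cheater)
```

The check shows that both answers came from the same data at challenge time. It does not show that the data crossed the network, and it does not rule out a single attacker who controls both accounts and simply answers twice from one copy.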

ibudisteanu commented 4 years ago

@synctext after quickly skimming through your proposed work, I believe this "micro-economy" of bandwidth currency can never be achieved. Your proposed solution from 2010 has the following security concerns:

  1. Centralized server which verifies the proofs. People have to trust the centralized server that verifies the proofs.
  2. Network congestion. Your centralized servers have to verify a considerable percentage of proofs. You can't achieve this because you will receive billions of proofs per second, and this number increases as the popularity of your protocol grows.
  3. You can't stop the hacker. You can only slow him down. Specifically, the hacker can create free tokens by creating an unlimited number of accounts that each upload only a couple of fake packets. After a few fake packets have been approved, because the server can't validate all the packets, the hacker creates two new identities and transfers the tokens.

LE: Also, this doesn't solve the case where the attacker controls both the sender and receiver accounts.

synctext commented 4 years ago

Thank you @ibudisteanu for reading all our work and doing such a quick analysis! (Not even the students in our lab do that.) You are correct, our work can be attacked, but I believe it's fixable.

> Centralized server which verifies the proofs.

All our work is uncompromisingly decentralised. Here is a copy of the Trustchain design; everybody should be able to validate their value. We use a consensus-free and leaderless approach.

But that Trustchain article is outdated; it requires this new reputation approach with mathematical proofs (i.e., unreadable for non-specialists, and it gives me a headache too). You are certainly confronting us with how bad our documentation is. We pushed out 75+ master thesis reports and 100+ scientific articles over the years. None of them tell the story in a clean and easy way. Oh yeah, as our thinking evolved, it also lost internal consistency.

ibudisteanu commented 4 years ago

Here is the thing. I am not so sure you understood my point. The decentralization of Bitcoin is not due to the fact that "everybody should be able to validate their value", but due to the fact that anybody who solves the PoW challenge can propose new blocks in the blockchain, setting the flow of the transactions. Any decentralized consensus algorithm has to solve the Byzantine generals problem. I don't understand how the 2017 Trustchain decentralized gossip protocol proposed by your team solves the BFT problem. This is the reason why IOTA uses the coordinator.

What is the 2017 Trustchain DAG/blockchain size? Do peers have to download the previous blockchain history, or is there snapshotting of account balances (like in IOTA)? How is the state trusted without downloading the entire blockchain history? Who is making these snapshots?

In the paper it is not clear (vaguely described, imo) how consensus is achieved between the nodes. Namely, how does the honest network decide which block is the real block in case of a fork or a double-spending attack (an attacker proposes two distinct fork chains/blocks)? Some nodes initially consider these blocks, while others consider other blocks.

LE: I am just trying to figure out what you are trying to achieve and whether this thing is achievable. As I stated, it looks like you are trying to solve something that can't be solved.

synctext commented 4 years ago

> I don't understand how 2017 Trustchain decentralized gossip protocol proposed by your team solves BFT problem.

Hopefully I've understood your reasoning now. That is the magic sauce: we use a consensus-free approach. Some people now call this "extreme sharding" (also in the blockchain/DLT world). This avoids performance-constraining BFT matters. We don't have any BFT or consensus at all. With 2017 Trustchain (nice name btw) each transaction is a single block, the microblock approach. No need to make snapshots. Everybody has their own (limited) personal view. Due to the unbounded growth of Bitcoin/Trustchain-type data structures, we have done "pruning" research.
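
To picture "each transaction is a single block" and "everybody has their own personal view", here is a minimal sketch of a per-identity hash chain that can be checked locally, with no global consensus step. It is an illustration under assumptions, not the Trustchain format: the field names, the JSON encoding, and the `GENESIS` marker are hypothetical, and signatures from both parties are omitted to keep the example dependency-free.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical marker for "no previous block"

def block_hash(block):
    """Hash a block over a canonical JSON encoding."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def append_block(chain, owner, counterparty, up, down):
    """Record one transaction as one block on the owner's personal chain."""
    block = {
        "owner": owner,
        "seq": len(chain) + 1,
        "prev": block_hash(chain[-1]) if chain else GENESIS,
        "counterparty": counterparty,
        "up": up,      # bytes uploaded to the counterparty
        "down": down,  # bytes downloaded from the counterparty
    }
    chain.append(block)
    return block

def chain_is_consistent(chain):
    """Check sequence numbers and hash links using only this one chain."""
    prev = GENESIS
    for expected_seq, block in enumerate(chain, start=1):
        if block["seq"] != expected_seq or block["prev"] != prev:
            return False
        prev = block_hash(block)
    return True

alice = []
append_block(alice, "alice", "bob", up=100, down=0)
append_block(alice, "alice", "carol", up=0, down=40)
ok = chain_is_consistent(alice)

# Rewriting history breaks the link stored in the next block.
alice[0]["up"] = 10_000
tampered_ok = chain_is_consistent(alice)

print(ok, tampered_ok)
```

Such a check only establishes that one identity's history is internally consistent. Whether the recorded transfers really happened is a separate question that this structure does not answer.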

Perhaps helpful is a paper by Harvard, Berkeley and Delft for generic non-IOTA non-BFT distributed accounting without fraud.

ibudisteanu commented 4 years ago

There must be a catch that I don't get. I feel a little confused by the materials you are providing. Can you point me clearly to the paper/documentation that Tribler is using to create this trust-less, fraud-free distributed accounting system? When I started this issue, you confirmed that the Tribler vision is to become a fraud-free, trust-less distributed accounting (bandwidth token) system in the future. It means right now it is not fraud-free.

If the 2010 work "Work Accounting Mechanisms: Theory and Practice" creates consensus in a trust-less system for distributed accounting without fraud, then it is game over for all blockchain-based systems (because blockchains are slow, pruning is hard, they require a full sync, SPV is not great, PoW is unnecessary and useless, block generation time is reduced, classic blockchains are not very scalable, sharding, etc.). Where is the catch?

synctext commented 4 years ago

It is truly a pleasure to be having this conversation! I'm doing my utmost to "unconfuse" matters for you. The problem is that our "trust-less fraud-free distributed accounting system" is fully operational, but still has known attacks, for which we are aware of prior published scientific literature to plug the holes. Trustchain is simple in principle, just a DAG. But it becomes maddeningly complex when you put all the attack prevention measures in place, along with pruning algorithms for real-world efficiency and concurrency mechanisms (participants have multiple outstanding signature requests) that only the most skilled hackers can implement error-free.

> it is game over for all blockchain based systems (because blockchains are slow, pruning is hard, requires full sync, SPV is not great, POW not necessary and useless, block generation time is reduced, classic blockchain is Not very scalable, sharding etc.). Where is the catch ?

Historical insight, you formulated that nicely! Like Wikipedia before us, how do you prove your leaderless emergent approach is superior? Like Wikipedia, it only works in practice, not in theory. Blockchains will always be the first pioneers, ledgers are nice, but leaderless DAG-based systems might one day deliver 🥇 We have been working on this stuff since pre-Bitcoin days; progress is slow. The brilliant thing that Bitcoin did is merging integrity with consistency; a chain data structure does make the whole thing a lot simpler. But it is an evolutionary dead end, indeed "game over".

> It means right now it is not fraud-free.

Correct. We're incrementally improving our code, while still holding on to the BitTorrent protocol fully, and partially the Tor messaging protocol for onion routing. Each of our tokens is unique, just like a car you buy. You only have solid proof of quality after 20 years of driving it yourself, Nobel prize economics stuff. The magic secret we don't discuss in this line of reasoning is that "each token is its own micro currency". Coin mining can be secured with standard cryptographic measures and randomly appointed witnesses. These witness signatures and the distributed protocol ensure that actual "coin minting" has taken place, that the info is public, and that it is cheap to verify. Is that making any sense?

The reason why we don't prioritise fraud-free coin minting is that solving the tragedy of the commons is the easier Nobel-prize-level problem to crack: the emergence of cooperation in a human collective of any size. Some "leakage" is allowed there. Only once you have solved that should you dare to attempt to re-invent money.

ibudisteanu commented 4 years ago

Thanks for the explanation. I will let it be. Because we have common interests, I will keep an eye on Tribler.

synctext commented 4 years ago

Security lessons learned. Note for the future: