Open grimadas opened 5 years ago
You can also see some double-spend experiments with network knowledge for Trustchain and Tribler: Issue #4634. It works really well in practice. Now we need to generalize it and prove it works for other networks.
Draft for architectural diagram: Description: TBA
I get the overall idea (which is nice!). I have some comments to improve the presentation and clarity of your diagram:
@qstokkink's work is basically the "Peer Discovery" module in @grimadas's architectural diagram above. With this module, we want to be sure that peers get to know as many peers as possible. We still need to run tests with larger graphs.
Next sprint will be on the next component of the network layer, which we call the "fraud resilient overlay." Basically, this component will enable the nodes to build their 1-hop neighborhoods in the overlay network. The idea is to choose neighbors that are as diverse as possible with respect to latency measurements, with the aim of decreasing the probability of a sybil appearing in one's neighborhood (see details here).
The graph below is a simple representation of how much we would decrease this probability. I used this latency dataset with 490 nodes and assumed that one of the peers creates 50 sybils around itself. On the x-axis are the nodes. On the y-axis, we see the probability of the respective node having a sybil as its first neighbor. The blue line indicates random selection of nodes; the orange line shows our strategy.
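The intuition can be illustrated with a toy simulation (all numbers and the latency model below are illustrative assumptions, not the dataset or strategy above): sybils cluster at near-identical latency around the attacker, so collapsing near-equal latencies into one candidate slot shrinks their share of the neighbor selection.

```python
import random

def pick_first_neighbor(latencies, strategy, resolution=1.0):
    """Pick one neighbor: uniformly at random, or latency-diverse."""
    if strategy == "random":
        return random.choice(list(latencies))
    # "diverse": collapse peers with near-identical latency into one class,
    # so a sybil cluster (all sitting "around" the attacker) fills one slot
    classes = {}
    for peer, rtt in latencies.items():
        classes.setdefault(round(rtt / resolution), []).append(peer)
    return random.choice(random.choice(list(classes.values())))

random.seed(3)
honest = {i: random.uniform(5.0, 300.0) for i in range(490)}
# 50 sybils, all at roughly the attacker's latency
sybils = {490 + j: 100.0 + random.uniform(-0.2, 0.2) for j in range(50)}
latencies = {**honest, **sybils}

prob = {}
for strategy in ("random", "diverse"):
    trials = 20000
    hits = sum(pick_first_neighbor(latencies, strategy) in sybils
               for _ in range(trials))
    prob[strategy] = hits / trials
print(prob)  # the diverse strategy picks a sybil far less often
```

The `resolution` parameter is a made-up knob for what counts as "the same latency"; the point is only that a cluster of clones occupies one slot instead of fifty.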
Next sprint: Building overlay network
First results of the overlay network, without diversity function F, for 64 nodes. The ping time is artificially delayed by the absolute distance in node ids. After +-70 seconds you know most of the network. After +-150 seconds all 64 peers have at least 14 diverse peers and have matched with at least 5 peers.
Initial findings:
The following graphs show the peers (number of open connections), ancestors (number of peers included in diverse introduction chains) and partners (the number of matching peers using the greedy b-matching).
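For reference, a greedy b-matching can be sketched as below (a generic textbook-style sketch, not the implementation used in these experiments; edge weights here stand in for whatever pair score the overlay uses):

```python
def greedy_b_matching(edges, b):
    """Greedy b-matching: scan edges by descending weight and accept an
    edge if both endpoints still have fewer than b partners."""
    degree = {}
    matched = []
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        if degree.get(u, 0) < b and degree.get(v, 0) < b:
            matched.append((u, v))
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
    return matched

# Toy example: weights could be, e.g., a diversity score for the peer pair
edges = [("a", "b", 0.9), ("a", "c", 0.8), ("b", "c", 0.7),
         ("c", "d", 0.6), ("a", "d", 0.5)]
partners = greedy_b_matching(edges, b=2)
print(partners)
```

Note that in this tiny example peer `d` ends up with no partner at all, the kind of corner case that shows up in small networks.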
After scaling up to 300 nodes, we get more partners using the b-matching protocol. Further finding:
This is with the same settings as the last post, now with the F function implemented (we use kernel density estimation to approximate a given distribution of ping delays, 1/x in this example).
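The F function step can be sketched roughly like this (a minimal, illustrative take on kernel density estimation against a 1/x target; the bandwidth, delays and scoring rule are assumptions, not the experiment's parameters):

```python
import math

def kde(samples, x, bandwidth=5.0):
    """Gaussian kernel density estimate over the current partners' delays."""
    if not samples:
        return 0.0
    return sum(math.exp(-((x - s) / bandwidth) ** 2 / 2) for s in samples) / (
        len(samples) * bandwidth * math.sqrt(2 * math.pi))

def target(x):
    """Desired delay density, proportional to 1/x (unnormalised)."""
    return 1.0 / x

def diversity_score(partner_delays, candidate_delay):
    """F: prefer candidates where the target density exceeds the estimate."""
    return target(candidate_delay) - kde(partner_delays, candidate_delay)

partners = [10.0, 12.0, 15.0, 90.0]  # current partners' ping delays (ms)
scores = {c: diversity_score(partners, c) for c in (11.0, 40.0, 200.0)}
for cand, score in scores.items():
    print(cand, round(score, 4))
```

Under a 1/x target, an under-represented mid-range delay (40 ms here) scores higher than a far-away one (200 ms), since the target density itself decays with delay.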
By periodically removing peers that have not been included in our latency edges, we get a more stable set of peers to select for matching (and a higher number for that matter).
Without cleanup:
With cleanup:
As our matching algorithm is stable, existing partners are not dropped when a more favorable peer becomes available. Especially in smaller networks, this leads to some peers having very few partners:
By properly pruning the matching partners we get better results from the stable b-matching algorithm. Most of the nasty side effects (in terms of messaging) of periodic cleanup seem to be patched up by having a larger number of maximum matches.
Up to 30 matching partners:
Up to 60 matching partners:
These results are for a network of 300 peers, where every peer has 180 open connections (of which roughly 120 are considered to have a unique latency).
We are building an optimistic P2P system for enabling decentralised applications, one that values scalability and decentralisation the most. Being optimistic, we need to handle occasional fraud in the system.
To handle fraud we need a reliable, or at least predictable, overlay. If a malicious node cannot hide a transaction, we can detect the fraud.
We want to build an overlay that works in a highly dynamic environment; for that we need the following main modules:
Nice insights from the database literature:
Many systems assume full membership and/or full network knowledge. This is not realistic in a dynamic network. We opt for a partial view of the network. Such a partial view must not be biased and must work under churn.
One of the most interesting approaches is based on gossip:
SWIM: a membership protocol based on suspicion; a little write-up on early proposals in the field. I don't think it will work in our case though.
SCAMP: Peer-to-peer lightweight membership service for large-scale group communication
Of course, these protocols will not work if you have unlimited Sybils that can poison your local view and isolate you from the network. A combination of a Sybil mitigation strategy and Brahms is definitely a good idea.
A membership protocol is typically used to get connected and to make it hard to isolate a node or partition the network. But these works never argue how to select partners for long-term communication that would be optimal for you.
We can also select peers based on capacity (declared bandwidth, or a declared collateral balance for being a money router). We might then optimise the delivery of messages to a specific peer and build the system around that peer. Let's look at the concept of being interested in a peer as a topic in a PubSub system. Periodically a peer declares an interest in peers as a vector of topics (or a vector of trust). The overlay network organises itself into a pubsub system that gives guarantees on convergence and risk. For example, if you see that a peer has too many subscribers with low capacity, you might prefer some other peer.
For inspiration, you might look into:
Some of the gossip protocols require a verifiable neighbourhood, which is either built based on hashes or is always limited. One of the main downsides of gossip is freeriding, which in our case is Byzantine behaviour (a peer not forwarding information in order to hide certain transactions). More on that in the accountability section.
If we say that information forwarding is crucial, we might want to track whether everybody is following the protocol and incentivise good behaviour.
One of the earliest practical proposals is
Lifting: Freerider tracking in Gossip
Acting: Accurate freerider tracking in gossip, 2014
Duck attack on accountability systems, 2017
- Making P2P Accountable without Losing Privacy
We say that the resolution policy is application specific and we aim for deterministic resolution in P2P. There are three ways to resolve disputes/frauds. For simplicity we look at resolving two conflicts:
We don't need to solve distinct identities and one-man-one-identity; group distinctness is enough: given k beacons, your identity is the vector of measurements to them. The problem with these approaches is the beacons, which are a central point of the system. You can make them fault tolerant, but it is still a question how to choose them.
Another line of work that can help to detect sybils, also based on beacons, uses virtual coordinates: you try to map Internet delays, with some error, into a Euclidean space.
Most famous one is:
Another interesting paper is based on rings, which are similar to our buckets: Meridian: a lightweight network location service without virtual coordinates, 2005.
In short, this is an area with established related work; we should be careful not to miss it.
We can also select peers based on the capacity(declared bandwidth, or declared collateral balance for being a money router).
There should be no need to depend on declared bandwidth, as we partner based on latency. The less bandwidth you have available, the larger the RTT will be as you transact more in the system. This has already been widely researched in the congestion control sphere. We can look into something like BBR: Congestion-Based Congestion Control.
I think we should distinguish at least three types of peers:
Even in the case of Tribler the peers are not homogeneous; there are different roles (exit node, relay node, etc.). I think that fundamentally changes how you build the overlay.
There is a bunch of related work on accountability and reliable and scalable gossip/multicast.
One of the key components of that is the Verifiable Neighborhood, which assigns random members for interaction and verification. For example, in PeerReview witnesses and auditors are chosen by consistent hashing. In a DHT your neighbors are chosen by the proximity of their hash.
Such systems can tolerate churn and malicious users, but there is always an upper bound to it.
This bound can be trivially broken with a Sybil attack; furthermore, your local view will be poisoned, which breaks all security/risk assumptions.
We are now working on a Sybil Resilient Verifiable Neighborhood.
Everybody follows the diversity protocol, which together with random sampling (like Brahms) gives some protection from Eclipse attacks and network partitioning in the presence of Sybils in a static environment. This is enough to use gossip information dissemination and to discover peers that we want to interact with, but it is not enough to give a risk estimation.
To calculate the risk of getting an invalid token (double-spend), the risk of not getting paid (outdated balance estimation), and finally the time to detection (t-visibility), we need to know the current and last interaction partners. An interaction partner is a peer that you want to hold accountable: you want to validate their balance, calculate a function based on the peer's history, etc.
In other words, we want to answer three questions:
Answering all three questions in a practical system is a hard task, if possible at all. For us it is good enough to answer all of them probabilistically, e.g. instead of hard staleness we can go with eventually complete knowledge (k-staleness), but in that case it is very important to bound the dynamics of the partner set, i.e. the number of peers you can add/remove at a certain time.
Suppose we have a function:
Distinct({P_1 ... P_n}, k) --> {0, 1}
It checks whether the set of peers {P_1 .. P_n} contains at least k distinct identities, where k <= n, and outputs 0 if there are fewer than k distinct identities and 1 if there are at least k. This function can be implemented using several approaches:
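One trivial way to realise such a function, purely for illustration (the identity predicate is an application-level assumption, e.g. latency-vector similarity or a resource test, and is not specified by the protocol):

```python
def distinct(peers, k, is_same_identity):
    """Return 1 if `peers` contains at least k distinct identities, else 0.

    `is_same_identity(a, b)` is an application-supplied predicate; grouping
    peers into identity classes and counting the classes is all we need.
    """
    classes = []
    for p in peers:
        for c in classes:
            if is_same_identity(p, c[0]):
                c.append(p)
                break
        else:
            classes.append([p])
    return 1 if len(classes) >= k else 0

# Toy identity test: peers whose latency vectors are nearly equal count as one
peers = [(1, (10, 20)), (2, (10, 21)), (3, (80, 5)), (4, (40, 40))]
same = lambda a, b: sum(abs(x - y) for x, y in zip(a[1], b[1])) < 5
print(distinct(peers, 3, same))
```

Here peers 1 and 2 collapse into one identity class, leaving three classes in total, so `Distinct(peers, 3)` holds but `Distinct(peers, 4)` does not.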
To answer these questions we have two directions:
Direct agreement among partners
Peer A discovers enough peers and forms a set S of peers that it prefers as partners. A sends S to all of the partners in S and collects signatures. If a peer signs S, that means it confirms that it knows all of A's partners; peer A acts as the leader. This realises Distinct(S, 2).
It gives clear bounds and determinism, on which we can calculate further transaction risks. It is only used when there is an update to the partner set; for a stable partnership it will be used only once. This approach has established research behind it and would be easy to argue about. It is not clear how it would work in a dynamic setting; more testing and discussion is needed.
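The direct-agreement round can be sketched as follows (a toy sketch: the HMAC-free digest "signature" stands in for a real signature scheme, and all names are made up):

```python
import hashlib

def digest(partner_set):
    """Canonical digest of the proposed partner set S."""
    return hashlib.sha256(",".join(sorted(partner_set)).encode()).hexdigest()

class Peer:
    def __init__(self, name):
        self.name = name

    def sign(self, partner_set):
        # Signing S means: "I confirm that I know all of A's partners."
        # (A toy stand-in for a real public-key signature.)
        return (self.name, digest(partner_set))

def direct_agreement(partners):
    """Leader A sends S to every member of S and collects signatures;
    agreement holds once everyone has signed the same digest."""
    S = {p.name for p in partners}
    sigs = [p.sign(S) for p in partners]
    agreed = all(sig == (p.name, digest(S)) for p, sig in zip(partners, sigs))
    return S, sigs, agreed

partners = [Peer(n) for n in ("B", "C", "D")]
S, sigs, ok = direct_agreement(partners)
print(ok)
```

The collected signatures over the single digest are what later allows A to prove its partner set to third parties.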
Virtual agreement among partners
This is a more elegant approach that reuses Trustchain ideas, but it is not clear how quickly it will converge to a stable set and how to handle forks. Further, it requires synchronization of the chain and redundant interactions between peers.
I think we should distinguish at least three types of peers:
1. The peers we choose as communication mediators. We use them mainly for forwarding information. It is important that they stay connected (protection from partitioning, see the membership protocol) and we want to verify whether they are sybils.
2. Peers whose balance we want to verify, because we want to receive a payment from them in exchange for a service from us.
3. Peers that provide a certain service to us and that we will pay. They want to verify us.
Even in the case of Tribler the peers are not homogeneous; there are different roles (exit node, relay node, etc.). I think that fundamentally changes how you build the overlay.
I agree. What I am arguing is that there is no need to advertise specifically bandwidth. By partnering based on latency, we already have the emergent effect of partnering with peers that have sufficient bandwidth.
Regarding the partnering agreements:
This set and signatures from the peers is enough to prove to anyone that you have a certain peer set.
I would reword this, to say that you can prove to anyone that you have at least a certain peer set.
Furthermore, it seems like the resource class of the Distinct function is very expensive, and measuring most of the mentioned solutions would also interfere with the Distinct functions of others.
Please also elaborate on how having distinct identities provides sybil-proofness.
This set and signatures from the peers is enough to prove to anyone that you have a certain peer set.
I would reword this, to say that you can prove to anyone that you have at least a certain peer set.
We need a lock on the interaction partners: a commitment that the peer will interact only with them for a certain time/number of rounds. That will be used to calculate risks; no interaction is possible before providing this commitment. Every peer should eventually know about your commitment, so we can use gossip for that. Done properly, this also gives guarantees on detection.
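As an illustration only (the field names, hashing, and the shape of the record are assumptions, not the protocol), such a commitment could look like:

```python
import hashlib
import json

def make_commitment(peer_id, partner_ids, rounds):
    """A peer commits to interacting only with `partner_ids` for `rounds`
    rounds. The record is gossiped; interaction before publishing it is
    refused by counterparties."""
    body = {
        "peer": peer_id,
        "partners": sorted(partner_ids),
        "rounds": rounds,
    }
    # Digest over a canonical serialisation, so everyone hashes the same bytes
    body["digest"] = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    return body

def allowed(commitment, counterparty):
    """Risk-calculation side: only committed partners may interact."""
    return counterparty in commitment["partners"]

c = make_commitment("A", {"B", "C"}, rounds=10)
print(allowed(c, "B"), allowed(c, "D"))
```

A real version would be signed by the committing peer and carried by the same gossip layer as the rest of the accounting records.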
Furthermore, it seems like the resource class of the Distinct function is very expensive and measuring most of the mentioned solutions would also interfere with the Distinct functions of others. Please also elaborate on how having distinct identities provides sybil-proofness.
It is an attempt to generalize the Sybil attack in our context. In the case of consensus there is a direct way to calculate the impact: if you can break 51% with sybils, you are the king.
In our case one of the threats is Eclipse attacks by sybils. Suppose after sampling you get a set of peers; now you want to test whether this set is unique enough. You require an estimate that at least k of n are unique. For example, 2 of n is enough not to be eclipse attacked, but probably not enough to build a healthy overlay? For that you can test latency and diversity, or require resource testing (computational challenge, storage challenge, bandwidth exchange, etc.). We should argue why we choose latency and diversity instead of resource testing. Both approaches are only locally verifiable, by the way; making them globally verifiable is not trivial.
We need a lock on the interaction partners, a certain commitment that peer will interact only with them for certain time/rounds.
This is a crucial bit of information that was lacking in the original explanation. I can see that working, agreed.
test if this set is unique enough
I think we need a stricter definition of what a sybil identity is then. It is pointless to argue which measure we use for sybil identities, if we do not have this defined from the start. In the case of Bitcoin we can argue that its proof-of-work is very bad at avoiding sybils, as the rational actors are forming mining pools and have to artificially be kept under the magical 51% mark. The only thing protecting against one of these attacks in Bitcoin is the fact that the 51% attack would also devalue the coin itself. Back to the definition part: where do we draw the sybil line in Bitcoin? Is a Bitcoin sybil a single machine with a lot of CPU and multiple addresses? Is a Bitcoin sybil a factory hall filled with asics? Is a Bitcoin sybil the entire mining pool?
Both approaches are only locally verifiable btw, making it globally verifiable is not trivial.
I would challenge whether it is even possible to do global verification of sybils without a single trusted party in control of all identities.
I think we need a stricter definition of what a sybil identity is then. It is pointless to argue which measure we use for sybil identities, if we do not have this defined from the start. In the case of Bitcoin we can argue that its proof-of-work is very bad at avoiding sybils, as the rational actors are forming mining pools and have to artificially be kept under the magical 51% mark. The only thing protecting against one of these attacks in Bitcoin is the fact that the 51% attack would also devalue the coin itself. Back to the definition part: where do we draw the sybil line in Bitcoin? Is a Bitcoin sybil a single machine with a lot of CPU and multiple addresses? Is a Bitcoin sybil a factory hall filled with asics? Is a Bitcoin sybil the entire mining pool?
Totally agree with you that it is tricky with mining pools. What you essentially have is multiple users colluding to contribute to one address; should they be considered Sybils or just colluders?
But let's get back to our case. Our goal is to pick overlay neighbours that are good enough, and then pick interaction partners that are good enough. Goodness in the overlay is based on forwarding information and not eclipse attacking you. Goodness of an interaction partner is based on... what? Capacity? Balance? Trust? Reputation? Depending on what we choose, the Sybil attack will change too.
Goodness of an interaction partner is based on... what? Capacity? Balance? Trust? Reputation? Depending on what we choose, the Sybil attack will change too.
Perhaps we can generalize this as the chance of your interaction partner finding any particular transaction you were involved in. Balance, trust and reputation are all derived from an (almost) complete set of transactions.
For the next iteration we will focus on hardening the guarantees, giving quantitative bounds on finality.
Goals
The main goal is to build an accounting system which is simple yet powerful enough to support decentralised applications on top of it, such as financial markets, reputation systems, etc.
Non-functional requirements:
Means:
Science behind it:
Optimizations: TBA
Noodle == much more than double-spending and witness lists. That was new to me. 2 years of work have been invested in the channels (the main outcome of 2 years of refactoring). Switching to The Noodle Stack (with 5 layers it's more than a protocol) is quite disruptive and took me by surprise, as you obviously noticed (more discussion on 10 June).
Set reconciliation is a step towards the complexity of Dispersy with Bloom filters, something we wish to avoid. For a lot of use-cases (e.g. channels) you can assume there are sufficient "seeders" with 100% of the information and you can simply collect it. Everybody is required to "seed" their complete trustchain record history. A simple linear list. Can you support that easily?
Channels only need linear downloads of bulk magnet links plus filename(s). Therefore, for channels a single append-only linear list with logical deletes is sufficient. The current version requires a channel to be completely downloaded before records can be inserted. Partial downloads of subdirectories of interest are not supported. If we enhance the channel format to be downloaded in real time, and inserted and seeded partially, we have solved enough problems and can move on. For the roadmap and the Kotlin implementation we first need documentation of the current channel format and commit structure.
Questions to discuss over a few coffees: Do you envision The Noodle Stack supporting the bulk exchange of 1+ million magnet links with congestion control? How does this work relate to #3690, trustworthy gossip? How do you solve the problem that most witnesses will be offline 98% of the time (unfiltered)? Witness reliability filtering is probably a requirement, but also an attack vector?
Set reconciliation is a step towards the complexity of Dispersy with Bloom filters, something we wish to avoid. For a lot of use-cases (e.g. channels) you can assume there are sufficient "seeders" with 100% of the information and you can simply collect it. Everybody is required to "seed" their complete trustchain record history. A simple linear list. Can you support that easily?
I agree. I kept this part as simple as possible. If you have an order, reconciliation is much easier and Merkle trees and Bloom filters can be avoided. However, a linear order, especially a global one, is not always possible or even desirable; for these cases we use reconciliation on a DAG. With Noodle we enforce linear order on one writer-chain (personal chains, classical Trustchain). Community chains (multiple writers, single datapoint) optimistically link to converge towards a linear log, but nothing bad will happen if it remains a DAG.
Questions to discuss with a few coffees; Do you envision The Noodle Stack support the bulk exchange of 1+ million magnet links with congestion control?
That would be definitely interesting to try.
How does this work relate to #3690, trustworthy gossip?
Gossip is the heart and soul of Noodle. We just narrow down the gossip to only those who care (related to PubSub). In the current implementation I use push-pull gossip with reconciliation to guarantee fast convergence. Partly tested with a Wikipedia dataset; it can converge within 15-20 seconds under heavy load for 1000 peers.
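A stripped-down illustration of push-pull dissemination (no digests or reconciliation machinery, just the missing-entry exchange; the network size and record names are made up, not the Noodle implementation):

```python
import random

def push_pull_round(a, b):
    """One push-pull exchange: each side learns which entries the other
    is missing and sends exactly those."""
    missing_at_b = a.keys() - b.keys()
    missing_at_a = b.keys() - a.keys()
    b.update({k: a[k] for k in missing_at_b})
    a.update({k: b[k] for k in missing_at_a})

random.seed(0)
peers = [dict() for _ in range(20)]
peers[0] = {i: f"record-{i}" for i in range(5)}  # one peer starts with the data

rounds = 0
while any(len(p) < 5 for p in peers):
    a, b = random.sample(peers, 2)
    push_pull_round(a, b)
    rounds += 1
print("converged after", rounds, "pairwise exchanges")
```

Push-pull spreads epidemically in both directions per contact, which is what makes the fast convergence plausible even under load.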
How do you solve the problem that most witnesses will be offline 98% of the time (unfiltered)? Witness reliability filtering is probably a requirement, but also an attack vector?
Witnesses are not required for the protocol to work (guaranteed eventual detection). They are required to build critical applications where detection (+ grim trigger) might not be enough. Witnesses can be chosen as you like, but the best approach so far is to choose peers in the community and make sure they are diverse enough. The witness audit is enforced by the counter-party and their risk assessment, so in the end they will decide to accept/decline the transaction.
Witness availability is not a big issue, since the witness set can be expanded up to everyone in the community. However, every community needs to make sure that there is at least some number of available peers in it. But this is a research topic on its own, which I'll address later.
Noodle is moving to a separate repository for easier management. For now it is under my personal GitHub; I suggest moving it later to the Tribler organisation and including it as a reference in the main Tribler repo. Noodle builds on top of IPv8, with IPv8 as a requirement.
Noodle will be continuously tested and improved. The protocol will be documented and published as a PyPI package.
If successful, Noodle will join the pleiad of similar protocols: DAT, IPFS, Scuttlebutt.
Just for my understanding, I'm translating the above into more abstract terms and MVP requirements for an accounting system which can underpin our live token economy.
BitTorrent is a great tool, but it has narrow applicability: BitTorrent governance is based on tit-for-tat and is restricted to a very small world, one swarm.
Yes, using BitTorrent for bulk downloading of gossip information seems like a hack. It actually is a hack. :warning: rant warning... with trustchain we abuse BitTorrent as a multi-sourced FTP service. In the history of our lab (since 1999) many master students, developers, post-docs and PhD students believed that they could do better than BitTorrent or improve it. Everybody failed after trying for up to 4 years. Their designs were more elegant and superior; their implementations never matched. They always added more complexity than a one-person project can handle. Serious people stopped trying to redo the Linux kernel or the leading compiler. The IPFS and DAT people are finding out the hard way: BitTorrent is hard to improve upon. It can even support Tor-like privacy, if you beat it up for 5+ years. Engineers are trained to hate such hacks. Lesson: after our costly https://libswift.org experience, do not try to be superior to Arvid's brilliant code. I tried and it has cost me 650.000 Euro of EU money.
Please read about this gossip work done by our master student, who was hired by Mozilla afterwards: "Splash: Data synchronization in unmanaged, untrusted peer-to-peer networks". Do you want to make it as simple as such "sync points"? What do you think of their idea of using random gossip for "live mode" and BitTorrent for "history mode"?
Well, I agree that it is really hard to match BitTorrent as a multi-sourced FTP service, but BitTorrent is also much more than that, which raises the question why not use something different.
But as it was designed for bulk downloads, the only way to decide when to use gossip vs BitTorrent is heavy experimenting. My hypothesis is that gossip sync should be sufficient for most of the cases. As clients generally don't want, or even don't need, to download big bulk data, we might consider something more lightweight. I hope the experiments will reveal that.
On top of the trust layer there is perhaps the most interesting part of the application layer: the allocation of resources, preferential treatment, punishment of freeriders, community exclusion of cheaters. Combined they could be a new layer in your model: management of the commons.
Wikipedia, the Linux kernel, Bitcoin, BitTorrent, and residential WiFi sharing somewhat manage the commons, with limitations or with a remaining central server. Scientific target: relentless automation and the zero-server principle. Organisational principles:
Relaying requires incentives, and seeding requires storage and bandwidth (open issue for 7 years: donate 1 TByte to the community, #21). Open issues: lossy accounting, overhead of accounting, risk exposure, double spending risk, settlement time, economics of anonymity.
Proposed publication plan: always have 1 paper under review. Re-use OSDI/Usenix thinking. Design paper with early results of your accounting framework. Resource contention, 200k new trustchain records daily, exit node rules, preferential rules, scheduling algorithm. Paper by 21 September?.. :clock10:
2022 insight: Do not trust strangers, only help them a bit to get going. SybilGuard and all these algorithms fail. Maintain stable collaborators in order to protect from unlimited identity creation. It is a critical failure to assume humans can distinguish fake identities from real humans. See the 300 fake newspapers example.
Privacy with gossip: Spy vs. Spy: Rumor Source Obfuscation. Real attackers do a Sybil attack (but still interesting related work).
@grimadas Novel idea: take Trustchain entanglement to the next level: link to any party and fixate their last block. Aug 1st goal of running code on DAS. Then solve Tribler problem of freeriding. You can be in 100 or 1000-ish counter-party groups which synchronise state. Towards the goal of knowing the contribution level to the commons of 1 million peers. Mission is to prevent the worst abuse and fraud (not exact accounting). Please document your design in 1 page by 14 July. Plus: https://dicg2020.github.io/
Within this article we present a scientific roadmap for addressing monopoly formation on the Internet, and in our economy at large, by creating an infrastructure for the common good. We present results from the ecosystem we have built over 15 years, inhabited by 1.7 million volunteers. Our work encompasses four isolated scientific fields. With our infrastructure-for-the-common-good lens we unify the fields of applied social science, distributed algorithmic mechanism design, adversarial information retrieval, and traditional distributed systems. Our architecture contains five elements:
We claim our generic architecture is usable across business domains and a diverse set of applications. We focus the remainder of this work on the core ledger part. Our novel addition to the existing body of work within ledger science is based on the strength of weak ties. We are the first to successfully apply the classical work by Granovetter from 1973 to dramatically improve the effectiveness of ledgers. We add only a single rule to a DAG-based ledger: each economic actor randomly fixates the last observed state of another economic actor on its own ledger. The state of others is compressed into a single hash, resulting in a mere 32 bytes of storage cost. A special property emerges from adding random pointers in a graph with unknown properties: entanglement. Economic actors collectively ensnare each other.
We provide proof for the unreasonable effectiveness of weak ties for ledger science.
Would it be a stretch too far to talk about flexible networks without rigid boundaries, instead of ‘weak ties’? Could be a unifying argument about the power of permissionless systems.
A paper under work now, working title: "Universal Accounting of Goodness".
The main reason for accounting and unified rules is to address uncertainty and information asymmetry. Information reliability and relevance are important for business decision making.
We translate these properties to distributed systems, namely consistency and validity of information. We recognise the fundamental difference between these two and address the most important adversities: information hiding and information tampering.
But we also recognize that achieving 100% reliability is not always possible nor desirable, as information retrieval and storage come at a resource cost. We model an accounting platform that is flexible and universal enough to support accounting for multiple domains. It allows users to choose different policies based on their risk model and adjust their behaviour accordingly.
@grimadas Can you take 15 min today and post a progress update of the past summer months + all code.
A paper under work now, working title: "Universal Accounting of Goodness". Why accounting? Insights from accounting theory: the main reason for accounting and unified rules is to address uncertainty and information asymmetry.
The leading organisational principle of developed nations is the market mechanism with a certain level of government regulation. Open access (commons) is an alternative to private control; however, it is academically underdeveloped in terms of theoretical support. The tragedy of the commons is the key obstacle to sustainable commons. We claim that private control by Big Tech companies is difficult to avoid without a solution to the digital commons problem.
We introduce the first viable solution to the digital commons problem, based on accounting of social capital. To govern the commons collectively we use accounting and unified rules to address uncertainty and information asymmetry. We provide empirical data on our global database of social capital.
:drum: Intro Bami! (please push the tentative code)
1. Maintenance of cooperation:
The main application focus of BAMI is contribution accounting to sustain cooperation. The requirements were given above, the main one being the identification of bad peers (Byzantine cheaters and freeloaders).
I see several different ways to approach this application. Here are my thoughts:
Contribution: a pairwise action from one peer to another, recorded as a transaction with two actions: a contribution T1: peer A -> peer B, and a confirmation (or explicit rejection) T2: peer B confirms (rejects). Every contribution requires two signatures (pairwise accounting); this approach addresses misreporting, i.e. peer slander: peer A cannot falsely claim that it contributed to peer B without B's consent.
Contribution event - double-signed entry.
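A double-signed entry can be sketched like this (a toy sketch: HMACs with shared secrets stand in for real public-key signatures, and all names and amounts are illustrative):

```python
import hashlib
import hmac

def sign(key, message):
    # Toy signature via HMAC; a real system would use public-key signatures
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def contribution_event(a, b, key_a, key_b, amount):
    """Pairwise accounting: a contribution only counts once both sides sign.
    A cannot record a contribution to B without B's counter-signature."""
    tx = f"{a}->{b}:{amount}".encode()
    t1 = sign(key_a, tx)                # T1: A records the contribution
    t2 = sign(key_b, tx + t1.encode())  # T2: B confirms (or refuses to sign)
    return {"tx": tx.decode(), "sig_a": t1, "sig_b": t2}

event = contribution_event("A", "B", b"secret-a", b"secret-b", "100MB")
print(sorted(event))
```

Chaining B's confirmation over A's signature binds T2 to the exact T1 it confirms, which is what rules out slander on either side.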
We don't verify the truthfulness of the event.
As it is connected with verification and having global knowledge: more no than yes. Only in specific applications under certain assumptions (honest altruistic majority, global consensus). The best we can do to stay scalable with millions of peers is to optimistically detect/estimate the probability of freeloading.
We don't prevent freeloaders with 100% accuracy.
They can help in two ways: ranking or filtering. Depending on the application requirements, one might be preferred over the other.
This allows us to achieve magical properties, but only if we view it from the perspective of an honest node:
The best candidates for these algorithms for now are variations of personalised hitting time.
We use trust algorithms to limit the effects of attacks
No. This is highly personalised, as the only trust anchor a peer has is its 1-hop interactions. A peer can either verify all 1-hop peers directly, or it trusts them. (This is our assumption!)
This leads to an interesting result: some peers might consider an honest peer a freeloader, and vice versa. However, we show that with enough interactions the work graph and trust algorithms lead to an evolutionarily stable state - freeloaders starving.
Trust function is personalised. Global ranking or filtering emerges with time.
Peers can still hide information and cheat by presenting only partial information to bias the trust algorithms. This attack is related to forking and network partitioning.
To protect from such attacks, peers need to exchange data on the target peer. This can be achieved via gossip, periodic checkpointing and witnesses.
To simplify this and decrease time to detection, we connect peers by topic (interest in the target peer). This turns blind gossip into targeted gossip (similar to pub-sub).
We use targeted gossip to minimise time to detection
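A toy sketch of such topic-targeted gossip, where the "topic" is the peer being discussed (class and method names are illustrative, not from the source):

```python
from collections import defaultdict

class TargetedGossip:
    """Gossip keyed by topic, where a topic is the peer being discussed.

    Instead of blindly flooding everyone, records about a target peer go
    only to peers that registered interest in that peer (pub-sub style),
    shortening the time to detect hidden forks.
    """

    def __init__(self):
        self.subscribers = defaultdict(set)  # target peer -> interested peers
        self.inbox = defaultdict(list)       # peer -> received records

    def subscribe(self, peer, target):
        self.subscribers[target].add(peer)

    def publish(self, record, target):
        # Deliver only to the peers interested in `target`.
        for peer in self.subscribers[target]:
            self.inbox[peer].append(record)
```

Compared with blind gossip, every message here reaches exactly the peers who can cross-check the target's chain.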
As trust is personal, honest peers might temporarily see each other as suspicious. However, I believe that we should not block them, but instead give a limited service depending on the severity: for example, lower priority or decreased speed. Thus, any freeloader would eventually starve.
Punishment for freeloaders - service quality degradation.
Cheating is more severe - it is a deviation from the protocol with an objective proof of cheating (justified). Either further degradation of the service or a temporary ban might be considered.
Cheaters get more severe punishment but only after justification.
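The graded punishments above could be sketched as a simple policy function. All thresholds and names here are illustrative, not from the source:

```python
def service_quality(trust_score: float, justified_cheat: bool = False) -> float:
    """Return a service multiplier (1.0 = full speed) for a counterparty.

    Suspected freeloaders are never hard-blocked: they get degraded
    service and recover as their ranking improves. Only justified
    (objectively proven) cheating gets the harshest level, until the
    peer fixes its forks.
    """
    if justified_cheat:
        return 0.05   # near-ban, but a way back remains
    if trust_score >= 0.5:
        return 1.0    # full priority
    if trust_score >= 0.1:
        return 0.5    # reduced priority / speed
    return 0.2        # suspected freeloader: slow lane, not a block
```

Because every level is non-zero, returning to honest behaviour is always more attractive than whitewashing, which is the incentive argued for below.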
There should always be a way back that is preferable to whitewashing. For example, cheaters might fix all their forks and continue, and freeloaders will get the same quality once their ranking has increased.
Punishments are incentives for change.
t1: A -> B: 100 MB, B -> A: 0 MB
Later the counterparties decided to change their contribution relationship (total contributed): t2: A -> B: 50 MB, B -> A: 0 MB
The total went down - do we consider this valid behaviour or not? Should it be punished? Another open question is gossip granularity: peer-centric (gossip about the peer's chain for every service/app he is in), app-centric (gossip about the app-chain: a shared chain of the peers that run a specific service/app), or hybrid (the chain of the peer in a specific app). Comments and more questions are more than welcome!
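One hedged way to handle the decreasing-total question above: since pairwise totals are cumulative, a drop can at least be flagged for the higher layers to judge, rather than auto-punished. An illustrative sketch (names are my own):

```python
def total_anomalies(history):
    """Flag pairwise cumulative totals that shrink over time.

    history is a list of (timestamp, {(src, dst): total_mb}) snapshots.
    Cumulative totals should never decrease, so a drop (such as A -> B
    going from 100 MB to 50 MB) is reported for higher layers to judge
    instead of being punished automatically.
    """
    anomalies, prev = [], {}
    for ts, totals in history:
        for pair, mb in totals.items():
            if mb < prev.get(pair, 0):
                anomalies.append((ts, pair, prev[pair], mb))
            prev[pair] = mb
    return anomalies
```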
We don't verify the event truthfulness.
Excellent thoughts! Please expand this into a 6-page IEEE paper before 21 September.
Storyline: "Facilitating the emergence of truthful accounting". Our remarkably simple accounting mechanism does not record any details of interactions within a community, yet effectively suppresses freeriding, cheating and fraud. Our only mechanism is to irrefutably record the existence of an interaction between two community members. Paradoxically, the mere recording of this interaction, together with higher-level mechanisms, leads to the emergence of global cooperation and the starvation of freeriders. We successfully devised rules with the complexity of tit-for-tat with...
With one addition we can make our mechanism resilient against fake identities within a complete collaborative framework. New identities start with zero community standing and only receive service if they have made a net positive community contribution. Further details are outside the scope of this work:-)
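The "net positive before service" gate for fresh identities could look like this one-liner (names are illustrative):

```python
def eligible_for_service(given_mb: float, taken_mb: float) -> bool:
    # A fresh identity starts at zero standing; it is served only once
    # its net community contribution is positive, so minting new (Sybil)
    # identities yields no free service.
    return given_mb - taken_mb > 0
```

A new identity with zero history is indistinguishable from a Sybil, and under this rule neither gets anything for free.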
The Truth Convergence Paradox: mechanisms that establish "single event truthfulness" do not converge to fraud-free communities as well as mechanisms which ignore truthfulness at this level and operate only on partial information, random gossip, and optimistic probabilistic detection.
Related work: trying to enforce single-event truthfulness at the cost of efficiency (it destroys the 300 Mbps+ download capability of Libtorrent): "Treat-before-trick: Free-riding prevention for BitTorrent-like peer-to-peer networks".
Bami Blocks have been born! Nice work :1st_place_medal:
Management concern: when the PhD work is finished the outcome is either 1) a self-maintaining GitHub community or 2) dead code in Tribler. Please only make a modest change to Trustchain; keep improving and expanding it as much as you like, but keep the first version of Bami integrated in Tribler clean of any non-Tribler features. Proposal: by 15 November there is a stable Tribler build on our Forum with Bami 1.0 ready. Time-boxed approach. List of potential issues to address (brainstorm):
Candidates for later improvement cycles brainstorm, after Bami 1.0 release:
After hard thinking and planning, I decided to follow the most straightforward yet powerful approach there is:
In total a peer will store only K blocks, where K is the number of counterparty peers. Even if you crawl everybody, you only store |E| blocks, where |E| is the number of interacting pairs of users (the number of edges in an undirected graph).
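A sketch of why storage stays bounded: keeping only the latest block per counterparty keys the store on the unordered pair, so a full crawl never exceeds |E| entries (class name is illustrative, not from the source):

```python
class PairwiseStore:
    """Keep only the latest double-signed block per counterparty.

    A peer holds one entry per counterparty, i.e. K blocks; crawling the
    whole network can never exceed |E| blocks, one per interacting pair
    (edge of the undirected interaction graph).
    """

    def __init__(self):
        self.latest = {}  # frozenset({a, b}) -> newest block for that pair

    def add(self, a, b, block):
        # (a, b) and (b, a) map to the same unordered pair.
        self.latest[frozenset((a, b))] = block

    def size(self):
        return len(self.latest)  # bounded by |E|
```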
The latest experiments and research showed a surprising result: a reputation mechanism, if modelled properly, provides a substantial level of robustness, even in the presence of Sybils and fake data. Hence we delegate all security and robustness to reputation, leaving the accounting very simple.
This reflects the vision and is sufficient to build a sharing economy with minimal complexity.
wow! Thank you for your Tribler-centric/lab-centric thinking. So with barebones accounting, latency diversity, peer ranking, competitive upload spots we get spam flooding protection and Sybil-resilience? We need more?
Yes, according to the latest research insights, it provides sufficiently solid security and allows us to build the first version of a micro-economy based on sharing 💪.
latency diversity
@grimadas Do you need this now? It's currently not part of IPv8's wild codebase (as the related research is not accepted anywhere yet). I can hook it in though.
wow! Thank you for your Tribler-centric/lab-centric thinking. So with barebones accounting, latency diversity, peer ranking, competitive upload spots we get spam flooding protection and Sybil-resilience? We need more?
I don't think we need latency diversity for the first (barebones) version of the bandwidth ecosystem. I advise considering this a task for the next iterations.
I will also update our ongoing write-up with these key insights.
Alright, I'll leave that out then. It would probably also only serve to hinder you anyway: the DAS-5 running 1000 instances per node would be detected as a Sybil attack 😄
The barebones accounting is now deployed successfully, dubbed LedgerZero. Please remain focused on an upload incentive. First make it work. Try to suppress the reflex of generalisation into a generic incentive mechanism. Nobody has ever created an indirect reciprocity mechanism that actually worked and scaled. The tragedy of the commons for even one resource is unsolved. Leave moderation processes and metadata for the next generation of PhD students in the lab. It is guaranteed to delay PhD graduation if you try to improve upon this 175-page work. Actually, it's an even broader problem, I believe. Moderation is hard. We currently do not have the resources to facilitate spam-free and harassment-free communication amongst our users. Free text flow (many-to-many, two-way) is too difficult to "Internet police"; our scientific focus is first self-governance of 'filename moderation' (suppression of spam, fakes, worms, broken files, poor-quality files, duplicates, misspellings, and bundles that are not bundled).
As discussed. Was the focus on freeriding for 16 years smart? In a quality community, do we then induce unconditional giving? Permaseeding is the term I would like to introduce for the desired behaviour: permaseeding enhances social welfare, as opposed to hit-and-run.
Platforms engagement and sharing: https://www.emerald.com/insight/content/doi/10.1108/JSTP-04-2016-0071/full/pdf?title=engagement-platforms-in-the-sharing-economy-conceptual-foundations-and-research-directions
Freeriding is still very much unsolved, and Web3 makes it very relevant. The accounting protocol is making progress. Perhaps we can get two chapters out of it: one on the algorithm only (an N-person repeated prisoner's dilemma, solved with gossip); the other on deployment and measurement with LedgerZero.
Sybil attacks are now investigated by six isolated fields. They are not aware of each other, lack an understanding of how fundamental their problem is, and are not fully aware of all prior and related work.
Two isolated fields are probably much closer aligned than anybody realises. Unbelievable (really) similarities exist between theories of trust and theories of mind. With decades of scientific effort we might unify the evolutionary connectionism architectures with our continued research into incentives and credit allocation (e.g. MeritRank) within collaborating humans (our ongoing research for the past 23 years :astonished:). See Evolutionary Connectionism: Algorithmic Principles Underlying the Evolution of Biological Organisation in Evo-Devo, Evo-Eco and Evolutionary Transitions from 2016.
Prof. Levin's great writing from evolutionary biology: The collective intelligence of evolution and development, published in the collective intelligence open-access journal he co-founded. Great quotes: "Evolution is more intelligent than we realised". The Tribler Lab experimentally determined that "randomness and evolution are more intelligent than we realised". It's counter-intuitive for scientific engineers that you can't outsmart massively scaled randomness.
The relevant section: Section Credit assignment in individuals and collectives
Contains this quote: How does scaling of reward dynamics bind subunits into intelligent collectives that better navigate novel problem spaces? Lessons from machine learning. It is no accident that the issue of credit assignment, and the application of credit to parts or wholes, is a central one in evolutionary selection, developmental and organismic biology and cognitive science. It is a feature of many difficult learning tasks that they require sequences of actions that are many steps away from ultimate goals – making it intrinsically difficult to incentivise the component parts involved.
Reward functions appear in biological systems, in our collaborative systems based on MeritRank which address freeriding, and also in a state-of-the-art robot racing pilot. See the Nature journal quote with the reward terms for reinforcement learning with collision avoidance:
_update_: another [fascinating Nature paper studies credits](https://doi.org/10.1038/s41593-023-01514-1). These are micro-level credits, used to update weights (see [overview paper](https://arxiv.org/pdf/2403.18929)). The [GitHub code](https://github.com/YuhangSong/Prospective-Configuration) for the backprop alternative. ```For both humans and machines, the essence of learning is to pinpoint which components in its information processing pipeline are responsible for an error in its output, a challenge that is known as 'credit assignment'. It has long been assumed that credit assignment is best solved by backpropagation, which is also the foundation of modern machine learning.``` The next step is possible if we ever manage to formulate the above dream theory. Collectively we could develop superhuman capabilities. We "only" need to overcome fraud, greed, and tyranny. **Make our species more intelligent, productive, and resilient by advancing the theory and practice of trust, collaboration, and democracy.**

Scientific Foundation for Offline Token:
Working title: Project Noodles
A draft of the layers of the accountability tools for next-generation applications. This is a semi-layered architecture draft, for better understanding of the concept: