Tribler / tribler

Privacy enhanced BitTorrent client with P2P content discovery
https://www.tribler.org
GNU General Public License v3.0

phd placeholder: Accountability Protocol for Future Decentralized Applications #4719

Open grimadas opened 5 years ago

grimadas commented 5 years ago

Working title: Project Noodles

A draft of the layers of the accountability tools for next-generation applications. This is a semi-layered architecture draft for better understanding of the concept:

[diagram: NoodleProtocol]

grimadas commented 5 years ago

You can also see a double-spend experiment with network knowledge for Trustchain and Tribler in issue #4634. It works really well in practice. Now we need to generalize it and prove that it works for other networks.

grimadas commented 5 years ago

Draft of the architectural diagram. Description: TBA

[diagram: NoodlesComponents]

devos50 commented 5 years ago

I get the overall idea (which is nice!). I have some comments to improve the presentation and clarity of your diagram:

qstokkink commented 5 years ago

First out of 20 blocks done:

network_plot.pdf

distribution.pdf

cuileri commented 5 years ago

@qstokkink's work is basically the "Peer Discovery" module in @grimadas's architectural diagram above. With this module, we want to be sure that peers get to know as many other peers as possible. We still need to run tests with larger graphs.

The next sprint will be on the next component of the network layer, which we call the "fraud-resilient overlay." Basically, this component will enable nodes to build their 1-hop neighborhoods in the overlay network. The idea is to choose neighbors that are as diverse as possible with respect to the latency measurements, with the aim of decreasing the probability of a Sybil appearing in one's neighborhood (see details here).

The graph below is a simple representation of how much we would decrease this probability. I used this latency dataset with 490 nodes and assumed that one of the peers creates 50 Sybils around itself. On the x-axis are the nodes. On the y-axis is the probability that the respective node has a Sybil as its first neighbor. The blue line indicates random selection of nodes; the orange line shows our strategy.


fig.pdf
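For illustration, a minimal sketch of latency-diverse neighbor selection (function and variable names are hypothetical, not the deployed code); the assumption is that Sybils behind one host show near-identical RTTs to us, so spreading our choices over the latency range makes them less likely to be picked:

```python
# Illustrative sketch, not the deployed IPv8 code: greedily pick neighbours
# whose latencies are spread out, assuming Sybils behind one host all show
# near-identical RTTs to us.
def pick_diverse_neighbors(latencies, k):
    """latencies: dict peer_id -> measured RTT (seconds); returns up to k peer ids."""
    chosen = []
    candidates = dict(latencies)
    while candidates and len(chosen) < k:
        if not chosen:
            best = min(candidates, key=candidates.get)   # start with the lowest RTT
        else:
            # Farthest-point heuristic: maximise distance to already chosen RTTs.
            best = max(
                candidates,
                key=lambda p: min(abs(candidates[p] - latencies[c]) for c in chosen),
            )
        chosen.append(best)
        del candidates[best]
    return chosen

print(pick_diverse_neighbors({"a": 0.020, "b": 0.021, "c": 0.180, "d": 0.090}, k=3))
# picks "a", "c", "d" and skips "b", whose RTT is suspiciously close to "a"
```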

Next sprint: Building overlay network

qstokkink commented 5 years ago

First results of the overlay network, without the diversity function F, for 64 nodes. The ping time is artificially delayed by the absolute distance in node IDs. After ±70 seconds you know most of the network. After ±150 seconds all 64 peers have at least 14 diverse peers and have matched with at least 5 peers.

Initial findings:

The following graphs show the peers (number of open connections), ancestors (number of peers included in diverse introduction chains) and partners (number of matching peers using the greedy b-matching).

qstokkink commented 5 years ago

After scaling up to 300 nodes, we get more partners using the b-matching protocol. Further findings:


qstokkink commented 5 years ago

This is with the same settings as the last post, now with the F function implemented (we use kernel density estimation to approximate a given distribution of ping delays, 1/x in this example).
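A rough sketch of what such an F function could look like (illustrative only, assuming SciPy's `gaussian_kde`; this is not the actual implementation):

```python
# Rough sketch of one possible F function (not the actual implementation):
# estimate the density of the ping delays we already keep with a Gaussian
# KDE and prefer candidates that land where we are still below a 1/x target.
import numpy as np
from scipy.stats import gaussian_kde

def diversity_score(known_delays, candidate_delay):
    """Higher score means the candidate delay fills an under-represented region."""
    kde = gaussian_kde(known_delays)                    # current delay distribution
    current_density = kde(candidate_delay)[0]
    target_density = 1.0 / max(candidate_delay, 1e-3)   # unnormalised 1/x target
    return target_density - current_density

delays = np.random.exponential(scale=0.1, size=50)      # example measured RTTs (s)
print(diversity_score(delays, candidate_delay=0.25))
```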


qstokkink commented 5 years ago

By periodically removing peers that have not been included in our latency edges, we get a more stable set of peers to select for matching (and a higher number for that matter).

Without cleanup:

With cleanup:

As our matching algorithm is stable, peers do not get dropped when a more favorable peer becomes available. Especially in smaller networks, this leads to some peers having very few partners:

qstokkink commented 5 years ago

By properly pruning the matching partners we get better results from the stable b-matching algorithm. Most of the nasty side effects (in terms of messaging) of periodic cleanup seem to be patched up by having a larger number of maximum matches.

Up to 30 matching partners:

Up to 60 matching partners:

These results are for a network of 300 peers, where every peer has 180 open connections (of which roughly 120 are considered to have a unique latency).
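For illustration, a minimal greedy b-matching sketch (not the deployed protocol; the edge scores would come from the latency-diversity function):

```python
# Illustrative greedy b-matching (not the deployed protocol): walk the
# candidate edges from best to worst score and accept an edge only while
# both endpoints still have capacity, i.e. fewer than b partners.
def greedy_b_matching(edges, b):
    """edges: iterable of (score, peer_a, peer_b); returns the accepted pairs."""
    degree = {}
    matched = set()
    for score, a, c in sorted(edges, reverse=True):
        if degree.get(a, 0) < b and degree.get(c, 0) < b:
            matched.add((a, c))
            degree[a] = degree.get(a, 0) + 1
            degree[c] = degree.get(c, 0) + 1
    return matched

edges = [(0.9, "p1", "p2"), (0.7, "p2", "p3"), (0.4, "p1", "p3")]
print(greedy_b_matching(edges, b=1))  # only ("p1", "p2") fits with b=1
```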

grimadas commented 5 years ago

Noodle Protocol and Related Work:

We are building an optimistic P2P system for enabling decentralised applications, one that values scalability and decentralisation the most. Being optimistic, we need to handle occasional fraud in the system. To handle fraud we need a reliable, or at least predictable, overlay. If a malicious node cannot hide certain transactions, we can detect fraud.
We want to build an overlay that works in a highly dynamic environment; for that we need the following main modules:

Nice insights from the databases field:

Membership protocol

Many systems assume full membership and/or full network knowledge. This is not realistic in a dynamic network. We opt for a partial view of the network. Such a partial view must not be biased and must work under churn.

One of the most interesting approaches is based on gossip:

Of course, these protocols will not work if you have unlimited Sybils that can poison your local view and isolate you from the network. A combination of a Sybil mitigation strategy and Brahms is definitely a good idea.
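For illustration only, a toy view-update step inspired by Brahms's push/pull/history mix (a simplification under assumed parameters, not the actual Brahms pseudocode):

```python
# Toy sketch of a Brahms-style view update (a simplification, not the actual
# Brahms pseudocode): mix pushed ids, pulled ids, and long-term random
# samples, so no single source (e.g. a Sybil push flood) can own the view.
import random

def update_view(view_size, pushed, pulled, sampled):
    third = view_size // 3
    new_view = (
        random.sample(pushed, min(third, len(pushed)))
        + random.sample(pulled, min(third, len(pulled)))
        + random.sample(sampled, min(view_size - 2 * third, len(sampled)))
    )
    random.shuffle(new_view)
    return new_view

print(update_view(9, pushed=list("abcdef"), pulled=list("ghij"), sampled=list("klmn")))
```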

Partnership selection

A membership protocol is typically used to get connected and to make it hard to isolate a node or partition the network. But these protocols never address how to select partners for long-term communication that are optimal for you.

We can also select peers based on capacity (declared bandwidth, or declared collateral balance for being a money router). We might then optimise the delivery of messages to a specific peer and build the system around that peer. Let's look at the concept of being interested in a peer as a topic in a PubSub system. Periodically, a peer declares an interest in peers as a vector of topics (or a vector of trust). The overlay network organises itself into a PubSub system that gives guarantees on convergence and risk. For example, if you see that a peer has too many subscribers with low capacity, you might prefer some other peer.

For inspiration, you might look into:

Some of the gossip protocols require a verifiable neighbourhood, which is either built based on hashes or is always limited. One of the main downsides of gossip is freeriding, which in our case is Byzantine behaviour (a peer not forwarding information in order to hide certain transactions). More on that in the accountability section.

Accountability of partners

If information forwarding is crucial, we want to track whether everybody follows the protocol and incentivise good behaviour.

One of the earliest practical proposals is

Fraud resolution

- Making P2P Accountable without Losing Privacy

We say that the resolution policy is application specific and we aim for deterministic resolution in P2P. There are three ways to resolve disputes/frauds. For simplicity, we look at resolving a conflict between two transactions (a sketch follows the list below):

  1. Include all. A peer made two conflicting transactions, like spending the same tokens and trying to hide it, but some applications allow going into overdraft (credit systems, Tribler), so the trivial solution of including all blocks when calculating the final balance works fine.
  2. Both are invalid. You committed fraud; everybody will eventually cancel both transactions. "Eventually" is guaranteed through gossip and robust membership. But some peers might repeat the fraud, so some punishment must be considered too.
  3. Choose based on weight. This is the hardest one. Each transaction is assigned a weight, be it work based, stake based or any other rule. We say that each peer will converge to the same block eventually (related: gossip-based agreement).
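A hedged sketch of these three options as code (policy names and the `weight()` callback are illustrative; the real policy is application specific):

```python
# Hedged sketch of the three resolution options for two conflicting blocks;
# the policy names and the weight() callback are illustrative, since the
# real policy is application specific.
def resolve_conflict(block_a, block_b, policy, weight=None):
    if policy == "include_all":        # e.g. credit systems that allow overdraft
        return [block_a, block_b]
    if policy == "invalidate_both":    # cancel both, punish the forker separately
        return []
    if policy == "choose_by_weight":   # deterministic pick, identical on every peer
        return [max(block_a, block_b, key=weight)]
    raise ValueError(f"unknown policy: {policy}")

print(resolve_conflict({"amount": 5}, {"amount": 9}, "choose_by_weight",
                       weight=lambda b: b["amount"]))
```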

Sybil attacks and latency

We don't need to solve distinct identities and "one person, one identity"; group distinctness is enough:

The problem with these approaches is the beacons, which are a central point of the system. You can make them fault tolerant, but how to choose them remains an open question.


Another line of work that can help detect Sybils is based on virtual coordinates: you try to map Internet delays into a Euclidean space, with some error.

The most famous one is:

Another interesting paper is based on rings, which are similar to our buckets: Meridian: a lightweight network location service without virtual coordinates, 2005.

In short, this is an area with established related work; we should be careful not to miss it.


qstokkink commented 5 years ago

We can also select peers based on capacity (declared bandwidth, or declared collateral balance for being a money router).

There should be no need to depend on declared bandwidth, as we partner based on latency. The less bandwidth you have available, the larger the RTT will be as you transact more in the system. This has already been widely researched in the congestion control sphere. We can look into something like BBR: Congestion-Based Congestion Control.

grimadas commented 5 years ago

We can also select peers based on capacity (declared bandwidth, or declared collateral balance for being a money router).

There should be no need to depend on declared bandwidth, as we partner based on latency. The less bandwidth you have available, the larger the RTT will be as you transact more in the system. This has already been widely researched in the congestion control sphere. We can look into something like BBR: Congestion-Based Congestion Control.

I think we should distinguish at least three types of peers:

  1. The peers we choose as communication mediators. You will use them mainly for forwarding information. It is important that they are well connected (protection from partitioning, see the membership protocol) and we want to verify whether they are Sybils or not.
  2. Peers whose balance we want to verify, because we want to receive a payment from them in exchange for a service from us.
  3. Peers that provide a certain service to us and whom we will pay. They will want to verify us.

Even in the case of Tribler the peers are not homogeneous; there are different roles (exit node, relay node, etc.). I think that fundamentally changes how you build the overlay.

grimadas commented 5 years ago

There is a bunch of related work on accountability and on reliable and scalable gossip/multicast. One of the key components is the Verifiable Neighborhood, which assigns random members for interaction and verification. For example, in PeerReview witnesses and auditors are chosen based on consistent hashing. In a DHT your neighbors are chosen by the proximity of their hash.
Such systems can tolerate churn and malicious users, but there is always an upper bound to it. This bound can be trivially broken with a Sybil attack, and your local view will then be poisoned, which breaks all security/risk assumptions.

We are now working on a Sybil Resilient Verifiable Neighborhood.

Everybody follows the diversity protocol, which together with random sampling (like Brahms) gives some protection from Eclipse attacks and network partitioning in the presence of Sybils in a static environment. This is enough to use gossip-based information dissemination and to discover peers we want to interact with, but it is not enough to give a risk estimation.

To calculate the risk of getting an invalid token (double-spent), the risk of not getting paid (outdated balance estimation), and finally the time to detection (t-visibility), we need to know the current and last interaction partners. An interaction partner is a peer that you want to hold accountable: you want to validate their balance, calculate a function based on the peer's history, etc.

In other words, we want to answer three questions:

  1. Do all partners know each other? (Agreement among partners)
  2. Do I know all your current partners? (Staleness)
  3. Are your partners good enough? (Sybilness)

Answering all three questions in a practical system is a hard task, if it is possible at all. For us it is good enough to answer all of them probabilistically; e.g. instead of hard staleness we can go with eventually complete knowledge (k-staleness), but in that case it is very important to bound the dynamics of the partner set, i.e. the number of peers you can add/remove at a certain time.

Suppose we have a function Distinct({P_1, ..., P_n}, k) --> {0, 1}. It checks whether the set of peers {P_1, ..., P_n} contains at least k distinct identities, where k <= n, and outputs 0 if there are fewer than k and 1 if there are at least k. This function can be implemented using several approaches:
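One possible, purely illustrative instantiation of this interface (the latency-gap test and threshold below are assumptions, just one of the possible approaches):

```python
# Interface sketch only, matching the Distinct({P_1..P_n}, k) signature above.
# The distinctness test used here (pairwise latency gap above a threshold) is
# just one possible instantiation, not a fixed design choice.
def distinct(peer_latencies, k, min_gap=0.005):
    """Return 1 if at least k peers look pairwise distinct, else 0."""
    accepted = []
    for delay in sorted(peer_latencies.values()):
        if all(abs(delay - other) >= min_gap for other in accepted):
            accepted.append(delay)
    return 1 if len(accepted) >= k else 0

print(distinct({"p1": 0.020, "p2": 0.021, "p3": 0.120}, k=2))  # -> 1
```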

To answer these questions we have two directions:

Direct agreement among partners

It gives clear bounds and determinism on which we can calculate further transaction risks. It is only used when there is an update to the partner set; for a stable partnership it will be used only once. This approach has established research behind it and would be easy to argue about. It is not clear how it would work in a dynamic setting; more testing and discussion is needed.

Virtual agreement among partners

This is a more elegant approach that reuses Trustchain ideas, but it is not clear how quickly it will converge to a stable set and how to handle forks. Further, it requires synchronization of the chain and redundant interactions between peers.

qstokkink commented 5 years ago

I think we should distinguish at least three types of peers:

1. The peers we choose as communication mediators. You will use them mainly for forwarding information. It is important that they are well connected (protection from partitioning, see the membership protocol) and we want to verify whether they are Sybils or not.

2. Peers whose balance we want to verify, because we want to receive a payment from them in exchange for a service from us.

3. Peers that provide a certain service to us and whom we will pay. They will want to verify us.

Even in the case of Tribler the peers are not homogeneous; there are different roles (exit node, relay node, etc.). I think that fundamentally changes how you build the overlay.

I agree. What I am arguing is that there is no need to advertise bandwidth specifically. By partnering based on latency, we already get the emergent effect of partnering with peers that have sufficient bandwidth.

Regarding the partnering agreements:

This set and signatures from the peers is enough to prove to anyone that you have a certain peer set.

I would reword this, to say that you can prove to anyone that you have at least a certain peer set.

Furthermore, it seems like the resource class of the Distinct function is very expensive, and measuring most of the mentioned solutions would also interfere with the Distinct functions of others.

Please also elaborate on how having distinct identities provides sybil-proofness.

grimadas commented 5 years ago

This set and signatures from the peers is enough to prove to anyone that you have a certain peer set.

I would reword this, to say that you can prove to anyone that you have at least a certain peer set.

We need a lock on the interaction partners, a commitment that a peer will interact only with them for a certain time/number of rounds. That will be used to calculate risks; no interaction is possible before providing this commitment. Every peer should eventually know about your commitment, so we can use gossip for that. This will also give guarantees on detection if done properly.
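A hypothetical shape for such a commitment (field names and the `sign()` callback are illustrative, not the actual wire format):

```python
# Hypothetical shape of a partner-set commitment (names are illustrative):
# the peer signs the partner set it restricts itself to for a number of
# rounds, so anyone receiving it via gossip can check interactions against it.
import hashlib
import json

def make_commitment(peer_id, partner_ids, valid_rounds, sign):
    payload = {
        "peer": peer_id,
        "partners": sorted(partner_ids),
        "valid_rounds": valid_rounds,
    }
    digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    return {"payload": payload, "digest": digest, "signature": sign(digest)}

commitment = make_commitment("peer_a", ["peer_b", "peer_c"], valid_rounds=10,
                             sign=lambda d: "sig(" + d[:8] + ")")  # stub signer
print(commitment["digest"][:16])
```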

Furthermore, it seems like the resource class of the Distinct function is very expensive, and measuring most of the mentioned solutions would also interfere with the Distinct functions of others.

Please also elaborate on how having distinct identities provides sybil-proofness.

It is an attempt to generalize the Sybil attack in our context. In the case of consensus there is a direct way to calculate the impact: if you can break 51% with Sybils, you are the king.

In our case one of the threats is Eclipse attacks by Sybils. Suppose that after sampling you get a set of peers and now you want to test whether this set is unique enough. You require an estimate that at least k of n are unique. For example, 2 of n is enough not to be Eclipse attacked, but probably not enough to build a healthy overlay? For that you can test latency and diversity, or require resource testing (computational challenges, storage challenges, bandwidth exchange). We should argue why we choose latency and diversity instead of resource testing. Both approaches are only locally verifiable, by the way; making them globally verifiable is not trivial.

qstokkink commented 5 years ago

We need a lock on the interaction partners, a commitment that a peer will interact only with them for a certain time/number of rounds.

This is a crucial bit of information that was lacking in the original explanation. I can see that working, agreed.

test if this set is unique enough

I think we need a stricter definition of what a Sybil identity is, then. It is pointless to argue about which measure we use for Sybil identities if we do not have this defined from the start. In the case of Bitcoin we can argue that its proof-of-work is very bad at avoiding Sybils, as the rational actors form mining pools and have to be kept artificially under the magical 51% mark. The only thing protecting against such an attack in Bitcoin is the fact that a 51% attack would also devalue the coin itself. Back to the definition part: where do we draw the Sybil line in Bitcoin? Is a Bitcoin Sybil a single machine with a lot of CPU and multiple addresses? Is a Bitcoin Sybil a factory hall filled with ASICs? Is a Bitcoin Sybil the entire mining pool?

Both approaches are only locally verifiable btw, making it globally verifiable is not trivial.

I challenge whether it is even possible to do global verification of Sybils without a single trusted party in control of all identities.

grimadas commented 5 years ago

I think we need a stricter definition of what a Sybil identity is, then. It is pointless to argue about which measure we use for Sybil identities if we do not have this defined from the start. In the case of Bitcoin we can argue that its proof-of-work is very bad at avoiding Sybils, as the rational actors form mining pools and have to be kept artificially under the magical 51% mark. The only thing protecting against such an attack in Bitcoin is the fact that a 51% attack would also devalue the coin itself. Back to the definition part: where do we draw the Sybil line in Bitcoin? Is a Bitcoin Sybil a single machine with a lot of CPU and multiple addresses? Is a Bitcoin Sybil a factory hall filled with ASICs? Is a Bitcoin Sybil the entire mining pool?

Totally agree that it is tricky with mining pools. What you essentially have is multiple users colluding to contribute to one address; should they be considered Sybils or just colluders?

But let's get back to our case. Our goal is to pick overlay neighbours that are good enough, and then pick interaction partners that are good enough. Goodness of an overlay neighbour is based on forwarding information and not Eclipse attacking you. Goodness of an interaction partner is based on... what? Capacity? Balance? Trust? Reputation? Depending on what we choose, the Sybil attack will change too.

qstokkink commented 5 years ago

Goodness of an interaction partner is based on... what? Capacity? Balance? Trust? Reputation? Depending on what we choose, the Sybil attack will change too.

Perhaps we can generalize this as the chance of your interaction partner finding any particular transaction you were involved in. Balance, trust and reputation are all derived from an (almost) complete set of transactions.

grimadas commented 4 years ago

For the next iteration we will focus on hardening the guarantees, giving quantitative bounds on finality.

Goals

The main goal is to build an accounting system which is simple yet powerful enough to support decentralised applications on top of it, such as financial markets, reputation systems, etc.

Non-functional requirements:

  1. System working for millions of clients.
  2. Has high availability: no single point of failure
  3. Attack-resilient: adaptive to malicious peers.
  4. Provide 'just enough' building blocks for applications, such as information convergence (consensus).
  5. Simplicity

Means:

  1. Bottom-up design: No total order, use local view as much as possible.
  2. Replication with redundancy: there are enough connected active peers.
  3. Limit the attack vector by limiting the world.

Science behind it:

Optimizations: TBA

synctext commented 4 years ago

Noodle == much more than double-spending and witness lists. That was new to me. 2 years of work have been invested in the channels (or main outcome of 2 years of re-factoring). Switching to The Noodle Stack (with 5 layers it's more than a protocol) is quite disruptive and took me by surprise, as you obviously noticed (more discussion on 10 June).

Set Reconciliation is a step towards the complexity of Dispersy with Bloom filters, something we wish to avoid. For a lot of use cases (e.g. channels) you can assume there are sufficient "seeders" with 100% of the information and you can simply collect it. Everybody is required to "seed" their complete Trustchain record history. A simple linear list. Can you support that easily?

Channels only need linear downloads of bulk magnet links plus filename(s). Therefore, for channels a single append-only linear list with logical deletes is sufficient. The current version requires a channel to be completely downloaded before records can be inserted. Partial downloads of subdirectories of interest are not supported. If we enhance the channel format to be downloaded in real time, inserted, and seeded partially, we have solved sufficient problems and can move on. For the roadmap and Kotlin implementation we first need documentation of the current channels format and commit structure.

Questions to discuss over a few coffees: Do you envision The Noodle Stack supporting the bulk exchange of 1+ million magnet links with congestion control? How does this work relate to #3690, trustworthy gossip? How do you solve the problem that most witnesses will be offline 98% of the time (unfiltered)? Witness reliability filtering is probably a requirement, but also an attack vector?

grimadas commented 4 years ago

Set Reconciliation is a step towards the complexity of Dispersy with Bloom filters, something we wish to avoid. For a lot of use cases (e.g. channels) you can assume there are sufficient "seeders" with 100% of the information and you can simply collect it. Everybody is required to "seed" their complete Trustchain record history. A simple linear list. Can you support that easily?

I agree. I kept this part as simple as possible. If you have an order, reconciliation is much easier and Merkle trees and Bloom filters can be avoided. However, a linear order, especially a global one, is not always possible or even desirable; for these cases we use reconciliation on a DAG. With Noodle we enforce linear order on one writer chain (personal chains, classical Trustchain). Community chains (multiple writers, single datapoint) optimistically link to converge towards a linear log, but nothing bad will happen if it stays a DAG.

Questions to discuss with a few coffees; Do you envision The Noodle Stack support the bulk exchange of 1+ million magnet links with congestion control?

That would be definitely interesting to try.

How does this work relate to #3690, trustworthy gossip?

Gossip is the heart and soul of Noodle. We just narrow down the gossip to only those who care (related to PubSub). In the current implementation I use push-pull gossip with reconciliation to guarantee fast convergence. Partly tested with a Wikipedia dataset; it can converge within 15-20 seconds under heavy load for 1000 peers.
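A simplified view of one reconciliation round, just to illustrate the shape of the exchange (this is an assumed structure, not the Noodle wire format):

```python
# Simplified view of one push-pull reconciliation round (an assumed shape,
# not the Noodle wire format): both sides compare the block ids they hold
# and then exchange only the blocks the other side is missing.
def reconcile(local_blocks, remote_blocks):
    """Both arguments: dict block_id -> block. Returns (to_push, to_pull)."""
    local_ids, remote_ids = set(local_blocks), set(remote_blocks)
    to_push = {bid: local_blocks[bid] for bid in local_ids - remote_ids}
    to_pull = {bid: remote_blocks[bid] for bid in remote_ids - local_ids}
    return to_push, to_pull

push, pull = reconcile({"b1": "...", "b2": "..."}, {"b2": "...", "b3": "..."})
print(sorted(push), sorted(pull))  # ['b1'] ['b3']
```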

How do you solve the problem that most witnesses will be offline 98% of the time (unfiltered)? Witness reliability filtering is probably a requirement, but also an attack vector?

Witnesses are not required for the protocol to work (Guaranteed Eventual Detection). They are required to build critical applications where detection (+ grim trigger) might not be enough. Witnesses can be chosen as you like, but the best approach so far is to choose peers in the community and make sure they are diverse enough. A witness audit is enforced by a counterparty and their risk assessment, so in the end they will decide to accept/decline the transaction.

Witness availability is not a big issue, since the witness set can be expanded up to everyone in the community. However, every community needs to make sure that there is at least some number of available peers. But this is a research topic on its own, which I'll address later.

grimadas commented 4 years ago

Iteration progress and plans for the next sprint:

Repository

Noodle is moving to a separate repository for easier management. For now it is under my personal GitHub; I suggest moving it later to the Tribler organisation and including it as a reference in the main Tribler repo. Noodle builds on top of IPv8, with IPv8 as a requirement.

Development plans

Noodle will be continuously tested and improved. The protocol will be documented and published as a PyPI package.

If successful, Noodle will join the pleiad of similar protocols: DAT, IPFS, Scuttlebutt.

Short term experiments and questions (1-2 Months):

Medium-Long term research plans:

  1. Battle-test 3 applications: DAO governance (e.g. collaborative editing, Wikipedia), trustworthy content metadata exchange (for file sharing), ???
  2. BitTorrent is a great tool, but it has narrow applicability, i.e. BitTorrent governance is based on tit-for-tat and is restricted to a very small world: one swarm. As a consequence it is restricted in the number of applications and is driven by local optima. I would like to explore alternative mechanisms of governing common infrastructure. Using decentralised orchestration we can build self-organising, dynamic community infrastructure that is driven first by the community good and by reputation, risk and trust to avoid the tragedy of the commons (for example, loss of a file, or over-abuse of certain peers), rather than by tit-for-tat.
synctext commented 4 years ago

Just for my understanding, I'm translating the above into more abstract terms and MVP requirements for an accounting system which can underpin our live token economy.

BitTorrent is a great tool, but it has narrow applicability, i.e. BitTorrent governance is based on tit-for-tat and is restricted to a very small world: one swarm.

Yes, using BitTorrent for bulk downloading of gossip information seems like a hack. It actually is a hack. :warning: rant warning... with Trustchain we abuse BitTorrent as a multi-sourced FTP service. In the history of our lab (since 1999) many master students, developers, post-docs and PhD students believed that they could do better than BitTorrent or improve it. Everybody failed after trying for up to 4 years. Their designs are more elegant and superior; their implementations never matched. They always added more complexity than a one-person project can handle. Serious people stopped trying to redo the Linux kernel or the leading compiler. The IPFS and DAT people are finding out the hard way that BitTorrent is hard to improve upon. It can even support Tor-like privacy, if you beat it up for 5+ years. Engineers are trained to hate such hacks. Lesson: after our costly https://libswift.org experience, do not try to be superior to Arvid's brilliant code. I tried and it has cost me 650,000 Euro of EU money.

Please read about this gossip work done by our master student who was hired by Mozilla afterwards: "Splash: Data synchronization in unmanaged, untrusted peer-to-peer networks". Do you want to make it as simple as such "sync points"? What do you think of their idea of using random gossip for "live mode" and BitTorrent for "history mode"?

grimadas commented 4 years ago

Well, I agree that it is really hard to match BitTorrent when it is used as a multi-sourced FTP service. But BitTorrent is much more than that, which raises the question of why not use something different.

But as it was designed for bulk downloads, the only way to decide when to use gossip vs. BitTorrent is to do heavy experimenting. My hypothesis is that gossip sync should be sufficient for most cases. As clients generally don't want, or even don't need, to download big bulk data, we might consider something more lightweight. I hope the experiments will reveal that.

synctext commented 4 years ago

Tragedy of the Commons considered solved

On top of the trust layer there is perhaps the most interesting part of the application layer: the allocation of resources, preferential treatment, punishment of freeriders, community exclusion of cheaters. Combined they could be a new layer in your model: management of the commons.

Wikipedia, the Linux kernel, Bitcoin, BitTorrent, and residential WiFi sharing somewhat manage to manage the commons, with limitations or with a remaining central server. Scientific target: relentless automation and the zero-server principle. Organisational principles:

  1. fully autonomous participants
  2. freely shared data
  3. personalised risk calculations
  4. stable collaborators and preferential treatment
  5. Growth of social standing and leader-board rank

Relaying requires incentives, and seeding requires storage and bandwidth (open issue for 7 years to donate 1 TByte to the community, #21). Open issues: lossy accounting, overhead of accounting, risk exposure, double-spending risk, settlement time, economics of anonymity.

Proposed publication plan: always have 1 paper under review. Re-use OSDI/Usenix thinking. Design paper with early results of your accounting framework. Resource contention, 200k new Trustchain records daily, exit node rules, preferential rules, scheduling algorithm. Paper by 21 September? :clock10:

2022 insight: Do not trust strangers, only help them a bit to get going. SybilGuard and all these algorithms fail. Maintain stable collaborators in order to protect against unlimited identity creation. It is a critical failure to assume humans can distinguish fake identities from real humans. See the 300 fake newspapers example.

synctext commented 4 years ago

Privacy with gossip: Spy vs. Spy: Rumor Source Obfuscation. Real attackers do a Sybil attack (but still interesting related work).

synctext commented 4 years ago

@grimadas Novel idea: take Trustchain entanglement to the next level: link to any party and fixate their last block. Aug 1st goal: running code on the DAS. Then solve the Tribler problem of freeriding. You can be in 100 or 1000-ish counterparty groups which synchronise state. Towards the goal of knowing the contribution level to the commons of 1 million peers. The mission is to prevent the worst abuse and fraud (not exact accounting). Please document your design in 1 page by 14 July. Plus: https://dicg2020.github.io/

synctext commented 4 years ago

Reinventing The Internet for the Common Good

Within this article we present a scientific roadmap for addressing monopoly formation on the Internet and our economy at large by creating an infrastructure for the common good. We present results from the ecosystem we built over 15 years, inhabited by 1.7 million volunteers. Our work encompasses four isolated scientific fields. With our infrastructure-for-the-common-good lens we unify the fields of applied social science, distributed algorithmic mechanism design, adversarial information retrieval, and traditional distributed systems. Our architecture contains five elements:

  1. Our fundamental building block is awareness of outcomes when two economic actors meet. By spreading awareness of defection or pro-social, cooperative behaviour we build upon the solid foundations of game theory and prevent the price of anarchy. This is our scalable, distributed, tamper-proof accounting layer. Dissemination is conducted using gossip.
  2. Fraud filtering is the next element within our architecture. Based on historical encounters and transactions we must filter out the lies, falsehoods, and fabricated accusations. This bears many elements of adversarial information retrieval. See the work by TU Delft faculty: TrustSVD: Collaborative Filtering with Both the Explicit and Implicit Influence of User Trust and of Item Ratings.
  3. Resource allocation is another key element, based on fraud-filtered historical behaviour. Only simple rules are used, based on locally available information, which results in emergent increased social welfare. Complex systems are impossible to schedule centrally. Successful systems evolve further, and complexity grows with enhanced sophistication and increasing popularity.
  4. Governance is the critical long-term element. Who is in charge of the ecosystem and organises change? We focus on the special class of self-governance, where both nobody and everybody is in control. Collective decision making is based on democratic principles.
  5. Application-specific logic is embodied in this final element. Bookkeeping, fraud prevention, and resource allocation are re-usable across multiple domains.

We claim our generic architecture is usable across business domains and a diverse set of applications. We focus the remainder of this work on the core ledger part. Our novel addition to the existing body of work within ledger science is based on the strength of weak ties. We are the first to successfully apply the classical work by Granovetter from 1973 to dramatically improve the effectiveness of ledgers. We only add a single rule to a DAG-based ledger: each economic actor randomly fixates the last observed state of another economic actor on its own ledger. The state of others is compressed into a single hash, resulting in a mere 32 bytes of storage cost. A special property emerges from adding random pointers in a graph with unknown properties: entanglement. Economic actors collectively ensnare each other.

We provide proof for the unreasonable effectiveness of weak ties for ledger science.
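As a purely illustrative sketch of this entanglement rule (names and data layout are assumptions, not the actual ledger format):

```python
# Minimal sketch of the entanglement rule described above (names are
# illustrative): a peer picks a random other actor and fixates the 32-byte
# hash of that actor's last observed block in its own next block.
import hashlib
import random

def entangle(own_chain, observed_chains):
    """observed_chains: dict actor_id -> list of serialized blocks (bytes)."""
    other = random.choice(list(observed_chains))
    fixated = hashlib.sha256(observed_chains[other][-1]).digest()  # 32 bytes
    own_chain.append({"link_to": other, "fixated_state": fixated})
    return own_chain

chain = entangle([], {"actor_b": [b"block1", b"block2"]})
print(chain[0]["link_to"], len(chain[0]["fixated_state"]))  # actor_b 32
```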

llegard commented 4 years ago

Would it be a stretch too far to talk about flexible networks without rigid boundaries, instead of ‘weak ties’? Could be a unifying argument about the power of permissionless systems.

grimadas commented 4 years ago

A paper under work now, working title: "Universal Accounting of Goodness".

Why accounting? Insights from accounting theory

The main reason for accounting and unified rules is to address uncertainty and information asymmetry. Information reliability and relevance are important for business decision making.

We translate these properties to distributed systems, namely consistency and validity of information. We recognise the fundamental difference between these two and address the most important adversities: information hiding and information tampering.

But we also recognize that achieving 100% reliability is not always possible or desirable, as information retrieval and storage come at a resource cost. We model an accounting platform that is flexible and universal enough to support accounting for multiple domains. It allows users to choose different policies based on their risk model and adjust their behaviour accordingly.

synctext commented 4 years ago

@grimadas Can you take 15 minutes today and post a progress update of the past summer months + all code?

A paper under work now, working title: "Universal Accounting of Goodness". Why accounting? Insights from accounting theory The main reason for accounting and unified rules is to address uncertainty and information asymmetry.

The leading organisational principle of developed nations is the market mechanism with a certain level of government regulation. Open access (commons) is an alternative to private control; however, it is academically underdeveloped in terms of theoretical support. The tragedy of the commons is the key obstacle to sustainable commons. We claim that private control by Big Tech companies is difficult to avoid without a solution to the digital commons problem.

We introduce the first viable solution to the digital commons problem, based on accounting of social capital. To govern the commons collectively, we use accounting and unified rules to address uncertainty and information asymmetry. We provide empirical data on our global database of social capital.

:drum: Intro Bami! (please push the tentative code)

grimadas commented 4 years ago

Requirements for sustainable cooperation

1. Maintenance of cooperation:

grimadas commented 4 years ago

The main application focus of BAMI is contribution accounting to sustain cooperation. The requirements were given above, with the main one being the identification of bad peers (Byzantine cheaters and freeloaders).

I see several different ways to approach this application. Here are my thoughts:

What is contribution?

A contribution is a pairwise action from one peer to another that is recorded as a transaction with two actions: the contribution T1: peer A -> peer B, and the confirmation (or explicit rejection) T2: peer B confirms (or rejects). Every contribution requires two signatures (pairwise accounting). This approach addresses misreporting, i.e. peer slander: peer A cannot falsely claim that it contributed to peer B, since B's consent (signature) is required.

Contribution event - double-signed entry.
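A purely illustrative layout of such a double-signed entry (field names and the sign callbacks are placeholders, not BAMI's actual API):

```python
# Illustrative layout of a double-signed contribution event; the field names
# and the sign callbacks are placeholders, not BAMI's actual API.
def contribution_event(contributor, recipient, amount_mb, sign_contributor, sign_recipient):
    body = f"{contributor}->{recipient}:{amount_mb}MB"
    return {
        "body": body,
        "sig_contributor": sign_contributor(body),  # T1: peer A claims the contribution
        "sig_recipient": sign_recipient(body),      # T2: peer B confirms (or withholds)
    }

event = contribution_event("peer_a", "peer_b", 100,
                           sign_contributor=lambda m: "sigA(" + m + ")",
                           sign_recipient=lambda m: "sigB(" + m + ")")
print(event["body"])
```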

Can we do it differently?

We don't verify the event truthfulness.

Can we always identify freeloaders?

As this is connected with verification and with having global knowledge: more no than yes. Only in specific applications under certain assumptions (honest, altruistic majority, global consensus). The best we can do to stay scalable with millions of peers is to optimistically detect/estimate the probability of freeloading.

We don't prevent freeloaders with 100% accuracy.

How can trust algorithms help?

They can help in two ways: ranking or filtering. Depending on the application requirements, one might be preferred over the other.

This allows us to achieve magical properties, but only if we view it from the perspective of an honest node:

The best candidates for these algorithms for now are variations of personalised hitting time.

We use trust algorithms to limit the effects of attacks
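As a rough illustration of the personalised-hitting-time family mentioned above (a Monte Carlo sketch with a toy work graph and parameters, not the actual algorithm used):

```python
# Monte Carlo sketch of a personalised hitting-time style score (one of the
# candidate trust functions mentioned above); the graph and parameters are
# toy values. It estimates how often short random walks from `seed` hit `target`.
import random

def personalised_hit_score(graph, seed, target, walks=1000, max_steps=50):
    """graph: dict node -> list of neighbours; returns the fraction of walks
    that reach `target` within max_steps (higher = more trusted from seed)."""
    hits = 0
    for _ in range(walks):
        node = seed
        for _ in range(max_steps):
            neighbours = graph.get(node)
            if not neighbours:
                break
            node = random.choice(neighbours)
            if node == target:
                hits += 1
                break
    return hits / walks

work_graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a"]}
print(personalised_hit_score(work_graph, seed="a", target="c"))
```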

Is that global trust algorithm?

No. It is highly personalised, as the only trust anchor a peer has is its 1-hop interactions. A peer can either verify all 1-hop peers directly, or it trusts them (this is our assumption!).

This leads to an interesting result: some peers might consider an honest peer a freeloader, and vice versa. However, we show that with enough interactions the work graph and trust algorithms lead to an evolutionarily stable state: freeloaders starve.

The trust function is personalised. Global ranking or filtering emerges with time.

Where trust algorithms will not help?

Peers can still hide information and cheat by presenting only partial information to bias the trust algorithms. This attack is related to forking and network partitioning.

To protect from such an attack, peers need to exchange data on the target peer. This can be achieved via gossip, periodic checkpointing and witnesses.

To simplify this and decrease the time to detection, we connect peers by topic (interest in the target peer). This turns blind gossip into targeted gossip (similar to pub-sub).

We use targeted gossip to minimise time to detection

What do we do with freeloaders?

As trust is personal, honest peers might temporarily see each other as suspicious. However, I believe that we should not block them, but instead give limited service depending on the severity: for example, lower priority or decreased speed. Thus, any freeloader would eventually starve.

Punishment for freeloaders - service quality degradation.

What do we do with cheaters?

Cheating is more severe, as it is a deviation from the protocol with an objective proof of cheating (justified). Either further degradation of service or a temporary ban might be considered.

Cheaters get more severe punishment but only after justification.

Is there a way back?

There should always be a way back that is preferable to whitewashing. For example, cheaters might fix all their forks and continue. Freeloaders will get the same quality of service once their ranking increases.

Punishments are incentives for change.

Unaddressed issues and research questions.

  1. Two counterparties might change their minds later. For example, imagine the total bandwidth contribution at time t1: A -> B: 100 MB, B -> A: 0 MB. Later the counterparties decide to change their contribution relationship (total contributed) at t2: A -> B: 50 MB, B -> A: 0 MB. The total went down; do we consider this valid behaviour or not? Should it be punished?
  2. All trust algorithms I have encountered so far assume global knowledge of the work graph. This is not realistic; how does partial knowledge affect the error in the trust algorithm? We need experiments to show that.
  3. As we have entered the realm of imperfect accounting, a natural question arises: what is the minimal information I need to keep for the system to work?
  4. How do we deal with multiple services provided by the same peer? For example, one peer might run bandwidth-as-a-service, CPU-as-a-service or storage-as-a-service.
    • Should the chain and gossip be agent-centric (we gossip about the peer chain of peer A for every service/app it is in), app-centric (we gossip about an app chain: a shared chain of the peers that run a specific service/app), or hybrid (a chain of the peer in a specific app)?
Comments and more questions are more than welcome!

synctext commented 4 years ago

We don't verify the event truthfulness.

Excellent thoughts! Please expand this into 6 pages IEEE format before 21 September.

Storyline: "Facilitating the emergence of truthful accounting". Our remarkably simple accounting mechanism does not record any details of interactions within a community, yet effectively suppresses freeriding, cheating and fraud. Our only mechanism is to irrefutably record the existence of an interaction between two community members. Paradoxically, the mere recording of this interaction, together with higher-level mechanisms, leads to the emergence of global cooperation and the starvation of freeriders. We successfully devised rules with the complexity of tit-for-tat with...

With one addition we can make our mechanism resilient against fake identities within a complete collaborative framework. New identities start with zero community standing and only receive service if they have made a net positive community contribution. Further details are outside the scope of this work:-)

synctext commented 4 years ago

The Truth Convergence Paradox. Mechanisms to establish "single-event truthfulness" do not converge to fraud-free communities as well as mechanisms which ignore truthfulness at this level and operate only using partial information, random gossip, and optimistic probabilistic detection.

synctext commented 4 years ago

Related work: trying to enforce single-event truthfulness at the cost of efficiency (it destroys the 300 Mbps+ download capability of libtorrent): "Treat-before-trick: Free-riding prevention for bittorrent-like peer-to-peer networks"

synctext commented 3 years ago

Bami Blocks have been born! Nice work :1st_place_medal:

Management concern: when the PhD work is finished, the outcome is either 1) a self-maintaining GitHub community or 2) dead code in Tribler. Please only make a modest change to Trustchain; keep improving and expanding it as much as you like. But keep the first version of Bami integrated in Tribler clean of any non-Tribler features. Proposal: by 15 November there is a stable Tribler build on our forum with Bami 1.0 ready. Time-boxed approach. List of potential issues to address (brainstorm):

Candidates for later improvement cycles (brainstorm), after the Bami 1.0 release:

grimadas commented 3 years ago

After hard thinking and planning, I decided to follow the most straightforward yet powerful approach there is:

Simplest accounting mechanism for Tribler:

In total a peer will store only K blocks, where K is the number of counterparty peers. Even if you crawl everybody, you only store |E| blocks, where |E| is the number of interaction pairs between users (the number of edges in an undirected graph).
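A sketch of this "one block per counterparty" storage rule (class and method names are illustrative, not the actual implementation):

```python
# Sketch of the "one block per counterparty" storage rule described above
# (class and method names are illustrative): only the latest pairwise total
# is kept, so local storage stays at K entries.
class PairwiseLedger:
    def __init__(self):
        self.latest = {}  # counterparty_id -> latest signed pairwise block

    def record(self, counterparty_id, block):
        # Overwrite the previous block for this counterparty; K == len(latest).
        self.latest[counterparty_id] = block

    def storage_size(self):
        return len(self.latest)

ledger = PairwiseLedger()
ledger.record("peer_b", {"up_mb": 100, "down_mb": 0})
ledger.record("peer_b", {"up_mb": 150, "down_mb": 20})
print(ledger.storage_size())  # still 1 block for peer_b
```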

Reputation mechanism

The latest experiments and research showed a surprising result: a reputation mechanism, if built properly, provides a substantial level of robustness, even in the presence of Sybils and fake data. Hence we defer all security and robustness to reputation, leaving the accounting very, very simple.

This reflects the vision and is sufficient to build a sharing economy with minimal complexity.

synctext commented 3 years ago

wow! Thank you for your Tribler-centric/lab-centric thinking. So with barebones accounting, latency diversity, peer ranking, competitive upload spots we get spam flooding protection and Sybil-resilience? We need more?

grimadas commented 3 years ago

wow! Thank you for your Tribler-centric/lab-centric thinking. So with barebones accounting, latency diversity, peer ranking, competitive upload spots we get spam flooding protection and Sybil-resilience? We need more?

Yes, according to the latest research insights, it provides sufficiently solid security and allows us to build the first version of a micro-economy based on sharing 💪.

qstokkink commented 3 years ago

latency diversity

@grimadas Do you need this now? It's currently not part of IPv8's wild codebase (as the related research is not accepted anywhere yet). I can hook it in though.

devos50 commented 3 years ago

wow! Thank you for your Tribler-centric/lab-centric thinking. So with barebones accounting, latency diversity, peer ranking, competitive upload spots we get spam flooding protection and Sybil-resilience? We need more?

I don't think that we need the latency diversity for the first (barebones) version of the bandwidth ecosystem. I advise considering this as a task for the next iterations.

I will also update our ongoing write-up with these key insights.

qstokkink commented 3 years ago

Alright, I'll leave that out then. It would probably also only serve to hinder you anyway: the DAS-5 running 1000 instances per node would be detected as a Sybil attack 😄

synctext commented 3 years ago

The barebones accounting is now deployed successfully, dubbed LedgerZero. Please remain focused on an upload incentive. First make it work. Try to suppress the reflex of generalisation into a generic incentive mechanism. Nobody has ever created an indirect reciprocity mechanism that actually works and scales. The tragedy of the commons for even one resource is unsolved. Leave moderation processes and metadata for the next generation of PhD students in the lab. It is guaranteed to delay PhD graduation if you try to improve upon this 175-page work. Actually, it's an even broader problem, I believe. Moderation is hard. We currently do not have the resources to facilitate spam-free and harassment-free communication amongst our users. Free text flow (many-to-many, two-way) is too difficult to "Internet police"; our scientific focus is first self-governance of 'filename moderation' (suppression of spam, fakes, worms, broken files, poor-quality files, duplicates, misspellings, and bundles that are not bundled).

synctext commented 3 years ago

As discussed. Was the focus on freeriding for 16 years smart? In a quality community, do we then induce unconditional giving? Permaseeding is the term I would like to introduce for the desired behaviour. Permaseeding enhances social welfare, as opposed to hit-and-run.

Platforms engagement and sharing: https://www.emerald.com/insight/content/doi/10.1108/JSTP-04-2016-0071/full/pdf?title=engagement-platforms-in-the-sharing-economy-conceptual-foundations-and-research-directions

synctext commented 2 years ago

Freeriding is very much unsolved, and Web3 makes it very relevant. The accounting protocol is making progress. Perhaps we can get 2 chapters out of it. One: algorithm only; the N-person repeated prisoner's dilemma, solved with gossip. The other: deploy and measure with LedgerZero.

synctext commented 1 year ago

Sybil attacks are now investigated by six isolated fields. They are not aware of each other. They lack the understanding of how fundamental their problem is. They are not fully aware of all prior and related work.

synctext commented 1 year ago

Unified model of trust and intelligence (dream outcome)

Two isolated fields are probably much more closely aligned than anybody realises. Unbelievable (really) similarities exist between theories of trust and theories of mind. With decades of scientific effort we might unify evolutionary connectionism architectures with our continued research into incentives and credit allocation (e.g. MeritRank) within collaborating humans (our ongoing research for the past 23 years :astonished:). See Evolutionary Connectionism: Algorithmic Principles Underlying the Evolution of Biological Organisation in Evo-Devo, Evo-Eco and Evolutionary Transitions from 2016.

Prof. Levin's great writing from evolutionary biology: The collective intelligence of evolution and development, published in the open-access Collective Intelligence journal he co-founded. Great quote: evolution is more intelligent than we realised; the Tribler Lab experimentally determined that "randomness and evolution are more intelligent than we realised". It is counter-intuitive for scientific engineers that you cannot outsmart massively scaled randomness. The relevant section, "Credit assignment in individuals and collectives", contains this quote: "How does scaling of reward dynamics bind subunits into intelligent collectives that better navigate novel problem spaces? Lessons from machine learning. It is no accident that the issue of credit assignment, and the application of credit to parts or wholes, is a central one in evolutionary selection, developmental and organismic biology and cognitive science. It is a feature of many difficult learning tasks that they require sequences of actions that are many steps away from ultimate goals – making it intrinsically difficult to incentivise the component parts involved."

Reward functions appear in biological systems, in our collaborative systems based on MeritRank which address freeriding, and also in a state-of-the-art robot racing pilot. See the Nature journal quote with the reward terms for reinforcement learning with collision avoidance:

_Update_: another [fascinating Nature paper studies credits](https://doi.org/10.1038/s41593-023-01514-1). These are micro-level credits, used to update weights (see the [overview paper](https://arxiv.org/pdf/2403.18929)). The [GitHub code](https://github.com/YuhangSong/Prospective-Configuration) for the backprop alternative:

```
For both humans and machines, the essence of learning is to pinpoint which components in its information processing pipeline are responsible for an error in its output, a challenge that is known as 'credit assignment'. It has long been assumed that credit assignment is best solved by backpropagation, which is also the foundation of modern machine learning.
```

The next step is possible if we ever manage to formulate the above dream theory. Collectively we could develop superhuman capabilities. We "only" need to overcome fraud, greed, and tyranny. **Make our species more intelligent, productive, and resilient by advancing the theory and practice of trust, collaboration, and democracy.**

grimadas commented 10 months ago

Scientific Foundation for Offline Token: