Tribler / tribler

Privacy enhanced BitTorrent client with P2P content discovery
https://www.tribler.org
GNU General Public License v3.0
4.82k stars 447 forks source link

Thesis: Using distributed blockchains to cultivate trust in a peer group #2533

Closed pimveldhuisen closed 7 years ago

pimveldhuisen commented 8 years ago

Thesis: Using distributed blockchains to cultivate trust in a peer group

synctext commented 8 years ago

"Cultivating online trust in complex self-organising systems" Nice title/storyline.

Background story: we tried multichain, needs churn resilience, Tribler future-proof architecture means creating a generic "trust cultivation" module. Essential requirement is that this can be re-used in other systems, proven to work in Tribler.

Self-reinforcing trust is a more cold and rational term. Cultivating is more associated with organic and human touch. Interesting.

synctext commented 8 years ago

Design: why create records every 10min, not every 10MByte ?

Experiments brainstorm:

October goal: generate scenario, based on real behavior from our crawl. Either: give existing records or do 1 download/second for 3 hours == 10k downloads of Multichain records.

Perhaps with @qstokkink close the self-reinforcing loop and select non-freeriding nodes for tunnels.

synctext commented 8 years ago

Public ongoing thesis text : https://www.sharelatex.com/project/573db2896f2d756841e3def8

synctext commented 8 years ago
synctext commented 8 years ago

ToDo for thesis roadmap:

synctext commented 7 years ago

Related work @kc1212 @Captain-Coder @pimotte @wordempire @pimveldhuisen

synctext commented 7 years ago

Experiments:

Start to define thesis committee

synctext commented 7 years ago

Discovery experiment (Gumby) scenario:

synctext commented 7 years ago

Trace-driven simulator, inside Jenkins: https://jenkins.tribler.org/job/Tribler/job/Experiments/job/multichain/job/walker/ws/ bug: the exploration is not monotone increasing.

Conclude: 1 chapter on simulator. +Experimental section met above Gumby experiments, without tunnel community.

pimveldhuisen commented 7 years ago

bug: the exploration is not monotone increasing.

To clarify: the exploration is actually monotonically increasing, the bug is in the plot script, which renders this in a weird way. Should still be fixed ofcourse.

synctext commented 7 years ago

Current 31-page thesis

Planning: Dec - finish simulator plot. then document in thesis.

Chapter 2: please add the onion routing context as key use-case. There the 1:13 contribution ratio for default Tor-like setup is an unsolved problem. We aim to solve it for the somewhat easier 1:5 ratio scenario. Chapter 3: include the 10min Tor circuit lifetime specification, that motivates your design decision.

"Unconditional Random walk vs reputation-directed walk". Danger: how is this work different? Repeating the existing Dimitra experiments?

"3.2.2 Statefull vs stateless walk" incorrect terminology. Either walk to Dispersy neighbor in overlay or to a Multichain neighbor in overlay+shared datastructure. Dispersy chains the contacting of neighbors into a random walk, with probability of .5% for a teleport to a trusted bootstrap server.

Revisiting the non-backwards compatible design with deep walker integration. Unified primitive for:

Thus together forming trusted peer discovery suitable for NAT constrained Internet connections.

multichain_walker_design__2016

Each live edge request simply includes the records the sender already knows about. The notation can be a simple (start block number, last block number). The first live-reponse message always returns the last record, with the most recent transaction. This ensures that each subsequent re-visit we can reply with either fresh records, historical records, or an empty record.

synctext commented 7 years ago

Recent changes: design chapter msc_thesis (7).pdf

"3.2 Record discovery" Key problem to solve is a single peer spamming the network at 100Mbps (attack cost: €5.99/month). Random gathering makes it vunerable to spam. Pre-trusted nodes are also not a solution. So least-worse solution is to use multichain records.

Simulator is focussed on block discovery; without churn, no block creation, and infinite local storage.

Please make the load-balancing plot more dramatic: some nodes are going to crash. For instance, X-axes has all 700+ nodes sorted by their message-load, Y-axes shows their requests handled. Each node in the system is depicted as a datapoint, with a dramatic curve for the first dozen nodes on the left-hand side (no log-log scale needed).

Implementing "live edge walker" in Gumby estimated to be roughly 1 week. {Gumby darknet mode broken, possibly use clearnet; simple seeding records}

synctext commented 7 years ago

Good. Finish "Live edges" work in Dispersy, repeat nice load balancing graphs. Measure discovery and dissemination of records. See experiments above (+1 peer, etc.)

synctext commented 7 years ago

msc_thesis___Pim_Veldhuisen (1).pdf

Finish most thesis chapters or finish all experimental results?

ToDo johan reminder: committee form

synctext commented 7 years ago

msc_thesis___Pim_Veldhuisen (2).pdf

synctext commented 7 years ago

ToDo johan reminder: committee form

Theory + https://www.google.nl/search?q=the+shadow+of+the+future

Branching attack, mention double spending.

4.2 Experiments; Would call it emulation, not simulation. 4.3.2 Alpha Release of Multichain Start chapter 5 with What and Why. Role of Alpha parameter. Knowing who has cooperated in the past is a good building block for durable future cooperation. In this chapter we experiment with various algorithm to efficiently discover who cooperated in the past. Issues such as load balancing and discovery speed are key. 5 random walk versus focused walk parameter Alpha.

why continuous discovery from numerous sources ? attack-resilience...

5.2.1 Teleport probabilities. Still difficult to reproduce. underspecified. how many nodes with this 10k blocks? step time? how many connected neighbors? all their blocks when visiting somebody

Figure 5.3: The block collection process using different teleport probabilities. Explain that this is due to Dispersy random neighbors.

synctext commented 7 years ago

Page 3, try to say in first line why Tribler is relevent here. 2.0 intro line: for many decades we failed to solve the... 2.2 The Multichain : not a problem, e.g. tamper-proof records Figure 5.2: The distribution of probabilities ; more informative please "If itdoes not pick the peer with the highest score, it moves on to the next peer," feels like an arbitrary choice, lack elegance. "Skip Top Score Probability". "it loops back to the beginning"

We do not yet know workload of a true rewarding cooperation mechanism. Explore basic scalability and cost functions. Explore: storage cost of records for 10k to 1Million blocks? retrieve or lookup cost latency (single read of latest block of a single peer)?

synctext commented 7 years ago

msc_thesis___Pim_Veldhuisen (5).pdf Abstract 'that can achieve this idea in BitTorrent networks' please polish 'The fist step' 'The Multichain system as a whole is not yet completed, but this' make stronger and positive '1986 Science article Garrett Hardin' 1968 'Figure 3.1: A simplified representation' too small 3.3 Block creation protocol 'Denial of service' ? discuss spam attack to overload network or storage, not overload 1 victim. 100Mbps for $5/month attack ? 4.3.2 Alpha Release Forum link + date Figure 5.10: 'Load' more discriptive

Chapter 6: how do we conclude this is really correct? verified? 'The record creation is a replay of the trace acquired from the release Each' earlier and clearer: 6.x Experimental setup. 'Blocks over time' discovery speed node discovery speed, trust graph discovery... Revisits amount... goal of experiment? Record discovery experiment using 1000? Tribler instances Growing interval protocol specification, message layout 'After the record creation process, the experiment is paused for 1 minute, where nothing happens.' why?

Can a link with cooperation be made?

YourDaddyIsHere commented 7 years ago

Leveraging blockchains to establish cooperation - Pim Veldhuisen - Master's thesis.pdf

Final thesis from delft repository

ghost commented 7 years ago

1) p19 sha256 is introduced without citation

2) p21-23 re whitewashing -- If someone's client is honestly written but broken, it might accidentally lie. I can't count the number of times that tribler has crashed for me while I tried to get it to upload something (in fact it just did so while I was writing this post) ...at least for me, 99%+ of the time that I'm not uploading something I download it's because of a software bug. The exponential step-up of trust on page 22 is a clever way of dealing with this. It's certainly a blurry line between the cases of 'attacker' and 'broken code' but still, not all lies are intentional - intent should matter when punishing liars in principle. How might this look in practice? If A downloads from B, B has record of A downloading but A "forgets" and continues on as if nothing happened. Would A be a liar in this case and punished for it?

3)

Therefor, an optimal value for φ lies likely around 0.05, depending on the exact characteristics of the network.

φ is dependent upon how important reputation is. If the only purpose of reputation is to encourage the shadow of the future, maybe 0.05 is ok. But depending on the ratio of attackers/buggy clients versus honest nodes, the value of reputation may be different to different network participants.

synctext commented 2 years ago

Final .PDF: https://repository.tudelft.nl/islandora/object/uuid:0bd2fbdf-bdde-4c6f-8a96-c42077bb2d49/datastream/OBJ/download