Tribler / tribler

Privacy enhanced BitTorrent client with P2P content discovery
https://www.tribler.org
GNU General Public License v3.0

Public decentral markets with privacy for traders #2887

Closed synctext closed 6 years ago

synctext commented 7 years ago

Financial markets offer significant privacy to trading firms. Leaked market positions and trade history give competitors an advantage, so traders will only operate on decentral markets if their privacy is protected. Regulators obviously have more access.

Builds upon: #2559

synctext commented 7 years ago

ToDo: system architecture figure + initial system design that links Tunnel community, orderbook in Python, relaying, spam-prevention, etc..

ghost commented 7 years ago

Report with Problem Description and Architecture

synctext commented 7 years ago

Please document DDoS problem (100Mbps for $5/month).

Problem of DDoS with Tor-based orderbook relays. Possible idea: start with zero help, directly spread your bid/ask, do trades, build trust, others will start to help you.

Prototype: back to 2013 code; proxies in network route your traffic. No Chaum remixing or Onion crypto. Trivial to match traffic with sniffing.

devos50 commented 7 years ago

Related work: Bitsquare (https://bitsquare.io):


They seem to use Tor together with mainnet.

synctext commented 7 years ago

Current idea to prevent bid/ask spam is to use either a cybercurrency or TrustChain (a reputation-based solution). Another option is to use this in combination with network latency, as documented in #2541.

Build a fresh new community within Dispersy which builds a low-latency overlay with network neighbors. With each peer you see within this community you do a ping/pong handshake to determine the network latency. A random walk across the network does not converge fast; you only randomly stumble upon close low-latency peers. A small bias will dramatically boost the speed at which you can find 10 close peers in a group of 10 million peers. For instance, with a 50% coin toss you introduce either a random peer or one of your top-10 closest peers. Due to the triangulation effect this boosts convergence.
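A minimal sketch of the 50% coin-toss bias, not the Dispersy implementation. The function name `pick_introduction`, its parameters and the plain dict of measured latencies are assumptions made for illustration only.

```python
import random


def pick_introduction(known_latencies, top_n=10, bias=0.5, rng=random):
    """Choose which known peer to introduce to a requester.

    known_latencies: dict mapping peer id -> measured latency (hypothetical layout).
    With probability `bias` one of the `top_n` lowest-latency peers is chosen,
    otherwise a peer is chosen uniformly at random from all known peers.
    """
    if not known_latencies:
        return None
    peers = list(known_latencies)
    if rng.random() < bias:
        # Biased branch: restrict the choice to the closest peers.
        closest = sorted(peers, key=known_latencies.get)[:top_n]
        return rng.choice(closest)
    # Unbiased branch: keep exploring the whole network.
    return rng.choice(peers)
```

Keeping the random branch preserves exploration, while the biased branch is what lets close peers learn about each other quickly.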

Next step is to build low-latency proxies. These tunnels are then fast and restricted to a certain region. This addresses our problem, as spam is now also restricted to a certain region. Final policy to prevent spam is to combine the latency with tradechain reputation. You need both low latency and sufficient reputation to be inserted into an orderbook. Peers with a bad latency connection need to compensate for this and build up a higher reputation before they can start trading. Note: current code avoids full Tor-inspired relay complexity; it is just a proxy.
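One way the combined latency/reputation policy could look, as a sketch only. The linear penalty, the thresholds and the names `required_reputation` and `accept_order` are all hypothetical; the thread does not specify a formula.

```python
def required_reputation(latency_ms, base_reputation=10.0,
                        free_latency_ms=20.0, penalty_per_ms=0.5):
    """Reputation a peer needs before its orders are accepted.

    All constants are illustrative assumptions. Latency up to
    `free_latency_ms` costs nothing extra; every millisecond above it
    raises the reputation requirement linearly.
    """
    excess = max(0.0, latency_ms - free_latency_ms)
    return base_reputation + penalty_per_ms * excess


def accept_order(latency_ms, reputation):
    """Insert an order only if reputation compensates for the latency."""
    return reputation >= required_reputation(latency_ms)
```

With these example constants a peer at 10 ms needs a reputation of 10, while a peer at 100 ms needs 50 before it can trade.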

ToDo: incrementally improve current code. Get 1-hop proxy operational. Add low-latency bias.

Current fee in Bitcoin does not enable microtransactions for bid/asks. It is $4 per KByte for 97.2% of blocks.

Thus the best approach is to align all the incentives: positive reinforcement within the ecosystem, where traders with a good trade history get all the help they want. Traders without this history have an incentive to behave positively. How to solve the bootstrap problem of traders with zero reputation on their traderchain? For instance, you need to help others and relay orders to build up your reputation.

synctext commented 7 years ago

ToDo: incrementally improve current code.

{thoughts} We did a latency measurement 8 years ago (http://kayapo.tribler.org/trac/wiki/LatencyEstimationReport). It would be good for experimental results and incremental progress to have an operational and solid "latency community". A deployed system produces thesis results and is a good building block for proxy development. Measured latencies can then be used for proxies and limit orderbooks. Possible planning: first finish experimental latency results, then proxies/trading/security. Or make DDoS the primary focus: self-reinforcing trust.

synctext commented 7 years ago

Ongoing coding work on latency community, proxies, etc :

image

ghost commented 7 years ago

histogram histogram2

synctext commented 7 years ago

Professional trading needs to be low-latency, private and DDoS proof.

ghost commented 7 years ago

Clear target: build the lowest-latency overlay. In two months: experiments finished.

ghost commented 7 years ago

Experiment with 500 nodes crawls latencies

synctext commented 7 years ago

Nice progress! Next steps:

devos50 commented 7 years ago

@basvijzendoorn see https://cloud.githubusercontent.com/assets/3785124/22742749/33eb4aca-ee18-11e6-9898-675a23a287cf.png

ghost commented 7 years ago

@basvijzendoorn https://github.com/Tribler/dispersy/pull/526

synctext commented 7 years ago

Thesis-level Gumby experiment:

ghost commented 7 years ago

Upon introduction request: Predicting what the latency would be for the requester.

synctext commented 7 years ago

prime example of low-latency network, Bitcoin enhancement: http://bitcoinfibre.org/stats.html

ghost commented 7 years ago

Thesis: https://www.sharelatex.com/project/592c19a601647e1979114c42

Dispersy community: https://github.com/basvijzendoorn/dispersy/tree/latency_overlay_backward

Delft_University_of_Technology_Thesis_and_Report(1).pdf

synctext commented 7 years ago

Current status: created a Dispersy latency community, which has now been moved into Dispersy itself. This implementation runs on DAS5, can measure node-to-node ping times, gossips these results using a dispersy-request-latencies message, and builds a hard-coded lowest-latency peer discovery mechanism (thus killing exploration and randomness).

Using these collected ping times, various existing network distance algorithms, such as GNP, can be applied. Key challenge upon introduction request: predicting what the latency would be for the requester. Thus we only need to calculate the latency for a single node every few seconds. Scientific challenge: the algorithms are slow. A matrix of 50 x 50 nodes with X, Y coordinates, assuming symmetric connections, takes 3 seconds with merely 1 iteration.

Instead of re-calculating the whole world state every 5 seconds we can:

Golden experiments:

ghost commented 7 years ago

Idea: Do real ICMP request to measure ping times without NAT puncturing.

ghost commented 7 years ago

Master Thesis link: https://www.sharelatex.com/project/592c19a601647e1979114c42

Current status: Read the background literature more carefully and gained a better understanding of the peer discovery mechanism. Read and experimented with the peer discovery code. Started writing about the algorithms from the literature in the master thesis.

Centralized algorithms: Vivaldi, GNP

Decentralized algorithms: NPS, PIC

Triangle inequality, AS correction, geolocation: Htrae (2009)

Triangle inequality violation: TIV detection

Dynamic clustering: Tarantula (2011), Toread (2010)

Latency measurements in P2P systems: Latency in P2P; Survey on application layer traffic optimization (ALTO) problem; Applying GNP in P2P systems

Thought about an incremental algorithm that recalculates the coordinates of a new peer plus its neighbors upon introduction. In normal conditions these are around 10 coordinates. With a fast walker around 30 coordinates are recalculated. A maximum number of coordinates for recalculation can be set. The coordinates set their new position based on the latencies of their neighbors. Thus when a new peer is introduced, its measured latencies plus all the measured latencies of its neighbors should be sent with the message. Peer introduction happens on: on_introduction_request, on_introduction_response, on_puncture.
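A sketch of the incremental idea under simplifying assumptions: only the new peer's 2-D coordinate is fitted, the neighbors' coordinates stay fixed, and plain gradient descent on the squared error is used. The function `fit_coordinate` and its parameters are hypothetical and not taken from the thesis code.

```python
import math


def fit_coordinate(neighbor_coords, latencies, start=(1.0, 1.0),
                   steps=500, rate=0.05):
    """Place one peer so that distances approximate measured latencies.

    neighbor_coords: list of (x, y) tuples that are kept fixed.
    latencies: measured latency to each neighbor, in the same order.
    Minimises sum((distance - latency)^2) with plain gradient descent.
    """
    x, y = start
    count = float(len(latencies))
    for _ in range(steps):
        grad_x = grad_y = 0.0
        for (nx, ny), measured in zip(neighbor_coords, latencies):
            dx, dy = x - nx, y - ny
            dist = math.hypot(dx, dy) or 1e-9  # avoid division by zero
            err = dist - measured
            grad_x += 2.0 * err * dx / dist
            grad_y += 2.0 * err * dy / dist
        # Average the gradient so the step size does not depend on
        # how many neighbors were reported.
        x -= rate * grad_x / count
        y -= rate * grad_y / count
    return x, y
```

Because only one coordinate is touched per introduction, the cost grows with the number of reported neighbors rather than with the full latency matrix.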

Idea on deleting "old" latencies: delete "old" measured latencies after 10 walker steps have been made. With a fast walker, latencies are deleted after 30 walker steps. This way the system becomes responsive to changing latencies and to nodes leaving the system.
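The expiry rule could be sketched as follows. The class `LatencyStore` and its method names are illustrative assumptions, not part of Dispersy.

```python
class LatencyStore(object):
    """Keeps measured latencies and drops them after a number of walker steps."""

    def __init__(self, max_age_steps=10):
        # 10 for the normal walker; a fast walker could pass 30.
        self.max_age_steps = max_age_steps
        self.step = 0
        self._entries = {}  # peer -> (latency, step at which it was recorded)

    def record(self, peer, latency):
        self._entries[peer] = (latency, self.step)

    def on_walker_step(self):
        """Advance the step counter and delete entries that are too old."""
        self.step += 1
        cutoff = self.step - self.max_age_steps
        self._entries = {peer: (latency, recorded)
                         for peer, (latency, recorded) in self._entries.items()
                         if recorded > cutoff}

    def get(self, peer):
        entry = self._entries.get(peer)
        return entry[0] if entry else None
```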

Idea on latency measurements: Do multiple latency measurements and average them to get a better latency measurement and to prevent outliers. Latency can vary due to temporary calculations that block the system on a node. If some measured latencies appear to be outliers, they can be deleted. Use median of multiple (for instance 5) measurements.
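A small sketch of the median idea using Python's standard `statistics.median`; the helper name `robust_latency` is an assumption.

```python
from statistics import median


def robust_latency(samples):
    """Return the median of several ping measurements (e.g. 5 samples).

    The median ignores a single slow outlier, such as a ping that was
    delayed because the remote node was temporarily blocked.
    """
    if not samples:
        raise ValueError("need at least one latency sample")
    return median(samples)
```

For example, for the samples 0.020, 0.021, 0.019, 0.500 and 0.020 seconds the 0.500 outlier does not affect the result.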

Idea on metrics: Use ranking metric as described in the GNP literature. Also use relative error as new error function.
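The two metrics could be computed roughly as below. The exact definitions used in the thesis are not given in this thread, so both functions are assumptions: relative error is taken here as the absolute error divided by the measured latency, and ranking accuracy as the overlap between the predicted and the measured top-k closest peers.

```python
def relative_error(predicted, measured):
    """Absolute prediction error as a fraction of the measured latency."""
    return abs(predicted - measured) / float(measured)


def ranking_accuracy(predicted, measured, k=10):
    """Fraction of the truly closest k peers that are also predicted closest.

    predicted, measured: dicts mapping peer id -> latency.
    """
    top_predicted = set(sorted(predicted, key=predicted.get)[:k])
    top_measured = set(sorted(measured, key=measured.get)[:k])
    if not top_measured:
        return 0.0
    return len(top_predicted & top_measured) / float(len(top_measured))
```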

Project planning: first build the incremental algorithm. Optimize it and compare it to the decentralized algorithm NPS using the error and ranking metrics. While doing so, document the project, i.e. explain the background literature, peer discovery mechanism, new incremental algorithm and experiment setup.

synctext commented 7 years ago

System model:

Status: thesis has first experiments. Ready for experiments with incremental updates and runtime measurements. X-axis: number of known latency pairs; Y-axis: runtime in ms of a network coordinate update. Possibly different curves for different accuracy settings.

ghost commented 7 years ago

Status: Have a working incremental model. Next steps: Experiments and tweak current model. Metrics:

  1. Ranking and Error summation metric for accuracy
  2. Algorithm runtime in ms
  3. Effects on decentral market

Latency sharing makes it possible to report false latencies or to delay messages. Possible solutions give some, but not full, protection.

Writing on the report.

synctext commented 7 years ago

Dataset: Cornell-King 2500 x 2500 node Latency https://www.cs.cornell.edu/people/egs/meridian/data.php

Current thesis status: chapter focus fixed.

Next step: solid experiment, focus on the core, explore trade-off accuracy and computational time, write 1-3 pages, already polished thesis style.

ghost commented 7 years ago

Current status:

Dataset: King, 1740 x 1740 node latencies https://pdos.csail.mit.edu/archive/p2psim/kingdata/

Thesis status: described and developed the computational time metric and the ranking and relative error accuracy metrics. Experiment graphs added. Delft_University_of_Technology_Thesis_and_Report(2).pdf

Experiments one, two and three have been run.

Proposed next steps: Add more settings, Experiment four, Experiments with decentralized market.

synctext commented 7 years ago
ghost commented 7 years ago

Delft_University_of_Technology_Thesis_and_Report.pdf

Status: Experiments 3 and 4 done. Code is clean and ready to deploy.

Proposal: Continue with writing. Experiment 3 and 4 measurements start after some time instead of from the beginning.

synctext commented 7 years ago

ToDo: First problem description chapter. With privacy and trading plus related work, state-of-the-art, and incremental algorithms.

ghost commented 7 years ago

Current status: Documented incremental algorithms and eclipse attack. Documented experiment 1+2. Low latency overlay resilience against eclipse attack.

Next steps:

  1. Do all experiments:

    • third and fourth experiment with all algorithms.
    • cost of entering experiment.
    • latency difference in privacy of market assignment measured with statistical difference.
  2. Explanation of algorithms.

optional-subtitle.pdf

synctext commented 7 years ago

Comments:

ghost commented 6 years ago

optional-subtitle(3).pdf

synctext commented 6 years ago

Quick comments:

devos50 commented 6 years ago

I think the title of this issue is outdated (the focus of this thesis has changed over time)?

synctext commented 6 years ago

Thesis progress:

ghost commented 6 years ago

https://www.sharelatex.com/project/5a3bc4af38c4a5721edbf694

synctext commented 6 years ago
ghost commented 6 years ago

thesis.pdf

synctext commented 6 years ago
ghost commented 6 years ago

Accuracy of top 10 latency peers. A new entering peer is dotted and a peer present from the beginning of the experiment is solid. A new entering peer has an advantage because the quality of its introductions is higher. [image: ranking_normal2]

Quality of introductions of a new entering peer. A new entering peer reaches a higher quality of introductions faster. [image: quality_of_intro]

Quality of introductions during the experiment. [image: quality_of_intro_experiment]

ghost commented 6 years ago

final_thesis.pdf

synctext commented 6 years ago
ghost commented 6 years ago

thesis.pdf

synctext commented 6 years ago

please fix: "In the default setting in the low latency overlay latency information is obtained every second with the ping-pong mechanism from every peer in the neighbourhood."

"the other 50% of node selections a peer with a low latency toward the selecting peer is chosen."

Currently implemented:

Proof of running code experiment:

ghost commented 6 years ago

thesis.pdf

synctext commented 6 years ago

Thanks for the thesis update! Getting a 100% working system, due to good predictive dataset? {Contacted 3rd committee member for master defense}

synctext commented 6 years ago

Completed: final master thesis report