Closed: synctext closed this issue 6 years ago.
ToDo: system architecture figure + initial system design that links the Tunnel community, the orderbook in Python, relaying, spam-prevention, etc.
Please document DDoS problem (100Mbps for $5/month).
Problem of DDoS with Tor-based orderbook relays. Possible idea: start with zero help, directly spread your bid/ask, do trades, build trust; others will then start to help you.
Prototype: back to the 2013 code; proxies in the network route your traffic. No Chaumian mixing or onion crypto, so it is trivial to match traffic by sniffing.
Related work: Bitsquare (https://bitsquare.io):
They seem to use Tor together with mainnet.
The current idea to prevent bid/ask spam is to use either a cybercurrency or TrustChain (a reputation-based solution). Another option is to use this in combination with network latency, as documented in #2541.
Build a fresh new community within Dispersy which builds a low-latency overlay with network neighbors. With each peer you see within this community you do a ping/pong handshake to determine the network latency. A random walk across the network does not converge quickly; you only stumble upon close, low-latency peers at random. A small bias will dramatically boost the speed at which you can find 10 close peers in a group of 10 million peers. For instance, with a 50% coin toss you introduce either a random peer or one of your closest top-10 peers. Due to the triangulation effect this boosts convergence (see the sketch below).
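A minimal sketch of that coin-toss bias; the function name and the shape of the peer lists are assumptions, not the actual Dispersy walker API:

```python
import random

def select_introduction_candidate(random_peers, closest_peers, bias=0.5):
    """With probability `bias`, introduce one of our top-10 lowest-latency
    peers; otherwise introduce a uniformly random peer. The random branch
    preserves exploration, the biased branch exploits triangulation."""
    if closest_peers and random.random() < bias:
        return random.choice(closest_peers[:10])
    return random.choice(random_peers)
```

Setting `bias=0.8` would give the 80/20 variant asked about later in this thread.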
The next step is to build low-latency proxies. These tunnels are now fast and restricted to a certain region, which addresses our problem: spam is now also restricted to a certain region. The final policy to prevent spam is to combine latency with tradechain reputation: you need both low latency and sufficient reputation to be inserted into an orderbook. Peers with a bad-latency connection need to compensate and build up a higher reputation before they can start trading (see the admission sketch below). Note: the current code avoids full Tor-inspired relay complexity; it is just a proxy.
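A minimal sketch of such a combined admission policy; the thresholds and the latency/reputation trade-off are illustrative assumptions:

```python
# Assumed thresholds, for illustration only.
MAX_LATENCY_MS = 50.0
MIN_REPUTATION = 10.0

def may_insert_order(latency_ms, reputation):
    """Both low latency and sufficient reputation are required; a peer
    with a worse latency must compensate with a higher reputation."""
    required = MIN_REPUTATION * max(1.0, latency_ms / MAX_LATENCY_MS)
    return reputation >= required
```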
ToDo: incrementally improve the current code. Get a 1-hop proxy operational. Add the low-latency bias.
The current Bitcoin fee does not enable microtransactions for bids/asks. It is $4 per KByte for 97.2% of blocks:
Thus the best approach is to align all the incentives: positive reinforcement within the ecosystem, where traders with a good trade history get all the help they want and traders without this history have an incentive to behave positively. How to solve the bootstrap problem of traders with zero reputation on their traderchain? For instance, you need to help others and relay orders to build up your reputation.
ToDo: incrementally improve the current code.
{thoughts} We did a latency measurement 8 years ago: http://kayapo.tribler.org/trac/wiki/LatencyEstimationReport It would be good for experimental results and incremental progress to have an operational and solid "latency community". A deployed system produces thesis results and a good building block for proxy development; the measured latencies can then be used for proxies and limit orderbooks. Possible planning: first finish the experimental latency results, then proxies/trading/security. Or a primary DDoS focus; self-reinforcing trust.
Ongoing coding work on the latency community, proxies, etc.:
Professional trading needs to be low-latency, private, and DDoS-proof.
The nanosecond range seems to be the state of the art for trading.
DDoS really matters in the wild. It all happened in just 45 milliseconds on the GDAX market:
> The crash at 3:30 p.m. New York time on June 21 drove the currency down to 10 cents from $317.81. The cause, White said, was a single $12.5 million trade -- one of the biggest ever -- placed by a customer as a market order, or a request to sell immediately. That pushed ethereum to $224.48, but the pain didn’t end there. The decline triggered sell orders from traders who’d requested to bail on the currency if prices fell to certain levels, and prompted GDAX to liquidate some margin trades.
ToDo: incremental progress. Deploy the latency community with 1 extra message: get_recent_seen_latencies(), which shows the most recent end-to-end response times at the Dispersy community level, with the last 8(?) digits of each IPv4 address obfuscated. Only use a single UDP packet for this gossip reply (see the sketch below). Next: crawl latencies within a Gumby experiment.
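A hedged sketch of such a reply; the message name comes from this ToDo, but the record format, the single-packet budget, and masking the low bits of each address (rather than literal decimal digits) are assumptions:

```python
import socket
import struct

UDP_PAYLOAD_BUDGET = 1200  # stay well inside one UDP packet

def obfuscate_ip(ip, masked_bits=8):
    """Zero out the low `masked_bits` bits of an IPv4 address."""
    packed, = struct.unpack("!I", socket.inet_aton(ip))
    packed &= ~((1 << masked_bits) - 1)
    return socket.inet_ntoa(struct.pack("!I", packed))

def encode_recent_latencies(samples):
    """Encode (ip, latency_ms) pairs as 6-byte records until the budget
    for a single UDP packet is exhausted."""
    payload = b""
    for ip, latency_ms in samples:
        record = socket.inet_aton(obfuscate_ip(ip)) + \
                 struct.pack("!H", min(int(latency_ms), 0xFFFF))
        if len(payload) + len(record) > UDP_PAYLOAD_BUDGET:
            break
        payload += record
    return payload
```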
Clear target: build the lowest-latency overlay. Two months: experiments finished.
Experiment with 500 nodes
Nice progress! Next steps:
@basvijzendoorn https://github.com/Tribler/dispersy/pull/526
Thesis-level Gumby experiment:
introduction-response
Upon an introduction request: predict what the latency would be for the requester.
A prime example of a low-latency network, a Bitcoin enhancement: http://bitcoinfibre.org/stats.html
Current status: created a Dispersy latency community, now moved into Dispersy itself. This implementation runs on the DAS5, can measure node-to-node ping times, gossips these results using a dispersy-request-latencies message, and builds a hard-coded lowest-latency peer discovery mechanism (thus killing exploration and randomness).
Using the collected ping times, various existing network-distance algorithms can be applied, such as GNP. Key challenge: upon an introduction request, predicting what the latency would be for the requester. Thus we only need to calculate the latency for a single node every few seconds (a sketch below). Scientific challenge: the algorithms are slow. A matrix of 50 x 50 nodes with X, Y coordinates, assuming symmetric connections, takes 3 seconds with merely 1 iteration.
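A minimal sketch of such a single-node update, assuming 2D GNP-style coordinates and plain gradient descent on the squared prediction error; the function name, step size, and iteration count are my own, not from the thesis code:

```python
import math
import random

def fit_single_node(measured, coords, steps=100, lr=0.05):
    """Fit only the new node's (x, y) against peers with known coordinates,
    instead of re-fitting the full 50 x 50 matrix.
    measured: {peer_id: latency_ms}, coords: {peer_id: (x, y)}."""
    x, y = random.random(), random.random()
    for _ in range(steps):
        gx = gy = 0.0
        for peer, rtt in measured.items():
            px, py = coords[peer]
            dist = math.hypot(x - px, y - py) or 1e-9
            err = dist - rtt              # positive: estimate is too far away
            gx += err * (x - px) / dist   # gradient of the squared error
            gy += err * (y - py) / dist
        x -= lr * gx
        y -= lr * gy
    return x, y
```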
Instead of re-calculating the whole world state every 5 seconds, we can recalculate only the coordinates affected by a newly introduced peer (the incremental algorithm worked out below):
Golden experiments:
Idea: do real ICMP requests to measure ping times without NAT puncturing.
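A minimal sketch, assuming a Linux-style `ping` binary and shelling out to it instead of opening raw ICMP sockets (which would need root); the output parsing is platform-specific:

```python
import re
import subprocess

def icmp_ping_ms(host, count=5):
    """Return the round-trip times in ms reported by the system `ping`."""
    out = subprocess.run(["ping", "-c", str(count), host],
                         capture_output=True, text=True).stdout
    return [float(m) for m in re.findall(r"time=([\d.]+) ms", out)]
```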
Master thesis link: https://www.sharelatex.com/project/592c19a601647e1979114c42 Current status: read the background literature more carefully and understand the peer discovery mechanism better. Read and experimented with the peer discovery code. Started writing about the algorithms from the literature in the master thesis.
- Centralized algorithms: Vivaldi, GNP
- Decentralized algorithms: NPS, PIC
- Triangle inequality, AS correction, geolocation: Htrea (2009)
- Triangle inequality violation: TIV detection
- Dynamic clustering: Tarantula (2011), Toread (2010)
- Latency measurements in P2P systems: Latency in P2P; Survey on the application-layer traffic optimization (ALTO) problem; Applying GNP in P2P systems
Thought about an incremental algorithm that recalculates the coordinates of a new peer plus its neighbors upon introduction. Under normal conditions these are around 10 coordinates; with a fast walker around 30 coordinates are recalculated. A maximum number of coordinates for recalculation can be set. The coordinates take their new position based on the latencies to their neighbors. Thus when a new peer is introduced, its measured latencies plus all the latencies measured by its neighbors should be sent with the message. Peer introduction happens on (see the sketch after this list):
on_introduction_request
on_introduction_response
on_puncture
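A hedged sketch of how that recalculation could hang off the three callbacks above; the function name, the neighbor cap, and the data shapes are assumptions (any single-node fit, such as the `fit_single_node` sketch earlier, would slot in as `update_fn`):

```python
MAX_RECALC = 30  # cap on coordinates recalculated per introduction

def on_peer_introduced(new_peer, neighbours, latencies, coords, update_fn):
    """Recalculate the new peer's coordinate plus those of its neighbours:
    roughly 10 under a normal walker, up to MAX_RECALC with a fast walker.
    latencies: {peer_id: {other_peer: rtt_ms}}, coords: {peer_id: (x, y)}."""
    for peer in [new_peer] + neighbours[:MAX_RECALC - 1]:
        coords[peer] = update_fn(latencies.get(peer, {}), coords)
```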
Idea on deleting "old" latencies: delete "old" measured latencies after 10 walker steps; with a fast walker, delete them after 30 walker steps. This way the system stays responsive to changing latencies and to nodes leaving the system.
Idea on latency measurements: do multiple latency measurements and aggregate them to get a better estimate and to suppress outliers. Latency can vary due to temporary computations that block a node. Measurements that appear to be outliers can be deleted; use the median of multiple (for instance 5) measurements, as sketched below.
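A tiny sketch of the median idea (pure standard library); the median is robust against a single slow outlier caused by a temporarily blocked node:

```python
import statistics

def robust_latency(samples):
    """samples: list of round-trip times in ms (e.g. 5 pings)."""
    return statistics.median(samples)

# robust_latency([12.1, 11.9, 250.0, 12.3, 12.0]) -> 12.1
```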
Idea on metrics: use the ranking metric as described in the GNP literature. Also use the relative error as a new error function. A sketch of both:
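A sketch of both metrics as commonly defined in the GNP literature; the exact definitions used in the thesis may differ:

```python
def relative_error(predicted, measured):
    """GNP-style relative error of one predicted latency."""
    return abs(predicted - measured) / min(predicted, measured)

def ranking_accuracy(predicted, measured, k=10):
    """Fraction of the true k nearest peers that also appear in the
    predicted top-k. predicted/measured: {peer: latency_ms}."""
    top = lambda d: set(sorted(d, key=d.get)[:k])
    return len(top(predicted) & top(measured)) / k
```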
Project planning: first build the incremental algorithm. Optimize it and compare it to the decentralized algorithm NPS with the error and ranking metrics. While doing so, document the project, e.g. explain the background literature, the peer discovery mechanism, the new incremental algorithm, and the experiment setup.
System model:
Status: the thesis has its first experiments. Ready for experiments with incremental updates and runtime measurements. X-axis: number of known latency pairs; Y-axis: runtime in ms of a network coordinate update. Possibly different curves for accuracy settings.
Status: have a working incremental model. Next steps: experiments and tweaking the current model. Metrics:
Latency sharing opens the possibility of reporting false latencies and of message delaying. Possible solutions give some protection, but not full protection.
Writing the report.
Dataset: Cornell-King 2500 x 2500 node Latency https://www.cs.cornell.edu/people/egs/meridian/data.php
Current thesis status: chapter focus fixed.
Next step: a solid experiment, focused on the core; explore the trade-off between accuracy and computational time; write 1-3 pages in already-polished thesis style.
Current status: Dataset: King 1740 x 1740 latency nodes https://pdos.csail.mit.edu/archive/p2psim/kingdata/ Thesis status: described and developed the computational-time metric and the ranking and relative-error accuracy metrics; experiment graphs added. Delft_University_of_Technology_Thesis_and_Report(2).pdf
Experiments one, two, and three have been run.
Proposed next steps: add more settings, run experiment four, and run experiments with the decentralized market.
Delft_University_of_Technology_Thesis_and_Report.pdf
Status: experiments three and four done. Code cleaned and ready to deploy.
Proposal: continue with writing. Have the experiment 3 and 4 measurements start after some time instead of from the beginning.
Split in parts. How to present?
Implemented the low-latency community! Uses Gumby, directed introductions, and the coin-flip decentralized algorithm on the DAS5 with a real Tribler community.
Goal-driven experimental section: the goal of our experiments is to compare various network coordinate algorithms for creating a low-latency overlay.
Key experiment: the cost of joining a converged low-latency overlay as a new peer.
When you come online: the cost in bytes to find your lowest-latency neighbors.
One iteration per second equals 3600 latency measurements per peer per hour. Fast discovery of the lowest-latency neighbors. When to throw measurements away?
Keep a keep-alive to the lowest-latency peer ever found (not yet used), or drop the highest-latency peer (a sketch below).
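A minimal sketch of that eviction policy; the table size and data shapes are assumptions:

```python
MAX_NEIGHBOURS = 10  # assumed neighbour-table size

def admit(neighbours, peer, latency_ms):
    """neighbours: {peer: latency_ms}. Add the new peer; when the table
    overflows, drop the highest-latency entry while pinning the
    lowest-latency peer found so far."""
    neighbours[peer] = latency_ms
    if len(neighbours) > MAX_NEIGHBOURS:
        pinned = min(neighbours, key=neighbours.get)  # best peer: keep alive
        worst = max(neighbours, key=neighbours.get)
        if worst != pinned:
            del neighbours[worst]
    return neighbours
```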
3 or 4 algorithms with a name. Simple, Naive, Guido97, MicrosoftPaper2012, 10-steps, ?
Figures must be readable in greyscale; no multi-colored dots.
X-axis?
Science: what is the optimal amount of randomness, 50/50 or 80/20?
"This lack of knowledge results in sub-optimal solutions in the above example." no thriller writing style, put the relevance in first sentence. like: another example of an algorithm with only sub-optimal solution because future events are unknown is the k-server problem. News events are unpredictable and journalists are sent to jobs, without knowing the next event. This results in inefficiencies". More fitting for intro chapters probably.
"Broadcast of bid or ask match request towards other peers." how to sell the code implemented around trade privacy?
ToDo: first a problem-description chapter, with privacy and trading, plus related work, the state of the art, and incremental algorithms.
Current status: documented the incremental algorithms and the eclipse attack. Documented experiments 1+2. Low-latency overlay resilience against the eclipse attack.
Next steps:
Do all experiments:
Explanation of algorithms.
Comments:
In general: lots of pieces of text. Making progress. Please focus on creating a single storyline; start at the start.
Intro of the problem in either Chapter 1 or 2. Illustration:
Chapter 3: "In order to solve the complexity problems of the GNP algorithm in the decentralized Tribler setting we introduce an incremental algorithm approach to stretch the computation of the solution over time." is too complex for an opening sentence. More like: privacy and latency matter for online markets; we now focus on incremental algorithms to predict the latency to a given Internet location; this is the cardinal primitive for building a low-latency overlay, as we shall demonstrate in our experimental section.
k-server is an explanatory example, but does it need its own dedicated Figure 3.1?
Is the chapter focus incremental latency prediction?
Where to discuss the state of the art and GNP? The 2004 MIT Vivaldi system:
Mention methods, but there is no need to include other people's formulas (unless you created a superior one).
Reposition Chapter 4 (basic algorithm)?
Perhaps move Tribler peer discovery to after all the latency material.
(Last on the ToDo list) Explain how you made your own latency algorithm, including the 4 variants.
Quick comments:
I think the title of this issue is outdated (the focus of this thesis has changed over time)?
Thesis progress:
Accuracy of the top-10 latency peers: a newly entering peer is dotted, a peer present from the beginning of the experiment is solid. A newly entering peer has an advantage because the quality of its introductions is higher. Quality of introductions of a newly entering peer: a newly entering peer gains high-quality introductions faster. Quality of introductions during the experiment.
please fix: "In the default setting in the low latency overlay latency information is obtained every second with the ping-pong mechanism from every peer in the neighbourhood."
"the other 50% of node selections a peer with a low latency toward the selecting peer is chosen."
Currently implemented: the `ping` and `pong` exchange includes the latencies of 10 peers. Each peer keeps a local ping history list; this linear list is pushed serially to all other peers, keeping a counter per peer of what has already been sent (difficult to read from the thesis text; a sketch below). Proof-of-running-code experiment: taking a step.
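A hedged sketch of that push mechanism; the class and field names are assumptions, not the deployed code:

```python
class PingHistory:
    def __init__(self):
        self.history = []        # linear list of (peer_id, latency_ms)
        self.sent_upto = {}      # peer_id -> index already pushed to that peer

    def record(self, peer_id, latency_ms):
        self.history.append((peer_id, latency_ms))

    def next_batch(self, peer_id, size=10):
        """Next slice of the history for `peer_id`, e.g. piggybacked on a
        pong message carrying the latencies of 10 peers."""
        start = self.sent_upto.get(peer_id, 0)
        batch = self.history[start:start + size]
        self.sent_upto[peer_id] = start + len(batch)
        return batch
```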
Thnx for the thesis update! Getting a 100% working system, due to good predictive dataset? {Contacted 3rd committee member for master defense}
Completed: final master thesis report
Financial markets offer significant privacy to trading firms: leakage of market positions and trade history hands competitors an advantage. So traders will only operate on decentralized markets if their privacy is protected. Regulators obviously have more access.
Builds upon: #2559