Closed synctext closed 6 years ago
@synctext Simulated network on the local machine, without the port limit.
It is almost done. I give the active walker a fake network endpoint, so from the active walker's point of view it is connected to a real network (in other words, I deceive the walker into believing it is walking in a real network). So I don't need to change the logic of the walker.
For the simulated network, we generate only a list of node ids, and store each node id, private key, and fake IP and port in a database.
Every time the real walker (i.e. the active walker) wants to take a step, it sends the message to a fake address. The simulated network translates the address to a node id and then generates the links and multichain blocks of this node using a deterministic random seed generated beforehand. (In other words, links are generated "on the fly": no node instance or link instance is stored in memory. I store the node-id/(ip, port) lookup table in the database to save memory, but I can also move it into memory.)
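A minimal sketch of the "on the fly" idea described above, not the actual simulator code: the links of a node are regenerated from a fixed seed every time they are needed, so only the address lookup table has to be stored. All names and values here (`NETWORK_SEED`, `NUM_NODES`, `LINKS_PER_NODE`, the port, `fake_address`, `links_of`, `handle_introduction_request`) are assumptions made for illustration.

```python
import hashlib
import random

NETWORK_SEED = 42     # hypothetical seed, fixed before the experiment starts
NUM_NODES = 1000      # hypothetical network size
LINKS_PER_NODE = 5    # hypothetical number of links per node


def fake_address(node_id):
    """Map a node id to a made-up (ip, port) pair; nothing listens there."""
    ip = "10.%d.%d.%d" % ((node_id >> 16) & 255, (node_id >> 8) & 255, node_id & 255)
    return (ip, 6421)  # the port number is an arbitrary assumption


# The only state that is kept: fake address -> node id.
ADDRESS_TO_ID = {fake_address(n): n for n in range(NUM_NODES)}


def links_of(node_id):
    """Regenerate the links of a node from the seed; no link is stored."""
    digest = hashlib.sha256(("%d:%d" % (NETWORK_SEED, node_id)).encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    links = []
    while len(links) < LINKS_PER_NODE:
        candidate = rng.randrange(NUM_NODES)
        if candidate != node_id and candidate not in links:
            links.append(candidate)
    return links


def handle_introduction_request(destination):
    """Translate the fake address to a node id and answer with its neighbours."""
    node_id = ADDRESS_TO_ID[destination]
    return [fake_address(n) for n in links_of(node_id)]
```

Because the per-node generator is seeded from the global seed and the node id, asking the same node twice yields the same links, which is what lets the simulator drop node and link instances from memory.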
I am running the experiment and will upload some figures later.
Since today is 3 July and the next meeting is on 18 July (still 15 days to go...), can we have a 5-10 minute drop-in meeting before that? (If you like, we can meet after 5:30 PM on any day.)
Number of honest/evil nodes the walker meets:
Just as expected: 40% of the nodes in the network are honest and 60% are evil, hence in the long run the nodes met by the walker consist of 40% honest nodes and 60% evil nodes.
@synctext 1. The database is removed; the node id and address lookup table is in memory now. 2. I can now specify how many attack edges I want.
solid progress! Cool experiment: how much contact with evil nodes for certain walk parameters and attack strength.
X-axis: number of attack edges, from just a few to twice as many attack edges as honest nodes...
Y-axis: percentage of good vs. evil nodes discovered, 0% evil to 100% evil.
Several line colors for the random walk: black 40% reset back to home (alpha), blue 30%, green 20%, red 10%. So red gets random walks across trust edges of length 10, far into the evil sybil region.
Interesting and easy to program?
each dot in graph is an experiment. Say several dozen points to see trends?
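The "length 10" figure for a 10% reset can be checked with a small sketch. It assumes the reset decision is made independently at every step, so the number of steps until a reset follows a geometric distribution with mean 1/alpha. The function names are illustrative and not taken from the walker code.

```python
import random


def expected_walk_length(reset_probability):
    """Mean number of steps until a reset, for an independent per-step reset."""
    return 1.0 / reset_probability


def simulate_walk_length(reset_probability, rng):
    """Count steps until the walker resets; the step with the reset is included."""
    steps = 1
    while rng.random() >= reset_probability:
        steps += 1
    return steps


if __name__ == "__main__":
    rng = random.Random(0)
    for alpha in (0.4, 0.3, 0.2, 0.1):
        samples = [simulate_walk_length(alpha, rng) for _ in range(20000)]
        print(alpha, expected_walk_length(alpha), sum(samples) / len(samples))
```

Under this assumption the four proposed lines correspond to mean walk lengths of 2.5, about 3.3, 5 and 10 steps.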
@synctext OK, I am running the experiment and will contact you once I have the results and graphs.
@synctext
I tried reset probabilities of 10%, 20%, 30% and 40%. Only 40% shows a significant effect in preventing visits to evil neighbors (and only in scenarios without too many attack edges).
So I also tried a reset probability of 100% (which is the current policy of the dispersy walker). And... yes, a high reset probability can prevent visiting evil neighbors (at the cost of lower neighbor discovery efficiency: with the same number of steps, a walker with a high reset probability discovers fewer blocks).
The above experiment always has 400k honest nodes and 600k evil nodes for each run. Fascinating, you don't discover all honest nodes anymore if there is such an overwhelming amount of attack edges.
Next step: thesis chapter Problem Description + intro.
Problem Description
sidenote. Majority attacks are real for network maintenance and protocol upgrades.
@synctext Since there is over a month before our next meeting, I think I can write more than the Problem Description and Introduction. The core of the thesis is protection against evil behavior using transitive trust, so we can tell the story like this:
A brief introduction to the current dispersy walker protocol (already finished a few months ago at your request).
Analyze the weaknesses of the current protocol.
Design some specific attack scenarios that exploit those weaknesses, for example DDoS using introduction-response messages, poisoning (introducing toxic neighbors to you), etc. Since I have finished the virtual network, I can test those attacks and draw some cool graphs (e.g. a load balancing graph in the DDoS scenario). In other words, this thesis should target specific attack scenarios. If the problem is too broad, the thesis won't be doable.
All of those attacks can be mitigated by preventing the walker from walking into the evil region. A random walker with a probability of teleporting home is a good way to do this. Every time the walker teleports home, it randomly picks one of its "trusted neighbors" to visit in the next step (i.e. it only trusts neighbors introduced by its "trusted neighbors"; that is what we call "transitive trust").
As the results of the experiment I just finished show, the teleport home algorithm can reduce the number of evil nodes the walker visits. So, it works, story ends...
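The teleport home idea in the list above can be sketched on a toy network. This is an illustration under stated assumptions, not the thesis implementation: the graph builder, the fixed degree, the way attack edges are drawn, and all function names are hypothetical.

```python
import random


def build_toy_network(num_honest, num_evil, num_attack_edges, degree, rng):
    """Return {node: [neighbours]}; ids below num_honest are honest, the rest evil."""
    honest = list(range(num_honest))
    evil = list(range(num_honest, num_honest + num_evil))
    graph = {}
    for region in (honest, evil):
        for node in region:
            others = [n for n in region if n != node]
            graph[node] = rng.sample(others, degree)
    # Attack edge: an honest node ends up introducing an evil node.
    for _ in range(num_attack_edges):
        graph[rng.choice(honest)].append(rng.choice(evil))
    return graph


def teleport_home_walk(graph, trusted, reset_probability, steps, rng):
    """Walk for `steps` hops and return the list of visited node ids."""
    visited = []
    current = rng.choice(trusted)
    for _ in range(steps):
        visited.append(current)
        if rng.random() < reset_probability:
            # Teleport home and restart from a randomly chosen trusted neighbour.
            current = rng.choice(trusted)
        else:
            # Follow an introduction made by the node we are visiting.
            current = rng.choice(graph[current])
    return visited


def evil_fraction(visited, num_honest):
    """Share of visits that landed on an evil node."""
    return sum(1 for n in visited if n >= num_honest) / len(visited)
```

With a reset probability of 1.0 the walker only ever visits its trusted neighbours, which matches the observation that the current dispersy policy avoids the evil region at the cost of discovery efficiency. Lower values trade that safety for reach.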
Does this story make sense?
By the way, I wish you a good vacation.
makes sense, good stuff!
@synctext Here is the first draft. It is not yet finished, but it already sketches the story: thesis-report.pdf
Comments:
"Peer to Peer System is one of the architecture of Distributed System, according to [5], it can be further categorized as ”pure” Peer to Peer or ”hybrid” Peer to Peer." The reader is now confused by the four different concepts you introduce.
"peer discovery", "hunting videos": too informal.
"Design of Transitive-Trust Walker": contains more engineering details of work others did than of your own work. Fewer byte positions, more what, why, how of your transitive trust walker design principles.
"4.7 Reputation System": start with transitive trust / PageRank / PimRank.
"steps required to exhaust 95% of peers": what is the scientific goal of this experiment?
@synctext that walker is actually already live in TrustChain on devel
(see https://github.com/Tribler/tribler/blob/devel/Tribler/community/trustchain/community.py#L289).
Side note: because I wanted the IPv8 mechanism generic and decoupled it is also much more complicated. In fact, I consider the edge based walking to be the most complex code in IPv8.
@synctext Ah, that is epic work. It is a fundamental change in walker strategy.
But the story of my thesis is adding improvements to the "current dispersy walker" (the original one, without live edges, which takes a fully random walk). Can I still follow my story line by treating the original walker as the current walker (it is now a "historical" walker rather than the "current walker" because of Quinten's work)? If I say the original walker no longer exists, it will undermine the whole story line of my thesis...
All my experiments are finished and use the original walker as the baseline, and the story line of the thesis report is also about improving the original walker, so can I still use my current code and follow my current story line? Such a big change (adding IPv8 and the new features to my clean-slate walker) would consume too much time, and I would need to redo all experiments, which is also time consuming. I am really in a hurry to graduate, I am running out of budget...
My work now mainly follows the work of Pim Veldhuisen, adding improvements that address its limitations.
My story line is now:
In the experiment, I keep the edge between my walker and each trusted peer (a peer that has blocks with us, or a peer directly trusted by our trusted peers) alive, preventing the NAT hole from closing. So, instead of letting trusted nodes time out, we keep them available for a longer time.
The teleport home walker also follows this strategy: visit a neighbor A, A introduces B to me, and my walker teleports home with some probability; otherwise it visits B, and so on. That is the same as in Pim Veldhuisen's simulation.
I also test another walker, which takes a random walk but gives trusted peers a higher probability of being visited. I test the two new walkers using the original walker (no live edges, fully random walk) as the baseline.
The improvement compared with Pim Veldhuisen's work is this: Pim Veldhuisen gives all peers an infinite life span. Hence a high-reputation peer always stays in his top 10 peer list, which causes a load balancing issue. And keeping a peer alive forever means we cannot clean sybils from our peer list using time-outs. So I give trusted peers a finite life span (but still 10 times longer than that of a normal peer). This way we keep trusted peers available for longer, can still clean sybils by time-out, and, with a finite life span, a high-reputation peer does not have a global impact over the whole duration of the experiment.
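The finite life span rule can be sketched as a small peer table. Only the 10x ratio between trusted and normal peers comes from the description above. The 60 second base value, the `PeerTable` class and its method names are assumptions made for illustration.

```python
NORMAL_LIFESPAN = 60.0                    # seconds; hypothetical base value
TRUSTED_LIFESPAN = 10 * NORMAL_LIFESPAN   # trusted peers live 10 times longer


class PeerTable:
    """Keeps peers until their life span runs out, then drops them."""

    def __init__(self):
        self._expiry = {}  # peer id -> time at which the peer is dropped

    def add(self, peer_id, trusted, now):
        """Insert or refresh a peer; trusted peers get the longer life span."""
        lifespan = TRUSTED_LIFESPAN if trusted else NORMAL_LIFESPAN
        self._expiry[peer_id] = now + lifespan

    def clean(self, now):
        """Remove every peer whose life span has ended; return the removed ids."""
        expired = [p for p, t in self._expiry.items() if t <= now]
        for peer_id in expired:
            del self._expiry[peer_id]
        return expired

    def alive(self, now):
        """Return the ids of the peers that are still within their life span."""
        self.clean(now)
        return sorted(self._expiry)
```

Because every entry eventually expires, a sybil that slipped into the table is cleaned by time-out, and a high-reputation peer cannot stay in the list for the whole run unless it keeps being rediscovered.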
And... as you know, the experiments are done and the results are good. I have changed the simulated network to 30% honest peers and 70% evil peers, and the results are still good. But adding the new features from Quinten's work would cost too much time... I am really running out of budget...
storyline is still: what walker works best in an evil majority environment..
OK, I will follow the current story line and try to finish the new version of the report before next week.
latest thesis report URL: https://github.com/YourDaddyIsHere/MSc-Thesis/blob/master/thesis-report.pdf
@synctext That is not the latest... I forgot to push the latest one to the repository these days.
I pushed the latest one a few minutes ago, so now it is the latest one: Thesis Report
I am drawing some graphs for better illustration and will keep updating it until the next meeting on 6 Sep.
commenting.. thesis-report (3).pdf
@synctext latest version report
All engineering details have been moved to Chapter 3; Chapter 2 no longer has any engineering details. I chose to follow the story told in Chapter 1 to introduce the concepts of peer discovery, trust and potential attacks at a high level, moving smoothly from the story to the formal model of the peer discovery strategy.
I use the terms "Multichain" and "TrustChain" interchangeably, depending on the context, because some of the references use the term "Multichain" and replacing every "Multichain" with "TrustChain" might confuse the reader.
The load balance graphs have been changed to Pim Veldhuisen's style: the y-axis is the visit count, the x-axis is the node id.
For the experiment validating sybil resilience, I added a graph where the x-axis is time and the y-axis is the number of discovered peers; that should give the reader a first impression of what is going on.
For the unsolved problem, I ran an experiment: I deployed the victim and the attacker on two machines with exactly the same resources. The victim is a standard walker; the attacker is an attack script that keeps sending introduction-requests or crawl-requests to the victim. According to the results, using introduction-requests as a weapon in a DDoS costs the attacker more resources than the victim. But using crawl-requests costs the victim much more resources while costing the attacker few (that holds true even when the victim reduces the number of blocks to return to 1).
By the way, because the reserved days for the defense are October 23 to October 25, can we settle the committee members in our next meeting (21 Sep)? There are only 4 weeks to go...
solid progress, good results
@YourDaddyIsHere note that Figure 3.6 in your thesis shows a block graph, used as input for the Temporal Pagerank algorithm, not NetFlow.
Comments on this thesis version:
@devos50 Oh, thank you, that is a mistake in the caption. In the previous paragraph I said Figure 3.6 is for temporal PageRank.
@synctext We did not schedule a next meeting last time; should we have a meeting before the deadline for handing in the final report? Since the defense is on 20 October, the deadline for handing in the final report is around 13 October. I am available every day in the coming weeks.
@synctext Since the defense is on 20 October, the deadline for handing in the report is around 12 October. After that, I will have 1 week to prepare the defense presentation. Could we schedule a meeting on 15-19 October? I need some suggestions on my presentation. Otherwise I won't know whether it is good or crappy...
The latest version of my thesis is in this repository.
I will update it multiple times every day until midnight on 12 October (Wednesday this week).
@synctext OK, it is almost the final version now. I have feedback from both committee members and have revised the thesis report according to their suggestions. I think we should have a talk tomorrow (11 October); then I have one day left to move to the final version.
@synctext I made some changes a few minutes ago. The current slides: thesis defense further simplified.pptx
Too many slides for a 30 minute presentation, Max 40min. First 8 slides, remove half. Just present solution 1, "in my thesis I looked at a smarter solution". In general slide 1-25 could be made more in-depth and scientific. Quick detailed comments:
@devos50 @qstokkink The defense is tomorrow (20 October) at 10:00 - 12:00 in the morning. The room is HB.03.230 katwijkzaal
@YourDaddyIsHere I can't make it as I'm on holiday tomorrow, but good luck with your defense!
Final master thesis: Peer Discovery With Transitive Trust in Distributed System
Related work from modelling side using cellular automaton paradigm. Network Automata: Coupling structure and function in real-world networks. Our angle is not network topology, but the connectivity and trust integration. ToDo: Cellular Automata and game theory integration (e.g. Meritrank and AAMAS paper).
Currently all Dispersy communities have their own isolated walker. This is not very efficient. This issue aims to build upon the ongoing multichain work and create a trusted peer discovery mechanism with high-performance NAT puncturing.
Background reading (general):
Trusted peer discovery:
Technical docs: