Tribler / tribler

Privacy enhanced BitTorrent client with P2P content discovery
https://www.tribler.org
GNU General Public License v3.0

adversarial search state-of-the-art #2547

Open synctext opened 7 years ago

synctext commented 7 years ago

Keyword search within a self-organising system is a challenging unsolved problem.

Detecting and removing spam has proven to be extremely difficult. Creating a trustworthy search service out of unreliable and possibly fraudulent resources is a challenge. A starting point is creating a web-of-trust or other feedback mechanism.

Existing work:

Web of trust for voting within Tribler:

synctext commented 7 years ago

Real world $2.44 million fraud with Amazon reviews/votes, thnx @pimveldhuisen http://www.zdnet.com/article/exclusive-inside-a-million-dollar-amazon-kindle-catfishing-scam/

jellelicht commented 7 years ago

Most of these were quite useful for gaining some understanding on this topic, thanks @synctext and @pimveldhuisen.

I am currently looking into "Fighting peer-to-peer SPAM and decoys with object reputation", and to a lesser extent parts of "P2P-Based Collaborative Spam Detection and Filtering".

It currently seems that lots of partial 'solutions' wrt adversarial search exist and have been researched, but most often they heavily depend on some form of centralisation or have another major drawback.

jellelicht commented 7 years ago

Also, regarding an often-used WoT based system: the gpg shortkey issue that came up recently is interesting, but I am not sure if focusing on WoTs is wise at this moment in time, seeing as these are implementation details when looking at the state of the art of adversarial search. WDYT, @synctext?

synctext commented 7 years ago

@wordempire A lot of abuse, fraud and spam examples can be found in social media and e-commerce. So that is nice stuff to write about.

"but most often they heavily depend on some form of centralisation or have another major drawback."

That is a perfect storyline! Anything more for self-organising systems or P2P? Stuff like http://www.ece.umd.edu/~goergen/docs/sec-nwatch.pdf

Web-of-trust mechanisms can be a minority part of your report, half, or the majority. Whatever makes the most interesting story. A list of partial, flawed, and fantasy WoT solutions would be ideal.

synctext commented 7 years ago

https://github.com/pimotte/msc-thesis

synctext commented 7 years ago

Fraud with search results with direct financial gain.

jellelicht commented 7 years ago

lee2006understanding: develops a model that looks at the link between user behavior/awareness and pollution of a p2p network.

jellelicht commented 7 years ago

yoshida2009controlling: shows that index poisoning is an effective way of dealing with copyright violations when looking at the Winny network for small sets of files. This approach has the potential to disrupt the network as a whole, which might or might not be desirable for an adversary.

jellelicht commented 7 years ago

synctext commented 7 years ago

OK, + add 4th or 5th section.

Start the .tex in IEEE format: https://www.google.nl/search?q=ieeee+format

https://scholar.google.com/scholar?q=dht+poisoning
https://scholar.google.com/scholar?q=link+farm
https://scholar.google.com/scholar?q=kazaa+pollution
Reddit, HackerNews: upvote, shadow ban, etc. techniques
https://scholar.google.com/scholar?q=collaborative+spam+filtering
https://en.wikipedia.org/wiki/Stealth_banning
Honesty among drug dealers, 90% satisfaction level with drug deals: http://dl.acm.org/citation.cfm?id=2488408
https://scholar.google.com/scholar?q=explicit+feedback+spam+filtering
User feedback & moderation: http://www.sciencedirect.com/science/article/pii/S0308596108000955

the Tribler voting and spam prevention mechanism
control D, Dispersy, show votecast
sqlitebrowser ~/.Tribler/sqlite/tribler.sdb
browse _ChannelVotes table
create interesting plot
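The sqlitebrowser steps above could also be scripted. This is only a minimal sketch: the database path and the _ChannelVotes table name come from the note above, but the column name `vote` in the usage comment is an assumption about the schema, so it is worth listing the real columns with the hypothetical helper `table_columns` first.

```python
import sqlite3


def _quote(identifier):
    """Quote an SQL identifier; identifiers cannot be bound as parameters."""
    return '"' + identifier.replace('"', '""') + '"'


def table_columns(conn, table):
    """Return the column names of `table` as reported by SQLite itself."""
    cursor = conn.execute(f"PRAGMA table_info({_quote(table)})")
    return [row[1] for row in cursor]  # row[1] is the column name


def count_by_column(conn, table, column):
    """Count the rows of `table` per distinct value of `column`."""
    if column not in table_columns(conn, table):
        raise ValueError(f"{table} has no column {column!r}")
    query = (
        f"SELECT {_quote(column)}, COUNT(*) "
        f"FROM {_quote(table)} GROUP BY {_quote(column)}"
    )
    return dict(conn.execute(query).fetchall())


# Usage (assumes the database exists at this path; "vote" is a guessed column):
#   import os
#   conn = sqlite3.connect(os.path.expanduser("~/.Tribler/sqlite/tribler.sdb"))
#   print(table_columns(conn, "_ChannelVotes"))
#   print(count_by_column(conn, "_ChannelVotes", "vote"))
```

The resulting dictionary can be handed to any plotting library as a bar chart of rows per value.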

jellelicht commented 7 years ago

This was the user-study where the assumption that expert users can quickly assess whether something is spam is questioned: Lee, Uichin, et al. "Understanding Pollution Dynamics in P2P File Sharing." IPTPS. Vol. 6. 2006.

synctext commented 7 years ago

First warm-up task: understand and plot key data from the AllChannel content discovery and voting mechanism.

Plot ideas:

jellelicht commented 7 years ago

I am currently still deciding on how to export all my thesis-related artifacts (no generated artifacts in repositories), but for now a preview of a plot from last week (in xkcd style so I won't accidentally include them in a report as-is): bars

Also quick question: Is there any more recent work than Niels Zeilemaker's thesis from 2010 regarding the search strategy used in Tribler nowadays? AFAIS, search is done by first looking in the local data, and then asking your TasteBuddies for more info, but I could of course be mistaken.

After spending some time thinking about the directions we could go with this project, I would like to expand on the concept of trust and taking into account the possibility of trustees being compromised. Trustees in this case could be something like "friends", people with similar voting behaviour or perhaps even something that can best be described as "moderators".
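For illustration, the two-step lookup as I understand it (local data first, then TasteBuddies) could be sketched as below. The `Peer` class and the `search` function are hypothetical stand-ins and not Tribler code, and the indexes are plain dictionaries for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical stand-in for a taste buddy; not a Tribler class."""
    name: str
    index: dict = field(default_factory=dict)  # keyword -> list of result titles

    def search(self, keyword):
        return list(self.index.get(keyword, []))


def search(keyword, local_index, taste_buddies):
    """Return local hits first, then previously unseen hits from taste buddies."""
    results = list(local_index.get(keyword, []))
    seen = set(results)
    for peer in taste_buddies:
        for hit in peer.search(keyword):
            if hit not in seen:  # skip results already known locally or from an earlier peer
                seen.add(hit)
                results.append(hit)
    return results
```

Note that this sketch appends whatever the peers return without any check, which is exactly the point where spam could enter the result list.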

Some issues that I would have to research/address/decide on:

jellelicht commented 7 years ago

I would also like to propose a different issue title, as "adversarial search" is usually used in the context of e.g. game-related A.I. things. How about "Spam-resilient search in decentralized systems"?

jellelicht commented 7 years ago

img_20170126_154624

jellelicht commented 7 years ago

Problem can be split in two parts:

  1. Prevent spam and spam-related meta-information from entering the network
  2. Prevent spam and spam-related meta-information from hindering the proper usage of the network

synctext commented 7 years ago

Survey paper possible elements:

synctext commented 6 years ago

ToDo:

jellelicht commented 6 years ago

Draft version: main.pdf

synctext commented 6 years ago

Draft feedback:

jellelicht commented 6 years ago

draft v2 main.pdf

jellelicht commented 6 years ago

Also @synctext, how would you like me to cite https://github.com/blockchain-lab/shared_vision_towards_programmable_economy/blob/master/tex/article.tex?

synctext commented 6 years ago

jellelicht commented 6 years ago

main.pdf

synctext commented 6 years ago

kelong_sybil_overview

jellelicht commented 6 years ago

ghost commented 6 years ago

Given how many legitimate news organizations and people are routinely labelled 'troll' by their competitors (RT / AlJazeera / CNN / FOX are legitimate, even if they are biased on questions of Russian / Qatari / US-blue-team / US-red-team interest), the question of 'why did this work as effectively as it did' has an underlying truth component: 'because what they were saying was just as true of a constructed narrative of social facts as the competing consensus was'. That is not the whole reason why they are successful, but if we're thinking about search and mass media we should keep in mind that, in addition to the mass media perception shifting going on from one player in the 'troll account' narrative, there is great (perhaps greater) mass media perception management going on from the other player as well. Some success by the other players may serve to balance out the bias of the network itself in favour of the incumbents.

To phrase it in the context of, say, Kelong Cong's paper 3.3: the 'honest region' does not include either the blue or red team and everyone associated with it, both meatspace and bot, to the extent that shared, necessary illusions involved in group membership are held.

synctext commented 6 years ago

@ichorid See this ticket of related work. Especially the 8000 fake Twitter accounts.

ichorid commented 6 years ago

@synctext thanks, I'll take this stuff into account.

ichorid commented 6 years ago

related #3615

synctext commented 6 years ago

Broader vision, beyond keyword search. An extensive technical analysis of the threat model in troubled regions. Aid workers are exposed to difficult challenges, see "On Enforcing the Digital Immunity of a Large Humanitarian Organization".

synctext commented 3 years ago

Status update after a few years: