Tribler / tribler

Privacy enhanced BitTorrent client with P2P content discovery
https://www.tribler.org
GNU General Public License v3.0
4.81k stars, 444 forks

Geo-restricted communication, restrict talking to strangers #2541

Closed synctext closed 1 month ago

synctext commented 8 years ago

Investigate restricting communication for security

There are various techniques to prevent an adversary from creating numerous identities and overflowing the network. Latency is bounded by the laws of physics, thus a peer 5 ms away has to be reasonably close. Geo-fencing is a well-known term for restricting activity to a region by using GPS technology. One measure we could take is to create a low-latency overlay. Peers with inherent 100+ ms connections could possibly take longer to bootstrap trust or overlay peers.
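To make the physical bound concrete, here is a rough sketch of the largest distance a measured round-trip time allows. It assumes signals travel at about two-thirds of the speed of light (a common rule of thumb for fiber, not a measured value) and ignores queueing and processing delay, so real peers are closer than the bound. The function name is illustrative only.

```python
# Upper bound on the distance to a peer implied by a measured round-trip time.
# Assumption: signals propagate at roughly 2/3 of the speed of light in fiber.
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
FIBER_FACTOR = 2.0 / 3.0  # assumed propagation factor, not a measured value


def max_distance_km(rtt_seconds: float) -> float:
    """Return the farthest a peer can physically be, given a round-trip time."""
    if rtt_seconds < 0:
        raise ValueError("round-trip time cannot be negative")
    one_way_seconds = rtt_seconds / 2.0
    return one_way_seconds * SPEED_OF_LIGHT_KM_PER_S * FIBER_FACTOR


if __name__ == "__main__":
    for rtt_ms in (5, 100):
        bound = max_distance_km(rtt_ms / 1000)
        print(f"{rtt_ms} ms RTT -> at most {bound:.0f} km away")
```

Under these assumptions a 5 ms round trip keeps a peer within roughly 500 km, while a 100 ms round trip allows it to be almost anywhere on the planet.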

Another approach is to use the multichain mechanism. For instance, in darknet mode the policy is to deny any connection with strangers. Then we start to deviate fundamentally from the random circuit-building approach used in Tor. In our fully distributed setting it is difficult to protect yourself from various kinds of attacks, like the Sybil attack. By not accepting connections from strangers we constrain a lot of attack classes. Every Tribler user that has a healthy set of neighbors to relay traffic with shall not interact with strangers.

Drawbacks: bootstrapping new peers. Do we need to leave hard-limited room for new peers? Positive: a low-latency overlay boosts fast research results and regional content.

EDIT: the latest insight (2019) is that diversity of latency is much simpler to implement and has also proven to be quite effective (though it is perhaps not intuitive that latency diversity is as effective as shielding yourself from far-away nodes).
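One possible reading of "diversity of latency" is to pick neighbors so that no single latency range dominates. The issue does not spell out the mechanism, so the sketch below is an assumption: it buckets peers by measured latency (the 20 ms bucket width is arbitrary) and selects round-robin across buckets.

```python
from collections import defaultdict
from itertools import zip_longest


def pick_diverse_peers(latencies, count, bucket_width=0.020):
    """Pick up to `count` peers spread over distinct latency buckets.

    latencies: dict mapping a peer id (never None) to its latency in seconds.
    bucket_width: assumed bucket size in seconds, purely illustrative.
    """
    buckets = defaultdict(list)
    for peer, latency in sorted(latencies.items(), key=lambda kv: kv[1]):
        buckets[int(latency // bucket_width)].append(peer)
    # Visit buckets from lowest to highest latency, one peer per bucket per pass.
    ordered = [buckets[key] for key in sorted(buckets)]
    picked = []
    for layer in zip_longest(*ordered):
        for peer in layer:
            if peer is not None and len(picked) < count:
                picked.append(peer)
    return picked
```

A group of identities hosted at one location all land in the same bucket, so under this scheme they compete for a single slot per pass instead of filling the whole neighbor list.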

synctext commented 6 years ago

Talking to friends only, or building an overlay between friends, is an F2F (friend-to-friend) overlay. See IEEE INFOCOM 2016, "Anonymous Addresses for Efficient and Resilient Routing in F2F Overlays".

synctext commented 6 years ago

On 4 January 2007 our following work on F2F networks was published. It builds a social overlay for the GMail and MSN networks: "Creating and Maintaining Relationships in Social Peer-to-Peer Networks".

The latest F2F technology by others is the open source and lightweight Android client for Facebook, called SlimSocial. Installs according to Google Play stats: 100,000 - 500,000 Android devices. It uses the official Facebook website. The legal issues and the level of web-scraping intelligence are unknown. Is it a browser or a bot (legally speaking)?

This is a project which meant to make people life easier.
This app will launch Facebook official website. All credit goes to Facebook team.
It is provided free for all without any Ads.

devos50 commented 6 years ago

It seems that SlimSocial is simply an app that allows users to navigate the Facebook website in a convenient way and gives them an alternative to the official app. The app is motivated by the huge amount of data mining and high battery usage of the official Facebook app. In essence, it is just a fancy wrapper around the Facebook website.

See also https://forum.xda-developers.com/android/apps-games/app-slimfacebook-1mb-0-permissions-t3254174

synctext commented 6 years ago

It is just a fancy wrapper around the Facebook website.

Yes, so this means there are no legal means of blocking or banishing it. This app, its community and its developer could be used to revisit the social overlay origins of Tribler from 2005. But first focus on creating critical mass; this is still nicely suitable for student projects (writing this down for 2019+ future usage):

A secure social platform using blockchains

Create a privacy-respecting, zero-advertisement, open source, non-commercial alternative to Facebook. It will have superior security when compared to Facebook or Twitter. The aim is to have a passport-grade security guarantee against Internet trolls. Our secure social media platform is impossible to abuse by foreign governments to influence elections. Security is established by showing your passport to other people. By showing others that you have a passport you make fraud much harder. Just social policing to keep us all safe. Your passport will never be stored, never be used for tracking, and never be put to commercial use of any kind. If every user you see on this platform is guaranteed to have a passport, everybody is safer.

With your own blockchain we will establish identities and exchange them in the encrypted domain. How can we keep the passport validation completely offline and secure inside an app (example: #2812)? How do we securely store decentrally signed friendship certificates? How do we securely store passport validation certificates? Can we devise a procedure where the physical passport is required to create the certificate, removing the fake-certificate vulnerability? For instance, some fixed digits of a passport code are required to be used as input for the certificate. Determine if privacy leakage can be avoided by using a deterministic one-way hashing function with passport digits and last names as input.
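A minimal sketch of such a deterministic one-way function, as a starting point for the privacy-leakage question rather than an answer to it. Everything here is an assumption: the function name, the application-wide salt, the name normalisation, and the choice of PBKDF2 from the Python standard library. A few passport digits plus a last name have limited entropy, so an attacker can enumerate candidates; a deliberately slow derivation only raises the cost of doing so.

```python
import hashlib
import unicodedata

# Hypothetical application-wide salt; a fixed value keeps the output deterministic.
APP_SALT = b"example-application-salt"
ITERATIONS = 100_000  # assumed work factor, tune for the target devices


def passport_commitment(passport_digits: str, last_name: str) -> str:
    """Derive a one-way identifier from passport digits and a last name."""
    if not passport_digits.isdigit():
        raise ValueError("passport_digits must contain only digits")
    # Normalise the name so the same person always yields the same bytes.
    name = unicodedata.normalize("NFKC", last_name).strip().casefold()
    material = f"{passport_digits}|{name}".encode("utf-8")
    # PBKDF2 is slow by design, which makes guessing inputs more expensive.
    digest = hashlib.pbkdf2_hmac("sha256", material, APP_SALT, ITERATIONS)
    return digest.hex()
```

The same digits and name always map to the same 64-character hex string, so two parties can compare commitments without exchanging the passport data itself.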

_You will implement these ideas on the SlimSocial app, which can legally connect to the official Facebook website. Like a proper vampire you can suck the daily posts back into the hands of the user, instead of them residing only within corporate servers. The right to data portability was laid down in the European Union's General Data Protection Regulation (GDPR), passed in April 2016. See Facebook Response to European Commission Communication on personal data protection in the European Union. With your app a dual-stack will be created for basic wall posts. All data will reside both on corporate servers and encrypted on the smartphones of the users, under their full control. Your work will enable unbounded sharing of magnet links of election news, world news, books, and videos in general._

synctext commented 6 years ago

Plus our homomorphic overlay work for privacy-respecting overlays and our 2014 "ReClaim: a Privacy-Preserving Decentralized Social Network".

synctext commented 6 years ago

Game theory of honesty and trust is vital. Latency provides the theoretical grounding for addressing the Sybil attack using game theory.

"Evolutionary games on graphs" by György Szabó and Gábor Fáth gives an intro to the application of game theory in this context. We build upon the cost of creating neighbors on the Internet to defend against Sybils. The mechanism we propose is to create a bias against interactions with far-away peers (e.g. stranger danger). A list of neighbors with honest behavior should scale. Cheaters get blocked and are forced to move elsewhere for any interaction (e.g. forced exile, digital ostracism). Builds upon: #3357.

synctext commented 5 years ago

restyling this old 2016 ticket for an IEEE article

Sybil-resilience through latency-based shadow-banning

Sybil attacks can be greatly reduced by blocking spam from IPv4 address ranges with exactly the same latency. By measuring latency and triangulating multiple Sybil identities it is possible to detect their physical network attachment point.
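A minimal sketch of the "same latency" check, under assumptions: measured latencies jitter, so "exactly the same" is relaxed to a tolerance, and a group only counts as suspicious above a minimum size. Both thresholds and all names are illustrative, not values from this ticket.

```python
def group_by_latency(measurements, tolerance=0.0005):
    """Group identities whose sorted latencies sit within `tolerance` of the previous one.

    measurements: dict mapping an identity to a measured latency in seconds.
    """
    groups = []
    for identity, latency in sorted(measurements.items(), key=lambda kv: kv[1]):
        # Extend the current group while the gap to its last member stays small.
        if groups and latency - groups[-1][-1][1] <= tolerance:
            groups[-1].append((identity, latency))
        else:
            groups.append([(identity, latency)])
    return [[identity for identity, _ in group] for group in groups]


def suspected_sybils(measurements, tolerance=0.0005, min_size=3):
    """Return only the latency groups large enough to look suspicious."""
    groups = group_by_latency(measurements, tolerance)
    return [group for group in groups if len(group) >= min_size]
```

A single vantage point cannot tell co-located Sybils from unrelated peers that happen to be equally far away, which is why the text above combines measurements from multiple points.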

Model and mechanisms: create Sybil-resilience by combining game theory and mechanism design through latency measurements and the grim trigger. The design principle relies on the old saying: "Don't throw stones at your neighbors if your own windows are glass".
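For reference, the grim trigger is the strategy that cooperates until the first observed defection and then refuses forever. A minimal sketch, with a class name and methods that are purely illustrative and not part of Tribler or IPv8:

```python
class GrimTrigger:
    """Help a peer until it defects once, then refuse it permanently."""

    def __init__(self):
        self._blocked = set()

    def will_help(self, peer) -> bool:
        """Return True as long as the peer has never been seen defecting."""
        return peer not in self._blocked

    def observe(self, peer, cooperated: bool) -> None:
        """Record an interaction; a single defection is never forgiven."""
        if not cooperated:
            self._blocked.add(peer)


if __name__ == "__main__":
    strategy = GrimTrigger()
    strategy.observe("peer-a", cooperated=True)
    strategy.observe("peer-b", cooperated=False)
    strategy.observe("peer-b", cooperated=True)  # later good behavior changes nothing
    print(strategy.will_help("peer-a"), strategy.will_help("peer-b"))
```

Because the punishment is permanent, a cheater has to abandon the identity and start over elsewhere.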

qstokkink commented 5 years ago

There is already some good work available for the triangulation itself:

Dutta, N. (2013). Location based services in wireless ad hoc networks.

Shewchuk, R. (2005, June). Star splaying: an algorithm for repairing Delaunay triangulations and convex hulls. In Proceedings of the twenty-first annual symposium on Computational geometry (pp. 237-246). ACM.

qstokkink commented 5 years ago

The simple approach:

[Image: img_20181203_144629]

synctext commented 5 years ago

Strategic attack

We conducted a strategic attack on our network to evaluate the effectiveness of our Internet-deployed system.

-DRAFT-

Our evolutionary model of cooperation is based on rounds and the recording of help requests. Each request is either rejected or help is given. Trustchain only records given help. Group standing and image score are global communication mechanisms: instant information is available at a global scale. "If group size is small, perhaps individuals can monitor the goings on of all others and thus properly attribute standing to a partner observed defecting on another. As group size increases, however, this assumption seems implausible. Language seems to offer individuals access to information about others that they were not able to observe directly. Integrating this hearsay with personally observed information, individuals may be able to accurately track the standings of other group members." Memory of the past is limited, restricted to a single round. In prior work "individuals live in an infinite, unstructured population". We model local memory of past rounds, limited to a geographical region. Trustchain records interactions in the local neighborhood. Prior work models information with a single probability. "To investigate the effect of incomplete information, we now assume that an individual knows the standing of his current partner with probability q; and with probability 1-q he has no information about his partner’s reputation."

Instead of standing and intent, our model removes this complexity and only revolves around acts of help. All acts of community service are recorded by the local community. We avoid global state such as in the 2003 'standing' or 'image scoring' papers. We build upon the earlier work dealing with strangers by expanding the suspiciousness notion.

qstokkink commented 5 years ago

Some notes on user space vs. system space (ICMP) messages.

All in all, I think it would be best to just focus on user space pings instead of ICMP. This also has the most practical value.

synctext commented 5 years ago

Solid progress. Could you gather sufficient experimental proof for the blunt proposition: IPv6 is more secure, in terms of Sybil attack resilience?

qstokkink commented 5 years ago

I think "IPv6 is more secure" only holds in theory, because it shouldn't concern itself with layered address spaces (NATs). In practice though, you have all sorts of IPv6 forwarding services, while in the background IPv4 is still used (like Teredo).

If we assume proper IPv6 support, then a simple traceroute would already detect machines running multiple identities. That would be easy to prove experimentally (with what would basically amount to a correlation attack for detection).
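A minimal sketch of that correlation, assuming the traceroute results have already been collected as a list of hop addresses per identity. The data layout and function name are assumptions for illustration: identities that share an identical route, including the final hop, are candidates for running on the same machine.

```python
from collections import defaultdict


def identities_sharing_a_path(traces):
    """Map each hop sequence seen for two or more identities to those identities.

    traces: dict mapping an identity (any hashable value, e.g. a public key)
    to the list of hop addresses a traceroute towards it returned.
    """
    by_path = defaultdict(list)
    for identity, hops in traces.items():
        by_path[tuple(hops)].append(identity)
    # Keep only the paths that more than one identity resolves to.
    return {path: ids for path, ids in by_path.items() if len(ids) > 1}


if __name__ == "__main__":
    example = {
        "key-1": ["r1", "r2", "host-x"],
        "key-2": ["r1", "r2", "host-x"],
        "key-3": ["r1", "r9", "host-y"],
    }
    print(identities_sharing_a_path(example))
```

An exact match is the strictest possible criterion. Routes can vary between runs, so a practical detector would likely need a looser similarity measure.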

qstokkink commented 5 years ago

Actually, we could use traceroute for (IPv4 + port) as well. This would change the direction of this research though.

We would then be headed toward "De-anonymization of pseudonyms using IP-layer correlation attacks".

synctext commented 5 years ago

Calling it an attack instead of a defensive technique: good idea. Presenting measurements that compare traceroute and user space ping efficiency: good depth. Please read "Correlating Topology and Path Characteristics of Overlay Networks and the Internet". Note, some 14 years ago I spent a lot of time doing actual measurements; that is my core expertise, compared to homomorphic crypto magic.

qstokkink commented 5 years ago

First measurement overlay is up and running: https://github.com/qstokkink/py-ipv8/tree/secret_project

Still need to format the results and scale up.

Right now the plan is to capture the (a) ping time and the (b) traceroute per user for:

I haven't decided on the number of peers to attack/measure. Possibly 1000?

qstokkink commented 5 years ago

Measuring script up and running (anonymized for privacy):

[
    (('x.x.x.x', 7759), 1, 'ping', [
        (('x.x.x.x', 7759), 0.6202690601348877)
    ]),
    (('x.x.x.x', 7759), 1, 'traceroute', [
        ('a.a.a.a', 0.000728),
        ('b.b.b.b', 0.0025459999999999997),
        ('c.c.c.c', 0.005059),
        ('d.d.d.d', 0.006959),
        ('e.e.e.e', 0.00721),
        ('f.f.f.f', 0.005643),
        ('g.g.g.g', 0.012592),
        ('h.h.h.h', 0.022058),
        ('i.i.i.i', 0.024371),
        ('j.j.j.j', 0.0269),
        ('k.k.k.k', 0.029858),
        ('l.l.l.l', 0.024234000000000002),
        ('m.m.m.m', 0.024909)
    ])
]

We automatically crawl peers in the network and launch sybil measurements. Once I hook this into a database I'll start massively measuring random unique users.

Once I fill up the database with roughly 1000 users, I'll start creating the classifier and testing its efficacy.
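As a sketch of how records in the format shown above could be reduced before they go into a database: the function below keeps, per peer and method, the number of hops and the latency of the final hop. The meaning of the second tuple field is not stated in this thread, so it is ignored here, and the sample values are shortened placeholders in the same anonymized style.

```python
def summarize(records):
    """Return {(address, method): (hop_count, final_latency_seconds)}.

    records: list of (address, unknown_field, method, hops) tuples, where
    hops is a list of (hop, latency_seconds) pairs in path order.
    """
    summary = {}
    for address, _unknown_field, method, hops in records:
        if hops:  # skip measurements that returned no hops at all
            summary[(address, method)] = (len(hops), hops[-1][1])
    return summary


if __name__ == "__main__":
    records = [
        (("x.x.x.x", 7759), 1, "ping", [(("x.x.x.x", 7759), 0.62)]),
        (("x.x.x.x", 7759), 1, "traceroute", [("a.a.a.a", 0.0007), ("m.m.m.m", 0.0249)]),
    ]
    for key, value in summarize(records).items():
        print(key, value)
```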

qstokkink commented 5 years ago

Also added mtr. It doesn't differ much from traceroute, but for completeness:

[
    (('a', 7759), 1, 'ping',
        [
            ('a', 0.1242058277130127)
        ]),
    (('a', 7759), 1, 'traceroute',
        [
            ('b', 0.000629),
            ('c', 0.002883),
            ('d', 0.002649),
            ('e', 0.004945000000000001),
            ('f', 0.005211),
            ('g', 0.013555999999999999),
            ('h', 0.024626000000000002),
            ('i', 0.024738),
            ('j', 0.030513000000000002),
            ('k', 0.029182)
        ]),
    (('a', 7759), 1, 'mtr',
        [
            ('b', 0.0012),
            ('c', 0.002),
            ('d', 0.0018),
            ('e', 0.0019),
            ('f', 0.0038),
            ('g', 0.0129),
            ('h', 0.023399999999999997),
            ('i', 0.0228),
            ('j', 0.0238),
            ('k', 0.023600000000000003)
        ])
]

qstokkink commented 5 years ago

Memo on proof-of-age:

You can only prove your age in one of these three ways:

qstokkink commented 10 months ago

I completely forgot about updating this issue. The corresponding paper on this topic can be found here: https://doi.org/10.1016/j.comnet.2023.109701

qstokkink commented 1 month ago

Since the paper has now been published, this issue is complete.