Open synctext opened 6 years ago
IPv8/TCP (Trust Control Protocol). A Game Theory direction title would be: using natural selection for the evolution of trust. Even more over-the-top: the game theory of trust.
Illustration by Prof. Nowak's team at Harvard, from their Nature paper. Their model is designed for population growth and is wrong for trust, but perhaps it can be modified to model the growth of trust.
A first version of the suppress, delete, store, collect, spread protocol has been captured in https://github.com/Tribler/py-ipv8/pull/186. We have found that this roughly corresponds to the emotion of a user towards a given incoming message.
(posting here, as it seems the best fit for now)
Any good, this related work? @qstokkink "Practical Homomorphic Encryption Over the Integers" We present novel homomorphic encryption schemes for integer arithmetic, intended for use in secure single-party computation in the cloud. These schemes are capable of securely computing only low degree polynomials homomorphically, but this appears sufficient for most practical applications. In this setting, our schemes lead to practical key and ciphertext sizes. We present a sequence of generalisations of our basic schemes, with increasing levels of security, but decreasing practicality. We have evaluated the first four of these algorithms by computing a low-degree inner product. The timings of these computations are extremely favourable. Finally, we use our ideas to derive a fully homomorphic system, which appears impractical, but can homomorphically evaluate arbitrary Boolean circuits.
@synctext not particularly.
Find vulnerabilities in the Gossip overlay. Try to do the following:
Document the security concerns
If possible, find a fix for / remove the threats
Attacks which should be prevented:
Impersonation attack: An attack in which an adversary successfully assumes the identity of one of the legitimate parties in the system or in a communication protocol.
Some Identified cases:
Allowing malicious peers to infiltrate the Gossip Overlay is, generally speaking, the biggest threat to its normal workflow: a malicious peer somehow becomes a trusted peer / neighbor, i.e. in the case of this application, becomes verified. This should not be possible if only (somewhat) trusted peers become verified. One way of augmenting trust might be to use the age of a peer, i.e. how long it has been active, as an indication of its perceived trustworthiness. Alternatively / additionally, one may take into consideration the number of neighboring peers a peer has (if the number is high, it is probably trustworthy; otherwise we cannot say much about it); this might, however, open the system up to Sybil or Eclipse attacks and could be easily exploitable.

If a malicious peer does infiltrate the overlay as a verified peer, it can do all sorts of undesirable things: mark good peers as suppressed, spread malicious messages, return malicious messages when queried with a `COLLECT` request, or tamper with its own message DB (avoid storing messages from a non-malicious peer, modify messages, or similar; the latter would be more difficult since the signatures would probably not match, but it may still present a security risk).
An example of such a case: a malicious yet trusted peer flags another trusted, non-malicious peer as a spammer for suppression. Once this happens, there is no way to go back and un-`SUPPRESS` the peer. This is a serious issue, since it lets malefactors negatively impact the overlay and its peers through other types of attacks: after suppressing many non-malicious peers, they have implicitly changed the network / overlay topology to one which suits their harmful intentions. `SUPPRESS` should probably be addressed in one of the next increments of the Gossip Overlay's implementation, in addition to adding the general case of changing the Gossip Rule for a peer.
Attacks which should be protected against (cannot be prevented):
Eclipse attack: An attack wherein an adversary controls a sufficient number of IP addresses to monopolize all connections to and from a victim crypto node. The attacker can then exploit the victim for attacks on a cryptocurrency's mining and consensus system, including N-confirmation double spending, selfish mining, and adversarial forks in the blockchain.
Some Identified cases:
For instance, an adversary attaches itself to a (set of) non-malicious peer(s). In such a case, the adversary can, for instance, spam the SUPPRESSED message against other non-malicious peers to have them suppressed, thus eliminating any 'positive' influence from the peers on which it leeches. From this point on, it can pretty much project its own desired image of the Gossip Network to the peers it leeches on, or it can disrupt the normal flow of the Gossip Network. Depending on the specific application domain of the Gossip Overlay, this sort of attack may have serious negative effects. For instance, if one would use the Gossip Overlay for gossiping blocks in a blockchain, this sort of attack could trick non-malicious nodes into agreeing on a fraudulent chain, or alternatively enable the attacker to project a false majority in mining power, etc.
Even if the attacker does not suppress non-malicious nodes, it can still disrupt normal network functionality by, for instance, flooding the peers to which it is connected with bogus requests, which shouldn't be too difficult, since most requests are multicast / broadcast to all known neighbors. Introducing a limit on how much a peer can communicate, or limiting the communication scope of some requests, might help reduce the network communication overhead.
Sinkhole attack: In this type of attack, a malicious node advertises itself as the best possible route to the base station, deceiving its neighbors into using that route more frequently. The malicious node thus gets the opportunity to tamper with the data, damage regular operation, or mount further attacks on the security of the network. In this case, the attacker might not necessarily tamper with the route cost function, but rather try to affect the topology of the Gossip Overlay such that it becomes a popular node (that is, it finds itself in a large number of neighborhoods) or an essential node (that is, it stands on one of the few paths, or perhaps even the only path, between two neighborhoods).
Some Identified cases:
A verified (and malicious) node may become a sinkhole by telling other peers to suppress other (non-malicious) peers. This in turn could lead to a situation as dire as that described in figure 1, where a malicious verified node stands between two or more 'clusters' of peers / otherwise connected components (as defined by the 'verified peers' neighborhood). Communication between these clusters would always have to go through the sinkhole peer, which is very problematic, since the sinkhole peer can now, amongst other things, drop packets, exploit their contents, modify the packets, etc.
Alternatively, this may also happen by becoming a 'neighboring peer' / trusted peer to many other peers, or by becoming in effect a central node of a cluster, as seen in figure 2. In both cases, the adversary receives a lot of messages which can be tampered with, snooped on, etc. In the latter case, the malicious node may even project its own desired image of the Gossip Overlay to the neighboring nodes, similar to what was described in the Eclipse Attack section.
To defend against these attacks: monitor the traffic for each peer; if the traffic towards a peer is too high, a Sinkhole attack is possible → another peer is needed for redundancy → try to find another trusted peer which offers good routing possibilities.
Perhaps a good idea here is to use a trust metric which includes the number of peers which find a peer trustworthy (as directly specified by those peers, rather than by the evaluated peer) → this might be a good approach against the sinkhole attack, but it might encourage Sybil or Eclipse attacks. Additionally, this idea can be augmented with a limit on the number of neighborhoods a peer can belong to, so as to reduce its influence (this is somewhat contrary to what I suggested in the Impersonation attack section regarding peer popularity and trustworthiness). Perhaps a lower bound can be introduced as well.
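As a thought experiment, such a metric could combine the count of peers that explicitly endorse a peer with its age. Everything below (the function name, the saturation points, the constants) is a hypothetical sketch, not part of the overlay code:

```python
import math

def trust_score(endorsers: int, age_days: float) -> float:
    """Toy trust metric in [0.0, 1.0]: log-damped endorsement count times an age ramp.

    Log-damping means buying N Sybil endorsements yields diminishing returns;
    the age ramp means a freshly created identity starts with no weight at all.
    """
    endorsement = min(math.log1p(endorsers) / math.log1p(100), 1.0)  # saturates at ~100 endorsers
    age = min(age_days / 365.0, 1.0)                                 # full weight after one year
    return endorsement * age
```

A neighborhood cap (refusing to count a peer as a neighbor once it already belongs to some fixed number of neighborhoods) could then bound the influence of any single identity, as suggested above.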
Sybil attack: In a Sybil attack, the attacker subverts the reputation system of a peer-to-peer network by creating a large number of pseudonymous identities, using them to gain a disproportionately large influence. A reputation system's vulnerability to a Sybil attack depends on how cheaply identities can be generated, the degree to which the reputation system accepts inputs from entities that do not have a chain of trust linking them to a trusted entity, and whether the reputation system treats all entities identically.
Some Identified cases:
`SUPPRESS` message abuse: a peer may be suppressed by, perhaps, a voting scheme. The problem with this is that a voting scheme may be subverted by Sybil attacks, where a node creates a lot of identities, which become verified, and overthrow the vote (peer verification is done at the peer level in the gossip overlay's network object; however, it may be possible to create multiple peers and implement this sort of attack).

Method `enforce` → `rule == GossipRule.COLLECT`: the Bloom Filter (more specifically, its contents) might represent a security risk. For instance, a node is free to tamper with the information stored in the filter: add bogus messages (from real or fake peers), change the contents of the messages themselves (perhaps difficult due to the private-key-based message signature), or discard real ones.
When a message arrives from a non-suppressed peer whose rule (i.e. that of the message) is `DEFAULT` and whose source is not a neighbor, the `self.update()` method is called, which calls the listener methods to process the newly arrived message. These hook methods may be an easy place to put malicious code. Also, in this case, the message will be deleted from the message DB of the receiving peer, even if no hook placed it there. Could this be a way of force-deleting a message from a peer's DB without its consent? Alternatively, if the source is a neighbor, the message is stored and the callbacks are also invoked. Obviously, if the peer is actually malicious, this allows it to insert bogus messages into the message DB (but this is beyond the scope of this observation).
Method `enforce` → `rule == GossipRule.SPREAD`: this might mean that a malicious peer (trusted either via Sybil or Impersonation attacks) can force-spread malicious messages through the overlay.
Also, allowing code injection through the `self.listeners` data structure (by demanding the addition of a new listener through a Gossip Overlay request; currently this is not implemented, but it might be in the future) would be dangerous, since the stored listeners get executed on the arrival of a `SPREAD` or `DEFAULT` request. The data itself can simply and conveniently be passed through one of the aforementioned requests.
Also, there is currently no way of setting the rules for a peer (via a Gossip Overlay request, that is) other than setting `SUPPRESS` when told to do so by another trusted peer. However this gets implemented, it should be done in such a way that a malicious peer cannot abuse it. A good idea would be to develop a solution which works for setting any of the possible Gossip Rules while still protecting against abuse, thus making the solution reusable.
Currently, it seems that nearly all of the available requests (bar the `COLLECT` request, when sent in the `on_message` method's `COLLECT` branch) communicate with all of their neighbors, i.e. a multicast type of communication. This might be dangerous, since a malicious peer could seek to flood the network with bogus requests. This may also apply when the peer is not really malicious, but rather just irresponsible in its use of the available network capacity. Introducing a bandwidth limit per peer might be a good idea here.
Generally speaking, the `SUPPRESS` request is quite dangerous due to the fact that a peer can easily demand that other peers ignore neighbors (even non-malicious ones). Malicious peers can abuse this functionality in various ways, for instance to remove 'positive' influence from the neighborhood, modify the topology of the network, or increase vulnerability to other types of attacks. This particular request should be further refined to prevent abuse.
Since most attacks depend on the ability of a peer to become trusted and join neighborhoods, it is probably a good idea to seek ways to make this hard or even impossible for malicious peers. This will likely reduce the vulnerability of the system to other (more harmful) types of attacks.
Wireguard: related work on security, session cookies, DDoS protections and 4000 lines of beauty.
Getting our layering model clear: IPv8 delivers UDP messages, plus discovery services and a bit of DHT lookup. Should it stay away from sessions and shared state?
Brainstorming... `Trustworthy_Gossip()` implements the "Gossip Control Protocol" (GCP, on top of IPv8). This adds sessions, statefulness, and long-lived shared secrets? Plus Libtorrent-based congestion control and bulk data exchange. Trustchain also has/needs a bulk data transfer protocol?
If libtorrent channels work out, this could also be a good candidate to sync with libtorrent.
Bootstrap idea:
Very relevant: https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4220960 http://www.ece.neu.edu/fac-ece/ioannidis/static/pdf/2011/IICM.recsys11.pdf https://hal-lirmm.ccsd.cnrs.fr/lirmm-00640735/file/F2Frec-10october.pdf https://www-sciencedirect-com.tudelft.idm.oclc.org/science/article/pii/S0306457316306471
Best idea so far:
Use a Polytomous Rasch model (Wikipedia) to reach a decision (`SUPPRESS`, `DELETE`, `STORE`, `COLLECT`, `SPREAD`) based on the decisions (`SUPPRESS`, `DELETE`, `STORE`, `COLLECT`, `SPREAD`) of others for a specific message.
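A full polytomous Rasch model would estimate per-rater severity and per-category thresholds from response data. As a crude, illustrative stand-in (my simplification, not the Rasch model itself), one can treat the five rules as ordered categories and take the median of the decisions reported by others:

```python
from enum import IntEnum
from statistics import median_low

class GossipRule(IntEnum):
    # Ordered from most distrusting to most trusting reaction to a message.
    SUPPRESS = 0
    DELETE = 1
    STORE = 2
    COLLECT = 3
    SPREAD = 4

def aggregate(decisions: list[GossipRule]) -> GossipRule:
    """Pick the median ordered category; robust to a minority of extreme voters."""
    return GossipRule(median_low(int(d) for d in decisions))
```

Note that a median is still vulnerable to Sybil vote-stuffing, the same concern raised for voting schemes above; the Rasch model's appeal is precisely that it weighs raters rather than treating all votes identically.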
Title brainstorm: 'middleware for provably trustworthy information flow'.
Some of the relevant work:
Brain dump 06 Mar 2019
Assumptions:
- Ledger with consensus
- Trust function
Payoff: The more trustworthy you are, the less disk I/O you need to do.
Trade-off: The more trusting you are, the less trustworthy you are (and vice-versa). However, once you have established yourself as trustworthy in the network, you can become more trusting. This means that new nodes will avoid you more as you become more popular, thus load balancing over the network and forming new trust clusters.
Idea: The higher your trust score, based on evidence in the ledger, the less evidence you need to provide. Everyone can calculate your trust score, given the complete ledger. To maintain replication guarantees, you cannot decide what information to destroy / stop delivering. One mechanism to ensure this replication would be the following:
Let h be a hashing function
Let H be the domain of h, in bit length
Let CSPRNG(iv) be a cryptographically secure pseudo-random number generation function with initialization vector iv and domain H
Let d be the Hamming distance function
Let D be the maximum Hamming distance allowed to maintain replication guarantees for the given network context (D < H)
Let t be the trust function defined on [0.0, 1.0] for a given public key
algorithm can_drop is
    input: public key Pk
    input: the information Ii to decide for, where i is the sequence number in the ledger
    output: whether or not to drop Ii

    if H - d(CSPRNG(h(Pk) || i), h(Ii)) > D + t(Pk) * (H - D) then
        return T
    else
        return F
    end if
Note that everyone can also determine whether or not you were allowed to drop information, based on evidence from the ledger and the same dropping logic. Failure to follow the protocol will lead to self-induced ostracisation.
Further note: this mechanism does not have suppression. But it may also not need the suppression primitive.
Here it is in function form for easier reading:
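A direct Python transcription of the pseudocode above, instantiating h as SHA-256 (so H = 256), modelling the CSPRNG as one more hash invocation, and picking D arbitrarily for illustration (both choices are mine, not prescribed by the mechanism):

```python
import hashlib

H = 256   # bit length of the SHA-256 domain
D = 200   # illustrative max Hamming distance (D < H); network-specific in practice

def h(data: bytes) -> int:
    """Hashing function h, mapping bytes into the H-bit domain."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def csprng(iv: bytes) -> int:
    """Stand-in CSPRNG over the H-bit domain, seeded deterministically by iv."""
    return h(iv)

def d(a: int, b: int) -> int:
    """Hamming distance between two H-bit integers."""
    return bin(a ^ b).count("1")

def can_drop(pk: bytes, i: int, info: bytes, t) -> bool:
    """Whether the peer with public key pk may drop ledger item i with content info.

    t is the trust function on [0.0, 1.0] for a given public key.
    """
    iv = hashlib.sha256(pk).digest() + i.to_bytes(8, "big")  # h(Pk) || i
    return H - d(csprng(iv), h(info)) > D + t(pk) * (H - D)
```

Because every input is deterministic and ledger-derived, any other node can re-run `can_drop` on the same evidence to audit a drop, which is exactly the self-ostracisation property noted above.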
Related work called "safe gossip" by the Maidsafe team; they did a badly executed, early, few-million ICO. Push-pull gossip protocol. They use the Scott Shenker approach from 2000.
We investigate whether such a large communication overhead is inherent to epidemic algorithms. On the positive side, we show that the communication overhead can be reduced significantly. We give an algorithm using only O(n ln ln n) transmissions and O(ln n) rounds. In addition, we prove the robustness of this algorithm, e.g., against adversarial failures.
struct Network {
pool: CpuPool,
// An mpsc channel sender for each node for giving new client messages to that node.
message_senders: Vec<mpsc::UnboundedSender<String>>,
// An mpsc channel receiver for getting the client messages and stats from the nodes.
stats_receiver: mpsc::UnboundedReceiver<(Id, Vec<String>, Statistics)>,
// The last set of client messages received via `stats_receiver` for each node.
received_messages: HashMap<Id, Vec<String>>,
// The last set of stats received via `stats_receiver` for each node.
stats: HashMap<Id, Statistics>,
// The futures for all nodes. When these return ready, that node has finished running.
node_futures: Vec<CpuFuture<(), Error>>,
// All messages sent in the order they were passed in. Tuple contains the message and the index
// of the node used to send.
client_messages: Vec<(String, usize)>,
// Message which when sent to a node via its `message_sender` indicates to the node that it
// should terminate.
termination_message: String,
}
The first experiment will be centered around decentralizing Twitter. It is based on a public Twitter data set of a flash crowd, over a one week timespan and for roughly 550k interactions. Here are the first visualizations of that data set and an elementary reciprocity function applied to it.
Stay tuned for the graphs on replication factor and cache misses over time.
The second experiment will probably use TrustChain.
Here is the replication graph after the first 100 hours of data processing (the first ~90 hours on 8 cores @ 3.2 GHz, the last ~20 hours assisted by another 12 cores @ 3.4 GHz). This contains the first ~180,000,000 data points (304k users).
As I don't feel like waiting another 400 hours I think I'll just make the graph datapoints a bit coarser.
Note that the graph does get more noisy as the relative amount of trusted peers increases (the [0, 200] interval and the [900, 1000] interval are captured at the same granularity). This is by design, as the probabilistic element of the replication function is starting to become significant to ensure replication.
Here are the raw datapoints:
First results for deduplication in a completely decentralized setting (CHECO + TrustChain in this case), using PageRank for reputation. First, the number of items that would be stored using CHECO normally:
Then with DELUT applied:
These experiments have a fixed 4 tps rate per peer.
Ranking of means (p-value < 10^-6) smallest amount of duplication to largest:
Related work: the I2P people also built a gossip overlay (they have 32k nodes). The Network Database offers:
I2P's netDb is a specialized distributed database, containing just two types of data - router contact information (RouterInfos) and destination contact information (LeaseSets). Each piece of data is signed by the appropriate party and verified by anyone who uses or stores it. In addition, the data has liveliness information within it, allowing irrelevant entries to be dropped, newer entries to replace older ones, and protection against certain classes of attack.
The netDb is distributed with a simple technique called "floodfill", where a subset of all routers, called "floodfill routers", maintains the distributed database.
Floodfill is their networkDatabase update mechanism and seems vulnerable and under-documented.
The floodfill netDb is a simple distributed storage mechanism. The storage algorithm is simple: send the data to the closest peer that has advertised itself as a floodfill router. When a peer in the floodfill netDb receives a netDb store from a peer not in the floodfill netDb, they send it to a subset of the floodfill netDb peers. The peers selected are the ones closest (according to the XOR-metric) to a specific key.
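The XOR-metric peer selection described above fits in a few lines. This is a generic sketch (I2P's actual router identifiers and keys are more involved; equal-length byte strings are assumed here):

```python
def xor_distance(a: bytes, b: bytes) -> int:
    """XOR metric between two equal-length identifiers, interpreted as an integer."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")

def closest_floodfill(key: bytes, floodfill_peers: list[bytes]) -> bytes:
    """Pick the advertised floodfill router whose identifier is XOR-closest to the key."""
    return min(floodfill_peers, key=lambda peer: xor_distance(peer, key))
```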
Not unlike our community numbering they also have identifiers:
Related work, also for the @alexander-stannat work. "RepuCoin: Your Reputation is Your Power". Show how to "grow" a reputation in mid-life and mature system. Quote:
Related work:
Related work:
Based on a June 2002 algorithm. Gossip library designed at Cornell: "Smudge is a minimalist Go implementation of the SWIM (Scalable Weakly-consistent Infection-style process group Membership) protocol for cluster node membership, status dissemination, and failure detection, developed at Cornell University by Motivala, et al. It isn't a distributed data store in its own right, but rather a framework intended to facilitate the construction of such systems." GITHUB: https://github.com/clockworksoul/smudge
Based on the historical 2001 work "On scalable and efficient distributed failure detectors".
Super naive all-to-all. Oops, no security, reputation, spam measures, or trust! (because this was, like, 21 years ago)
2021 related work: BASALT: A Rock-Solid Foundation for Epidemic Consensus Algorithms in Very Large, Very Open Networks. It also uses the idea of diversity to fight Sybils. Quote: "we will consider two paradigmatic scenarios where an attacker attempts a Sybil attack using many IP addresses".
Complex solution:
Quite clever use of dictated randomness via a pseudo-random function. But this still gives bias and is not strictly unbiased randomness. Difficult to predict whether this design will actually work on the real NAT-based Internet.
definition by repeatedly exchanging neighbor lists with other peers, discovering at each step peers that better match its ranking functions. In the following, we first detail the use of ranking functions to identify target neighbors. Then we discuss how nodes update these ranking functions to make the random graph dynamic.
This thesis project explores making the connection between gossip and trust.
This work directly ties in with the mission of the lab Distributed Trust Design, see #3571. We are currently implementing incremental personalised pagerank #2805 and IPv8 distributed apps #2943.
This work shall demonstrate in 3 use-cases the linkage of IPv8 community gossip mechanism with trust calculations:
Sprint milestones:
Trustworthy_Gossip(suppress,delete,store,collect,spread)
Our #10 operational prototype from 2009 and 2013. Uses random graph walk: