Open dpc opened 5 years ago
To explain:
/- b
root -> a - c
\- d
In this case, if both b
and d
create a review, it will count only as one review, to prevent a
from being able to create and trust fake IDs, to increase review count.
it will only count as one review
Am I correct in saying that the count, in this case, is made from a
point of view?
Shouldn't we avoid trusting nodes who we suspect of trusting and creating fake nodes anyway?
Shouldn't we avoid trusting nodes who we suspect of trusting and creating fake nodes anyway?
Of course. But trust is sometimes broken, and accounts get compromised. Currently I believe that after crev
gets popular, everyone should require at least 2 reviews to trust anything. And such review should be "uncorrelated". This issue is a method to make sure no account can increase the review count by creating sock-puppet accounts.
I think this would be solved by computing the max-flow from the root to each package, where people are nodes. Does this match what you're looking for?
@pimotte I am not sure which algorithm you refer to, and I am not that well versed with graph algorithms. After brief looking at https://en.wikipedia.org/wiki/Maximum_flow_problem, I guess one could say I'm looking for max-flow, if all edges have flow of 1, the source is root of the WoT, and all trusted users that have reviewed the package are sinks.
Something to consider regarding max flow algorithm:
http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002600.html
I've read http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002600.html and I pretty much agree with everything there.
This might affect #329 .
I was naively thinking about implementing a proper max-flow algorithm, but something like tracking number for each node, and splitting it between trusted 2nd-level node is probably easier and can be done breadth-first.
TODO:
Could you elaborate a bit on what you mean with "tracking number"?
Other thoughts:
This last bit is actually the counter-intuitive part about all these mechanisms. If I trust Alice and Alice trusts Eve, but I don't trust Eve, than that should lead me to re-evaluate my trust in Alice. If a crev result shows me this, it's very easy to say "oh, but I want to downgrade Eve, so let's tell that to crev". If the user can see that crev is trusting Eve because of Alice, then I get to choose: I downgrade my trust in Alice, since her judgment of Eve affects it, or I adjust my assessment of Eve, because I trust Alice's judgment of her above my own. In neither case I'll input something directly related to Eve, even though she "started" this.
but something like tracking number for each node
That's just my way of summing up my shallow understanding of Advogato - keep track of how much flow each node has (some number).
After reading through http://www.squarefree.com/trust/trust.pdf
Looking at these trust metrics etc. I think there are some fundamental differences between how they work, and how I perceive a problem.
First, for me the trust is not binary. For some reason most (all?) the existing trust solutions are binary. Something is either trustworthy or not. In crev, there are levels (and maybe also "confidence"). So the trust is more nuanced. Even without any flow analysis, the trust level can be easily made to "dissipate" as the distance from the root grows.
Then, I consider everything subjective and want to embrace that subjectivity. There is no one global "this is trustworthy" vs "this isn't". It's all opinions, and depending on circumstances two different agents, with same knowledge, can reach different conclusions about what is trustworthy.
Third, a combination of the previous two: there is no fixed point where something is trustworthy. Even the same person, with the same WoT, can run multiple verification runs, with different parameters (trading: depth, required trust-level, required understanding level, required redundancy, etc for each other), and look for the most susceptible pieces of code to review themselves.
If I trust Alice and Alice trusts Eve, but I don't trust Eve, than that should lead me to re-evaluate my trust in Alice.
The subjectivity is the reason why I don't think it's necessarily the case. It would make sense if we considered everything objective. But if it's subjective, then it's totally fine to have a different opinion. With the confidence
stuff, there will be a way to possibly overwrite inferred trust if needed. But that will require a conscious decision of someone analyzing their WoT.
Originally I wanted flow analysis pretty much just to make the redundancy
option (number of required reviews) to be attack-proof. So that someone in the WoT can't just create sock-puppets to create "copies" of the review that would count as many different reviews.
So my thinking was - after finding all the reviews, just run some kind of a max-flow algorithm from all the matching reviews to the root, and drop/merge multiple reviews coming through the same trust-bottleneck to count as one.
The problem I'm looking at might not be a max-flow algorithm... . I'm only looking at flow of 1 everywhere, and I'm not interested in the flow value, but at actually pruning "duplicates". A naive version seems fast and simple:
However the result is not "stable" and depends on the order of how nodes were sorted, and how they picked the previous node to use.
But it seems to me that if every time when taking a previous node while having a choice of some other previous nodes, we note the unused previous nodes on the used previous node and allow subsequent "traverse the path back to root" to use the previous nodes that weren't taken like they were previous nodes of the currently consider node itself ... things might work OK? It seems to fix the problem where traversal uses a path that is needed for another traversal, instead of a path that no one else uses.
It's late and I'm probably rumbling, but maybe myself I'll be able to decipher it, once I get back to it.
This is related to #44 .
Right now WoT graph is build by just a cost-bounded flooding of a graph. This makes it possible for anyone to create a new CrevId, trust it, and this way artificially increase the possible count of reviewes for a given crate.
This algorithm should keep track of path (id(s) of that directly trusted this one), on each step to the root of the trust tree, and so that when calculating the count of reviews, it's possible to "merge" reviews coming out from a common path, for the purpose of calculating the total trust count.
Example: If you directly trust only one other CrevId, you can only have trust count equal to 1, for any given crate, no matter how many people reviewed it.