Closed chrysn closed 4 years ago
b) it's mixing up whom dpc trusts vs. who claims to have a public repository.
I'd like to add that a trust proof with `level: none` is very useful for "advertising, but not trusting, some id", and I am using it quite liberally. Other users will see such a trust proof and fetch that id on `fetch all`, but without any initial trust. If people were liberal in using it, then new ids would be rather quickly discoverable and the id/trust network quite dense.
You could write a very simple automated bot where anyone can add an id + url and get an automatic trust proof with `level: none`. The only problem with trusting just about anyone is spam and other malicious content, huge repos, etc.
Privacy problems can be addressed by using Tor, or by downloading bundles of proofs aggregated by other services, and so on. I wouldn't worry too much about it yet.
`level: none` proofs sound like a straightforward way to go, and a builder for those should be easy to do.
Maybe a CI-based autobuilder could have its secret key encrypted in the repository, with a key only known to the operator and provided by the environment. The identity would be advertised as "don't give this any non-none trust, we'll do our best to only emit none signatures ourselves but don't take our word for it."
Will give it a try.
Sounds good.
... and it's running: https://gitlab.com/crev-dev/auto-crev-proofs/ periodically wakes up and queries the GitHub and GitLab APIs for forks. (And external ones can be included in a text file, though I don't expect that this will be used often.)
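The fork-scraping step can be sketched roughly as follows. This is not the code of auto-crev-proofs, only a minimal illustration. It assumes the pages have already been downloaded and parsed from GitHub's `GET /repos/{owner}/{repo}/forks` endpoint, and it only relies on the `html_url` field of each repository object. The function name `fork_urls` is hypothetical.

```python
def fork_urls(pages):
    """Collect repository URLs from already-parsed pages of a forks listing.

    Assumption: each page is a list of repository objects as returned by
    GitHub's forks endpoint; only the `html_url` field is read.
    """
    urls = set()
    for page in pages:
        for repo in page:
            url = repo.get("html_url")
            if url:
                # Drop a trailing slash so the same repo is not listed twice.
                urls.add(url.rstrip("/"))
    return sorted(urls)
```

Keeping the network access out of the function makes it easy to run the same logic on GitLab results, as long as they are mapped to the same shape first.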
That's about as much as I can do; if you're happy with it, you may want to give that ID a `level: none` trust, or even put it in as an additional starting point.
BTW, I'd add at least one id to the `others.jsonl` file as an example, because I have no idea what its format should be, and more people will get confused.
I am building a web interface for cargo-crev: https://bestia.dev/cargo_crev_web/info/group_by_author/
It should eventually have all the review authors, plus a blacklist for bad, incomplete, or obsolete repos and bad authors. @Kornelski has already made a larger list of authors and uses it on lib.rs. I am still working on it. The web app has a dedicated crev ID and GitHub repo: https://github.com/cargo-crev-web/crev-proofs I think this could be the repo and ID that holds a list of all authors (except the bad ones). The web app could have a link to export the author list in a form that could be imported into a local cargo-crev, maybe as a JSON file to download.
I found https://gitlab.com/crev-dev/auto-crev-proofs. Can we say that this is the central repo that has the list of "all crev-proofs repos"? Is it possible to also add "blacklisted" repos, maybe by manually changing them to trusted:negative or similar? I would gladly use that list as the base for fetching repos for cargo_crev_web.
auto-crev-proofs scrapes the list of forks, so it's a good place to discover pretty much all crev proof repos.
We haven't had to deal with abuse yet. When every fork is added automatically, an abuser can insert junk data. It could be spam (make sure to add `rel="nofollow ugc"` to links). It could be DoS (a proof repo with gigabytes of junk, or millions of trust proofs for various URLs making fetch take forever, etc.). I'm not sure if we should be doing anything about it yet.
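For the spam side, here is a minimal sketch of how a web front end could render a user-supplied URL with `rel="nofollow ugc"`. The function name `render_untrusted_link` is hypothetical, and refusing anything that is not plain http(s) is an extra assumption of this sketch, not something the thread requires.

```python
from html import escape
from urllib.parse import urlsplit


def render_untrusted_link(url, text=None):
    """Render a user-supplied URL as an HTML link marked nofollow/ugc.

    Assumption: only http(s) URLs become links; anything else is shown
    as escaped plain text so it cannot act as a script URL.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return escape(url)
    label = escape(text if text is not None else url)
    return '<a href="{}" rel="nofollow ugc">{}</a>'.format(escape(url), label)
```

The DoS side (huge repos, millions of proofs) needs limits at fetch time and is not covered by this snippet.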
@kornelski, is your list the same as auto-crev-proofs, or do you have some more repos? I have 55 repos so far: https://bestia.dev/cargo_crev_web/info/group_by_author/ I will check and add the rest from the auto-crev-proofs list (around 20): https://bestia.dev/cargo_crev_web/reserved_folder/list_new_author_id/ I have a blacklist of incomplete or obsolete repos:
[
"https://github.com/confio/crev-proofs",
"https://github.com/dmerejkowsky/crev-proofs",
"https://github.com/jonas-schievink/crev-proofs",
"https://github.com/scott-wilson/crev-proofs",
"https://github.com/sphinxc0re/crev-proofs",
"https://github.com/Thinkofname/crev-proofs",
"https://github.com/thorhs/crev-proofs",
"https://github.com/adeschamps/crev-proofs",
"https://github.com/bjorn3/crev-proofs",
"https://github.com/cole-h/crev-proofs",
"https://github.com/dirvine/crev-proofs",
"https://github.com/Eraden/crev-proofs",
"https://github.com/ffranr/crev-proofs",
"https://github.com/alaric/crev-proofs",
"https://github.com/Flakebi/crev-proofs",
"https://github.com/JamesHinshelwood/crev-proofs",
"https://github.com/LaurenceGA/crev-proofs",
"https://github.com/maccam912/crev-proofs",
"https://github.com/crev-dev/crev-proofs",
"https://github.com/Alxandr/crev-proofs",
"https://github.com/pimotte/crev-proofs",
"https://github.com/Alexendoo/crev-proofs",
"https://github.com/ivanceras/crev-proofs",
"https://github.com/Gaelan/crev-proofs",
"https://github.com/hgzimmerman/crev-proofs",
"https://github.com/leo-lb/crev-proofs",
"https://github.com/sgeisler/crev-proofs",
"https://github.com/otavio/crev-proofs",
"https://github.com/frigus02/crev-proofs",
"https://github.com/jplatte/crev-proofs",
"https://github.com/mchesser/crev-proofs",
"https://github.com/braunse/crev-proofs",
"https://github.com/traxys/crev-proofs",
"https://github.com/alexmaco/crev-proofs",
"https://github.com/VictorKoenders/crev-proofs",
"https://github.com/bmhenry/crev-proofs",
"https://github.com/gilescope/crev-proofs",
"https://github.com/stusmall/crev-proofs",
"https://github.com/bwbroersma/crev-proofs"
]
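Applying such a blacklist to a discovered list could look like the sketch below. The names `normalize` and `filter_blacklisted` are hypothetical. Comparing URLs case-insensitively and ignoring a trailing `/` or `.git` are assumptions made here so that differently spelled URLs of the same repo still match.

```python
def normalize(url):
    """Normalize a repo URL for comparison (assumption: case-insensitive)."""
    url = url.strip().rstrip("/")
    if url.endswith(".git"):
        url = url[: -len(".git")]
    return url.lower()


def filter_blacklisted(discovered, blacklist):
    """Return discovered URLs that are not on the blacklist, order kept."""
    blocked = {normalize(u) for u in blacklist}
    return [u for u in discovered if normalize(u) not in blocked]
```

For example, with the list above loaded as `blacklist`, a discovered entry such as `https://github.com/thinkofname/crev-proofs.git` would be dropped, while a repo that is not listed passes through unchanged.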
I use auto-crev-proofs and repos listed in the wiki.
I now have these reviews on cargo_crev_web (2020-05-27): authors: 55, crates: 514, reviews: 886
If you have more, I would like to know and to add them.
`auto-crev-proofs` probably has every publicly known CrevID already. :) 55 authors seems about right.
The current story behind `cargo crev repo fetch all` is to download dpc's crev-proofs and everything from there. This is a fast starting point, but no long-term solution, because a) it's centralized, and b) it's mixing up whom dpc trusts vs. who claims to have a public repository. (And either way it'll need better documentation.)
Suggested plan:
Either
[x] create a crev ID that claims it trusts everyone who ever published a repository,
This has the upside that it works with crev as is deployed, but the downside that an unprotected private key would be published, and we'd probably want to have a way to force crev not to give that repository any trust better than "none" ever.
[ ] or create a repository list format ("plain text file of git URIs", "JSON array" thereof or "anything with links" may be good enough).
Either way, then we could `fetch all` (possibly making it configurable).
Estimating the complexity of the follow-up steps, I'd be leaning towards having a plain link list: both the scraping and the PRs would be much easier that way. known_cargo_owners may lead the way here.
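To illustrate how little a plain link list needs, here is a sketch of a reader that accepts either of the two formats mentioned above: a plain text file of git URIs, or a JSON array of them. The function name `parse_repo_list` is hypothetical, and treating lines that start with `#` as comments is an assumption of this sketch, not part of any agreed format.

```python
import json


def parse_repo_list(text):
    """Parse a repository list given as a JSON array or as plain text.

    Assumptions: a document starting with '[' is a JSON array of strings;
    otherwise there is one URI per line, and blank lines and lines
    starting with '#' are skipped.
    """
    stripped = text.strip()
    if stripped.startswith("["):
        items = json.loads(stripped)
        if not all(isinstance(item, str) for item in items):
            raise ValueError("expected a JSON array of strings")
    else:
        items = [
            line.strip()
            for line in stripped.splitlines()
            if line.strip() and not line.strip().startswith("#")
        ]
    # Drop duplicates while keeping the original order.
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result
```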
It might just as well be transported in git repositories, such that a `fetch all` would also traverse (unsigned, fully untrusted, "just as good as found anywhere on the web") links filed with any of the discovered repositories. This would allow easier scaling out, because if someone comes up with an enhanced way to find crev proofs (say, scraping keybase), they wouldn't need to make a PR to update the central CI, but could rather have that scraper run into their own repository and just link that.
One tricky issue that might need further discussion is the privacy aspect (a user running their own git server could practically see every invocation of `cargo crev repo fetch` across the Internet, as can GitHub and GitLab), but this proposal doesn't make that any worse than it already is (every repository that somehow winds up in today's tree receives the same kind of pings).
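The traversal of untrusted links described above amounts to a breadth-first walk with a visited set. The sketch below is not how cargo-crev implements `fetch all`; the names `discover`, `fetch_links`, and `max_repos` are hypothetical, and the cap on the number of repositories is an added assumption to bound the work when link lists are junk.

```python
from collections import deque


def discover(start_urls, fetch_links, max_repos=1000):
    """Walk link lists breadth-first, starting from the given repositories.

    `fetch_links(url)` is a caller-supplied function (hypothetical) that
    returns the untrusted links filed with the repository at `url`.
    The walk stops after `max_repos` repositories have been visited.
    """
    visited = []
    visited_set = set()
    queue = deque(start_urls)
    while queue and len(visited) < max_repos:
        url = queue.popleft()
        if url in visited_set:
            continue  # already handled, which also breaks link cycles
        visited_set.add(url)
        visited.append(url)
        for linked in fetch_links(url):
            if linked not in visited_set:
                queue.append(linked)
    return visited
```

Discovered repositories get no trust from being found this way; the walk only decides what to download.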