crev-dev / cargo-crev

A cryptographically verifiable code review system for the cargo (Rust) package manager.
Apache License 2.0

cargo_crev_web - web server to query reviews from cargo-crev #334

Closed: bestia-dev closed this issue 3 years ago

bestia-dev commented 4 years ago

Hello, I like the cargo-crev system a lot, but it needs a much larger audience to spread. I think that the reviews of crates need to be on the web so that anybody can see them. I prepared an idea. You can try it here:

alternatives: https://bestia.dev/cargo_crev_web/query/btoi
issues: https://bestia.dev/cargo_crev_web/query/num-traits
advisory: https://bestia.dev/cargo_crev_web/query/protobuf

The code is here: https://github.com/LucianoBestia/cargo_crev_web

This is a prototype. I just copied the *.crev files from my cache to see how it works. I imagine the first question is: Every developer has a different list of trusted people. How should we deal with it? I suppose that these reviews on the web are just informative. Better to have more reviews than fewer. Somebody with experience could suggest whom to trust, and this list could be modified at any time. On the web server, cargo-crev will update the cache/crev every day. The reviews are pulled from the repos of trusted persons, and of their trusted persons.

I would like to hear what you think about that. Thanks.

dpc commented 4 years ago

This is sweet! Ping @kornelski who is working on something similar/related.

The output is very terse, and could use some tooltip hints to explain what is what.

> Every developer has a different list of trusted people.

crev can calculate a web of trust for a given ID, which is public and does not require exposing any secrets. A lot of cargo-crev commands take `--for-id <CREVID>` and are able to work from the perspective of any other user. I could imagine that one could specify their own CREVID in the web UI, and that it could be stored in a cookie or something. When not specified, some well-connected default could be used.
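A minimal sketch of that idea, assuming the web UI stores the viewer's ID in a cookie. The cookie name `crev_id`, the placeholder default ID, and the character check are all hypothetical; only the `--for-id` flag comes from the discussion above.

```rust
/// Hypothetical placeholder: the ID of a well-connected reviewer picked by the site operator.
const DEFAULT_CREV_ID: &str = "well-connected-default-id";

/// Loose sanity check (an assumption, not crev's real validation):
/// accept only URL-safe base64 characters so the value is safe to pass on.
fn is_plausible_crev_id(id: &str) -> bool {
    !id.is_empty()
        && id
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}

/// Extract the hypothetical `crev_id` cookie from a raw `Cookie` header value.
fn crev_id_from_cookie(cookie_header: &str) -> Option<&str> {
    cookie_header
        .split(';')
        .filter_map(|pair| pair.trim().split_once('='))
        .find(|(name, _)| *name == "crev_id")
        .map(|(_, value)| value)
        .filter(|value| is_plausible_crev_id(value))
}

/// Arguments to append to a cargo-crev command so it works from the viewer's
/// perspective, falling back to the default ID when no valid cookie is present.
fn for_id_args<'a>(cookie_header: Option<&'a str>) -> [&'a str; 2] {
    let id = cookie_header
        .and_then(crev_id_from_cookie)
        .unwrap_or(DEFAULT_CREV_ID);
    ["--for-id", id]
}
```

Checking the cookie value before it reaches a command line matters here, because the value comes straight from the browser.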

I have acquired http://crev.dev, so if you're interested I could configure something like http://web.crev.dev or http://crev.dev/webui to point/redirect to it. I don't really have much time to help on the code itself, but I'm very happy to help enable/unblock initiatives like this.

ffranr commented 4 years ago

@LucianoBestia this looks like a great initiative! I've had similar thoughts on the subject of increasing crev visibility. I wonder if a README shield could link to your web service. See: https://github.com/crev-dev/cargo-crev/issues/335.

bestia-dev commented 4 years ago

Thank you. Some visual polishing is required; for now it is just a sketch. I will look at the --for-id option. This looks very promising. I think it is important to have a "well-connected default" for all the people who don't use crev yet, so they will be curious and start to use it.

My virtual machine at Google is on the free plan, very small and weak. They call it micro. I don't plan to make a production server there. After some time, if this turns out nice, we could start to search for a decent web server for real-world use. Then the crev.dev domain will come in handy.

Can you imagine a link for crev reviews on crates.io, just like repository or documentation? I would like that. I miss that.

dpc commented 4 years ago

Eventually, we could just create a PR to crates.io that would integrate well with it. But we should have a stand-alone version that is language agnostic anyway.

When this is ready, I think I can just sponsor some initial hosting, and later maybe we can find a sponsor etc.

bestia-dev commented 4 years ago

I am installing cargo-crev on my tiny web server. It takes forever. I am compiling from source. I plan to run `cargo crev repo fetch all` to fetch all repos. Do you have an idea how big that is? I hope it is not in gigabytes.

How could I make a list of trusted people? I don't want to create a repo for the web server if possible. No new reviews will ever be created there. Only fetch, verify and query.

kornelski commented 4 years ago

Fetch will be small in terms of bandwidth (1MB maybe). It is high latency, because it may need to make requests to dozens of repositories.

To make a list of trusted people, create your crev identity and use `cargo crev id trust <hash>`. In the latest version you can also do `cargo crev trust <url of someone's crev-proofs repo>`.

You don't need to do that on the server. On the server you only need to fetch your repo where you've published everyone you trust. `cargo crev repo fetch all` will find everyone from there.

dpc commented 4 years ago

```
> du -cksh ~/.cache/crev/
16M     /home/dpc/.cache/crev/
16M     total
```

@LucianoBestia if you're worried about space issues, you should probably not use `fetch all`. There's currently nothing in cargo-crev to prevent malicious repositories from being added into the WoT somewhere. `fetch trusted` might be safer.

We should add some sanity limits to cargo crev, like limiting fetching time to 60s, each repo size to 32MB, and the number of ids one can trust to 100, or something like that. Otherwise it's possible to create sock-puppet accounts with heavy repos etc.
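A rough sketch of what such limits could look like as data plus a check. None of these types exist in cargo-crev; the names are hypothetical and only the three numbers (60s, 32MB, 100 ids) come from the suggestion above.

```rust
use std::time::Duration;

/// Hypothetical limits, using the numbers floated in the discussion.
#[derive(Debug, Clone, PartialEq)]
struct FetchLimits {
    max_fetch_time: Duration,
    max_repo_bytes: u64,
    max_trusted_ids: usize,
}

impl Default for FetchLimits {
    fn default() -> Self {
        Self {
            max_fetch_time: Duration::from_secs(60),
            max_repo_bytes: 32 * 1024 * 1024,
            max_trusted_ids: 100,
        }
    }
}

/// What was measured for one proof repository (hypothetical shape).
struct RepoStats {
    fetch_time: Duration,
    size_bytes: u64,
    trusted_ids: usize,
}

#[derive(Debug, PartialEq)]
enum LimitViolation {
    FetchTooSlow,
    RepoTooLarge,
    TooManyTrustedIds,
}

/// Return the first limit a repository exceeds, if any.
fn check_repo(stats: &RepoStats, limits: &FetchLimits) -> Result<(), LimitViolation> {
    if stats.fetch_time > limits.max_fetch_time {
        return Err(LimitViolation::FetchTooSlow);
    }
    if stats.size_bytes > limits.max_repo_bytes {
        return Err(LimitViolation::RepoTooLarge);
    }
    if stats.trusted_ids > limits.max_trusted_ids {
        return Err(LimitViolation::TooManyTrustedIds);
    }
    Ok(())
}
```

A repository that fails the check could simply be skipped when building the web of trust, which would blunt the heavy sock-puppet repos mentioned above.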

I don't expect any abuse just yet, but eventually it will start to happen. Just like with PGP https://gist.github.com/rjhansen/67ab921ffb4084c865b3618d6955275f.

bestia-dev commented 4 years ago

It's alive! Fresh reviews every hour (example): https://bestia.dev/cargo_crev_web/query/num-traits

Today I installed cargo-crev on the web server. I wanted a separate identity for cargo_crev_web, not my personal id. I created a new GitHub user "cargo-crev-web" and forked https://github.com/crev-dev/crev-proofs. I created a new id with `cargo crev id new --url https://github.com/cargo-crev-web/crev-proofs`. I trusted dpc with `cargo crev id trust dpc` and published that. I scheduled `cargo crev repo fetch trusted` to run every hour. It takes neither much time nor much space for now.

I would like to add more trusted users. Do you have a list to suggest? Thanks.
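For illustration, the hourly fetch could also be driven from the web server process itself instead of an external scheduler. This is only a sketch: the function names are made up, and it assumes `cargo` and cargo-crev are installed and on the PATH.

```rust
use std::process::Command;
use std::time::Duration;

/// How often to refresh the proofs; one hour, as described above.
const FETCH_INTERVAL: Duration = Duration::from_secs(60 * 60);

/// Build the `cargo crev repo fetch trusted` command without running it.
fn fetch_trusted_command() -> Command {
    let mut cmd = Command::new("cargo");
    cmd.args(["crev", "repo", "fetch", "trusted"]);
    cmd
}

/// Run the fetch forever, sleeping between runs.
/// A failed fetch is logged and retried on the next tick;
/// an error is returned only if the process cannot be started at all.
fn run_fetch_loop() -> std::io::Result<()> {
    loop {
        let status = fetch_trusted_command().status()?;
        if !status.success() {
            eprintln!("fetch trusted exited with {}", status);
        }
        std::thread::sleep(FETCH_INTERVAL);
    }
}
```

Calling `run_fetch_loop` from a dedicated thread would keep the web handlers responsive while the fetch runs.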

ffranr commented 4 years ago

That's great, nice work!

You can find a list of proof repositories here: https://github.com/crev-dev/cargo-crev/wiki/List-of-Proof-Repositories

I'm not sure if they are trusted, and they might already be trusted through @dpc.

bestia-dev commented 4 years ago

Thank you. I would like to have a curated and opinionated list from somebody more experienced than me. Not simply all the reviewers, and probably not only the ones trusted by dpc. A special list for this use case.

bestia-dev commented 4 years ago

I would also like to add a simple verify for the queried crate. Like: [image]

I tried `cargo crev crate verify --unrelated num-traits`, but I get: Error: Unrealated crates are currently not supported. On the web server I will not have a Cargo.toml for verify. It will always be some unrelated crate.

kornelski commented 4 years ago

It needs to do dependency resolution, so there's no easy way around that. You could create a Cargo.toml in a temporary directory and let crev use that.

If you don't want to use Cargo.toml, you'll have to do dependency resolution yourself, and then read info for each crate from crev-lib.
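A minimal sketch of the temporary Cargo.toml approach, using only the standard library. The package name `crev-web-probe` and the helper names are hypothetical; the web server would then run the verify command from inside the generated directory.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Crate names are inserted into TOML, so accept only a conservative character set.
fn is_safe_crate_name(name: &str) -> bool {
    !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}

/// Reject anything that could break out of the quoted TOML string.
fn is_safe_version_req(req: &str) -> bool {
    !req.is_empty() && req.chars().all(|c| !c.is_control() && c != '"' && c != '\\')
}

/// Render a manifest whose only dependency is the queried crate.
fn manifest_for(crate_name: &str, version_req: &str) -> String {
    format!(
        "[package]\nname = \"crev-web-probe\"\nversion = \"0.0.0\"\nedition = \"2018\"\n\n[dependencies]\n{} = \"{}\"\n",
        crate_name, version_req
    )
}

/// Write a throwaway project under `root` and return the path of its Cargo.toml.
fn write_probe_project(root: &Path, crate_name: &str, version_req: &str) -> io::Result<PathBuf> {
    if !is_safe_crate_name(crate_name) || !is_safe_version_req(version_req) {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "unsafe crate name or version requirement",
        ));
    }
    let src = root.join("src");
    fs::create_dir_all(&src)?;
    // Cargo needs at least one target, so create an empty library.
    fs::write(src.join("lib.rs"), "")?;
    let manifest = root.join("Cargo.toml");
    fs::write(&manifest, manifest_for(crate_name, version_req))?;
    Ok(manifest)
}
```

The validation is there because the crate name comes from the query URL and ends up inside a file that cargo will parse.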

dpc commented 4 years ago

In the long run I would generally advise using crev-lib instead of cargo-crev binaries. Also - please remember that we might soon have more languages supported than just Rust.

bestia-dev commented 4 years ago

I will take a look at crev-lib. Thanks.

kornelski commented 4 years ago

I've added the reviews page to lib.rs:

https://lib.rs/crates/num-traits/crev

bestia-dev commented 4 years ago

Very nice @kornelski :-) I like Lib.rs a lot. It is quite different from crates.io. But everybody goes to crates.io because of the Cargo.toml dependencies. Probably a lot of developers don't know about lib.rs. I didn't for a long time.

I added a summary of the crate reviews to my web page: the number of strong ratings, of positive ratings, ... of advisories. The summary is per crate and per version. And there are hints on mouse-over everywhere there is no other label. https://bestia.dev/cargo_crev_web/query/num-traits

Now I use an HTML templating approach, so that someone can make the html+css look good. They can do it with just 2 static files on their computer. No server, nothing dynamic. Then I add comments in the html to replace the static content with dynamic content.

[image]

kornelski commented 4 years ago

Oh, a table summary is pretty cool.

Another idea I've had, used in the GUI I'm writing, is to distil all the reviews into a numeric score:

[image: Screenshot 2020-05-12 at 18 55 04]

and I'm counting negative reviews separately from positive reviews, so you get two scores, and you can see when opinions are divided.

The score is weighted by trust * thoroughness * understanding * version_closeness, so that it is biased towards higher quality, more relevant reviews.
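To make the weighting concrete, here is a small sketch. The struct, the numeric values in [0, 1], and the choice to sum the weights are assumptions for illustration; only the four factors and the separate positive/negative totals come from the description above.

```rust
/// One review reduced to numeric factors, each assumed to lie in [0.0, 1.0].
#[derive(Clone, Copy)]
struct Review {
    positive: bool,
    trust: f64,
    thoroughness: f64,
    understanding: f64,
    version_closeness: f64,
}

/// Weight of a single review: the product of the four factors.
fn weight(review: &Review) -> f64 {
    review.trust * review.thoroughness * review.understanding * review.version_closeness
}

/// Return (positive_score, negative_score), keeping the two apart
/// so that divided opinions stay visible.
fn scores(reviews: &[Review]) -> (f64, f64) {
    let mut positive = 0.0;
    let mut negative = 0.0;
    for review in reviews {
        if review.positive {
            positive += weight(review);
        } else {
            negative += weight(review);
        }
    }
    (positive, negative)
}
```

Because the factors multiply, a review that scores near zero on any one of them (for example, a very distant version) contributes almost nothing.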

dpc commented 4 years ago

@kornelski : At this point lib.rs seems much better than crates.io to me, even without crev reviews. It's really awesome!

@LucianoBestia This table is really cool. Minor detail idea: change its column titles to icons, or just remove them completely and rely on colors + tooltips.

When designing reviews, one of my goals was to avoid specifying one universal formula for rating things, and allow different tools to experiment with rating systems, so I enjoy a lot seeing different ideas in this area. :)

bestia-dev commented 4 years ago

I added a "filter". A click on most of the numbers in the summary will filter the list of reviews. I added separate links to lib.rs and crates.io, to emphasize them as comparable alternatives.

I think I don't have enough trusted reviewers to see enough reviews. @kornelski how did you choose your trusted reviewers? Maybe it is a good idea to add them all and then blacklist (remove) the problematic ones?

[image]

kornelski commented 4 years ago

For now I've fetched everyone I could. If that gets problematic, I'm going to require trust level > none and bootstrap it by trusting several well-known users.

I'm adding rel="nofollow ugc" to links, so hopefully nobody will bother to spam it.
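As an illustration of that link handling, a sketch that renders a user-supplied URL with the `rel` attribute and HTML escaping. The function names and the http(s)-only rule are assumptions, not lib.rs code.

```rust
/// Escape the characters that are special in HTML text and attribute values.
fn escape_html(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for c in text.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#39;"),
            _ => out.push(c),
        }
    }
    out
}

/// Render a link to user-generated content, or `None` for non-http(s) URLs.
fn review_link(url: &str, label: &str) -> Option<String> {
    if !(url.starts_with("https://") || url.starts_with("http://")) {
        return None;
    }
    Some(format!(
        "<a href=\"{}\" rel=\"nofollow ugc\">{}</a>",
        escape_html(url),
        escape_html(label)
    ))
}
```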

I could also add voting "helpful/unhelpful" to the site: https://github.com/crev-dev/cargo-crev/issues/264#issuecomment-612860278

bestia-dev commented 4 years ago

I need some "statistics" to see how many reviews or crates or authors are there. The first one is group_by_crate: https://bestia.dev/cargo_crev_web/info/group_by_crate/

[image]

dpc commented 4 years ago

@LucianoBestia Is it deduplicating by author overwriting review of the same version, etc.?

bestia-dev commented 4 years ago

Yes. I am counting distinct authors per crate, so I deduplicate. My next step is to make a list grouped by author. Then I will add more trusted authors, so I will be able to compare the numbers before and after. To make it run faster I will cache an index of the proof data in memory and access the disk only for longer data (text). I expect better performance, but more importantly less work for the server.
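A sketch of both kinds of deduplication discussed here: keeping only the newest review per (crate, version, author), and counting distinct authors per crate. The `ReviewMeta` struct is a hypothetical in-memory index entry, and dates are assumed to be ISO 8601 strings so that string comparison matches chronological order.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

/// Hypothetical index entry: the small fields kept in memory for each review.
struct ReviewMeta {
    crate_name: String,
    version: String,
    author_id: String,
    /// ISO 8601 date, e.g. "2020-05-16" (assumed format).
    date: String,
}

/// Keep only the newest review for each (crate, version, author) triple.
fn dedup_latest(reviews: Vec<ReviewMeta>) -> Vec<ReviewMeta> {
    let mut latest: HashMap<(String, String, String), ReviewMeta> = HashMap::new();
    for review in reviews {
        let key = (
            review.crate_name.clone(),
            review.version.clone(),
            review.author_id.clone(),
        );
        let is_newer = latest
            .get(&key)
            .map_or(true, |existing| review.date > existing.date);
        if is_newer {
            latest.insert(key, review);
        }
    }
    let mut out: Vec<ReviewMeta> = latest.into_values().collect();
    // Sort for a stable, predictable order.
    out.sort_by(|a, b| {
        (&a.crate_name, &a.version, &a.author_id).cmp(&(&b.crate_name, &b.version, &b.author_id))
    });
    out
}

/// Count distinct authors per crate, across all versions.
fn authors_per_crate(reviews: &[ReviewMeta]) -> BTreeMap<String, usize> {
    let mut authors: BTreeMap<&str, BTreeSet<&str>> = BTreeMap::new();
    for review in reviews {
        authors
            .entry(review.crate_name.as_str())
            .or_default()
            .insert(review.author_id.as_str());
    }
    authors
        .into_iter()
        .map(|(name, ids)| (name.to_string(), ids.len()))
        .collect()
}
```

Keeping only these short fields in memory fits the plan above of reading the longer review text from disk on demand.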

dpc commented 4 years ago

Now that we've extracted the crev-wot crate, you could consider importing crev-data for parsing, and using crev-wot for handling indexing etc.

bestia-dev commented 4 years ago

I am new to Rust and I need some more time to learn how to use different types, iterators, serde, traits, modules and impl. I have a lot of work, and I need to start from the basics. Later I will switch to your library, but now it is too soon. I need to do some more experimenting. I added a page for group by author and reviews by author. Now that I have some "statistics" I can compare the numbers before and after. Next I will add all reviewers I can find.

[image]

[image]

bestia-dev commented 4 years ago

I added the cargo crev commands to fetch and trust an author. I think it is a good place for that.

https://bestia.dev/cargo_crev_web/author/FYlr8YoYGVvDwHQxqEIs89reKKDy-oWisoO0qXXEfHE/

If I made a typo, please correct me.

bestia-dev commented 4 years ago

As of 2020-05-27, cargo_crev_web has these reviews: authors: 55, crates: 514, reviews: 886.

If you have more, I would like to know and to add them.

bestia-dev commented 4 years ago

I added badges to my web service. For a start I return the number of all existing crev reviews. It is pretty difficult to come up with just one number for ratings, because there are multiple versions. Eventually we will get there too.

[image]

See it working here: https://github.com/LucianoBestia/reader_for_microxml

The markdown code is:

```
[![crev_count](
https://bestia.dev/cargo_crev_web/badge/crev_count/reader_for_microxml.svg
)](https://bestia.dev/cargo_crev_web/crate/reader_for_microxml)
```

dpc commented 4 years ago

Yeah. I don't even think it's very helpful to try to come up with a rating. "Reviewed X times with crev" is probably a much more useful and objective metric.
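The badge markdown posted earlier follows a fixed URL pattern, so it can be generated per crate. A small sketch, where the helper name is made up and the URLs are the ones from the reader_for_microxml example:

```rust
/// Base URL of the web service, as used in the badge example in this thread.
const BASE_URL: &str = "https://bestia.dev/cargo_crev_web";

/// Build the README markdown for a crate's crev_count badge,
/// linking the badge image to the crate's review page.
fn badge_markdown(crate_name: &str) -> String {
    format!(
        "[![crev_count]({base}/badge/crev_count/{name}.svg)]({base}/crate/{name})",
        base = BASE_URL,
        name = crate_name
    )
}
```

This emits the markdown on a single line, which renders the same as the three-line form above.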

Great feature!

bestia-dev commented 4 years ago

Maybe it is time to use your domain name, so users will not get confused? The basic functionality is already usable. For some time my small server will suffice. When there is more traffic we can just copy it to another VM. Do you have experience with pointing your domain name to my IP?

bestia-dev commented 4 years ago

I suggest this url: https://crev.dev/cargo_crev_web/

bestia-dev commented 4 years ago

I got my domain from Porkbun. These are the settings: [image: Screenshot_20200531-155040_Chrome]

bestia-dev commented 4 years ago

Can we continue with issues on cargo_crev_web github? https://github.com/LucianoBestia/cargo_crev_web/issues/1#issue-628162323

And can we close this one, so it does not grow long with very different topics?

dpc commented 4 years ago

I subscribed to the issue you've pointed to. I might need some time, but I'd like to point a cname/a/ns record of web.crev.dev.

bestia-dev commented 4 years ago

Great! I love subdomains. They give a lot of freedom to have separate servers and applications in the background. I hope the repetition of terms does not bother you. I am more comfortable also having a separate sub-directory (for freedom of future use cases): https://web.crev.dev/cargo_crev_web/ Nobody will really type this url manually. There will always be some link or search involved.

bestia-dev commented 2 years ago

Can you please forward http(s)://crev.dev to https://web.crev.dev? Now if I type only crev.dev in my browser's address bar I get "This site can’t be reached", and this is not nice for users. I know about your plans to put some other info there, but until it is ready, I would like to see a simple forward or redirect. Thanks.