Open Erithax opened 6 months ago
Thanks for the suggestion. Do I understand correctly that what you'd like is basically snapshots of the package.edition report of https://rust-digger.code-maven.com/msrv where, instead of simply counting the crates in each "bucket", we would sum the number of downloads of the crates in it?
Right now Rust-Digger is updated once a day and lags behind Crates.io by about 1.5 days. I am planning to reduce this to an hour or even less, but I have to make sure consuming the Crates.io API this way is fine. However, even with such a lag, what you are suggesting might be interesting.
Another thing that needs to be kept in mind is that right now Rust-Digger looks at the latest version of each crate and, as far as I know, the download count is per version of each crate. So if Crate X v1.07 switches to edition 2024, then we need to be aware of this and sum up the downloads for v1.07 and all later versions.
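To illustrate the per-version point above, here is a minimal sketch of attributing each version's downloads to the edition that version declares. The `VersionInfo` struct and its field names are made up for this example; they are not the Crates.io or Rust-Digger schema.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-version record (illustrative fields, not a real schema).
pub struct VersionInfo {
    pub version: &'static str,
    pub edition: &'static str,
    pub downloads: u64,
}

/// Sums the downloads of one crate's versions, grouped by the edition
/// each individual version declares.
pub fn downloads_by_edition(versions: &[VersionInfo]) -> BTreeMap<&'static str, u64> {
    let mut totals = BTreeMap::new();
    for v in versions {
        *totals.entry(v.edition).or_insert(0) += v.downloads;
    }
    totals
}
```

With this, a crate whose older versions are on 2021 and whose newer versions are on 2024 contributes to both buckets, rather than having all of its downloads counted under the edition of its latest version.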
Oh, I didn't know there was already a website. I'm a bit surprised to still see 1/3rd of crates on edition 2018, even recently updated ones (and some heavy hitters like Anyhow). I assumed that on edition release there would've been more of a push to move away from the old edition. So I guess hourly checks upon release will not be useful as it will probably move quite slowly. Still I think history graphs of edition and MSRV would be interesting.
About the weighting by downloads, I had something simpler in mind. Namely that you plot the current (made-up code)
// (edition, number of crates on this edition)
(edition, crates.iter().filter(|c| c.last_version.edition == edition).count())
and also
// (edition, total number of downloads of crates on this edition / total number of crate downloads)
(
edition,
crates.iter().filter(|c| c.last_version.edition == edition).map(|c| c.total_downloads()).sum::<u64>() as f64 /
crates.iter().map(|c| c.total_downloads()).sum::<u64>() as f64
)
So the crates are weighted by popularity (by way of total number of downloads), but there are definitely more 'correct' metrics, like what you suggested.
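For reference, the two metrics described above could be computed in a single pass roughly like this. This is only a sketch: `CrateInfo` is a hypothetical record holding the edition of a crate's latest version and its all-time download count, not an actual Rust-Digger type.

```rust
use std::collections::BTreeMap;

/// Hypothetical record: edition of the latest version plus all-time downloads.
pub struct CrateInfo {
    pub edition: &'static str,
    pub total_downloads: u64,
}

/// For each edition, returns (share of crates, share of downloads),
/// both as fractions between 0.0 and 1.0.
pub fn edition_shares(crates: &[CrateInfo]) -> BTreeMap<&'static str, (f64, f64)> {
    let total_crates = crates.len() as f64;
    let total_downloads: u64 = crates.iter().map(|c| c.total_downloads).sum();

    // Accumulate (crate count, download sum) per edition.
    let mut buckets: BTreeMap<&'static str, (u64, u64)> = BTreeMap::new();
    for c in crates {
        let entry = buckets.entry(c.edition).or_insert((0, 0));
        entry.0 += 1;
        entry.1 += c.total_downloads;
    }

    buckets
        .into_iter()
        .map(|(edition, (count, downloads))| {
            let crate_share = count as f64 / total_crates;
            // Guard against a data set with zero downloads overall.
            let download_share = if total_downloads == 0 {
                0.0
            } else {
                downloads as f64 / total_downloads as f64
            };
            (edition, (crate_share, download_share))
        })
        .collect()
}
```

Converting to `f64` before dividing matters here: dividing the integer sums directly would truncate every share to 0.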
One day it would be really cool if, when a crate wants to bump its MSRV or edition, it could check how many crates, active crates, and crates weighted by popularity depend on it and would have to bump their own MSRV and edition as a consequence. E.g. Anyhow wants to move to edition 2021 and Rust version 1.X: how many dependants will have to bump their MSRV and/or edition? We probably do not care much about crates which haven't been updated in 4 years, and care more about crates with many downloads.
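A rough sketch of what such a check might compute, under some loud assumptions: only direct dependants are considered, a dependant that declares no MSRV is treated as unaffected, and the `Dependent` and `Impact` types are invented for this example.

```rust
/// Hypothetical view of one reverse dependency.
pub struct Dependent {
    pub name: &'static str,
    /// (major, minor) of the declared rust-version; None if undeclared.
    pub msrv: Option<(u32, u32)>,
    pub downloads: u64,
    pub years_since_update: u32,
}

/// Summary of who would be forced to bump their MSRV.
pub struct Impact {
    pub crates: usize,
    pub active_crates: usize,
    pub downloads: u64,
}

/// Counts dependants whose declared MSRV is lower than `new_msrv`.
/// A dependant counts as "active" if updated less than `max_age_years` ago.
pub fn msrv_bump_impact(
    dependents: &[Dependent],
    new_msrv: (u32, u32),
    max_age_years: u32,
) -> Impact {
    let mut impact = Impact { crates: 0, active_crates: 0, downloads: 0 };
    for d in dependents {
        // Tuples compare lexicographically, so (1, 56) < (1, 65) holds.
        let must_bump = matches!(d.msrv, Some(current) if current < new_msrv);
        if !must_bump {
            continue;
        }
        impact.crates += 1;
        impact.downloads += d.downloads;
        if d.years_since_update < max_age_years {
            impact.active_crates += 1;
        }
    }
    impact
}
```

The same shape would work for editions by comparing edition years instead of version tuples. Transitive dependants are deliberately left out of this sketch.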
I think the historical data can be collected at any time, up to the limit of what Crates.io itself retains, and it might require a lot of requests to Crates.io to download the Cargo.toml file of each version of each crate. That's probably doable but will take a while to process.
Your other suggestion can probably be achieved with a page for each crate that shows all its reverse dependencies and what edition and Rust version they require.
Hi! I was thinking it would be cool to have a website with a graph displaying the evolution of the percentage of crates.io crates on edition 2024 and the same, but weighted by number of downloads.
I'd be down to make the website for this. Could rust-digger generate these metrics, and how expensive would updating them be? If it takes hours to update, this is probably not feasible.
I'm thinking of hosting it on GitHub Pages and just updating it manually by regenerating locally and committing. First every hour, then every day, then every week until we wrap it up. (cf. erithax.com, which is hosted on GitHub Pages, and how I update the GitHub stars on there). Doing it automatically via a GitHub Action would be neat, but I don't know if that's free.
What do you think?