The motivation for this is to get an initial sense of what the data looks like and the possible complexity we need to tackle.
OK, I think this is looking pretty solid.
Here is a sheet of top dependencies with some basic filters applied.
Here is the notebook used to generate the analysis. It includes some charts that I couldn't copy over here due to slow internet (✈ 🤕 )
Quick description of the methodology:
We start with the SBOM (Software Bill of Materials) for every project on OSO (2000+ projects). More than 80K dependencies are captured this way.
We drill down on the projects that are part of one or more of our OP collections (anything in a past RF round, as well as many other onchain projects and grant recipients). A total of 638 projects have at least one dependency. NPM and Rust are the clear favorites, with a little bit of Go and Python (PIP) as well. This brings us down to around 50K dependencies.
Then, we can look specifically at the onchain projects. This gets us to 348 projects, of which NPM is still by far the most popular, with Rust a distant second. This is similar to what Faina predicted. This still leaves about 40K dependencies.
Finally, I added some simple filters to catch some of the common web2 packages that are not really relevant. We can refine this if necessary, but the end result is around 8K packages with at least 3 dependents. If you check out the notebook, there's a scatterplot at the bottom showing the filtering technique. The points that are above the line and farther to the right are effectively "popular dependencies for onchain projects". For example, `ethers` is in 90% of onchain projects and 79% of all OP projects, showing that it is hugely popular and more popular with onchain projects than with other types of projects. Meanwhile, `ipfs-pubsub-peer-monitor` is only in 3 projects, so it is much more niche.
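To make the popularity comparison concrete, here is a minimal sketch of how the two shares could be computed. It is not the notebook's actual code: the input shapes, the function name `dependency_shares`, and the field names are all hypothetical, and treating "above the line" as "onchain share greater than overall share" is an assumption about how the scatterplot is drawn.

```python
from collections import defaultdict


def dependency_shares(project_deps, onchain_projects, min_dependents=3):
    """Compute, per package, its share of all projects and of onchain projects.

    project_deps: dict mapping project name -> iterable of package names
                  (hypothetical shape, not the actual OSO schema).
    onchain_projects: set of project names considered onchain.
    min_dependents: drop packages used by fewer projects than this.
    """
    all_counts = defaultdict(int)
    onchain_counts = defaultdict(int)
    for project, deps in project_deps.items():
        for dep in set(deps):  # count each package once per project
            all_counts[dep] += 1
            if project in onchain_projects:
                onchain_counts[dep] += 1

    n_all = len(project_deps)
    n_onchain = sum(1 for p in project_deps if p in onchain_projects)

    rows = []
    for dep, count in all_counts.items():
        if count < min_dependents:
            continue
        share_all = count / n_all
        share_onchain = onchain_counts[dep] / n_onchain if n_onchain else 0.0
        rows.append({
            "package": dep,
            "dependents": count,
            "share_all": share_all,
            "share_onchain": share_onchain,
            # Assumption: "above the line" means more common among onchain
            # projects than among all projects.
            "onchain_skewed": share_onchain > share_all,
        })

    # Most popular among onchain projects first.
    rows.sort(key=lambda r: r["share_onchain"], reverse=True)
    return rows
```

With real data, a package like `ethers` would show a high `share_onchain` and `onchain_skewed` set to true, while a niche package would sit right at the `min_dependents` cutoff.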
See also #2364
@ccerv1 this is great! Some things I'm curious about
Some general callouts:
What's the path forward? I'd love to move fast in understanding what we can get out of dependency data.
Update @JSeiferth
I have joined the initial dependency list on projects that we already have in OSO in v1 here.
In this version, we can see the popularity of a variety of OpenZeppelin packages:
We can also see that OpenZeppelin contracts is near the top of the list in terms of onchain projects (ignoring some of the web2 libraries like babel).
I'm going to close out this exploratory work and create a new issue that tracks some of the experimental metrics for dev tooling.
cc @Jabolol here is my mapping script for the SBOM <-> repo <-> project logic. I wrote it last week with lots of distractions and flaky internet, but it shows the basics of how to connect the data.
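For readers without access to the script, the SBOM <-> repo <-> project join can be sketched roughly as follows. This is only an illustration of the idea, not the linked script: the function name `deps_by_project`, the `(repo, package)` row shape, and the `repo_to_project` lookup are all assumed for the example.

```python
from collections import defaultdict


def deps_by_project(sbom_rows, repo_to_project):
    """Roll repo-level SBOM entries up to the project level.

    sbom_rows: iterable of (repo, package) pairs, one per SBOM entry
               (hypothetical shape).
    repo_to_project: dict mapping repo identifier -> project name
                     (hypothetical lookup).
    """
    project_deps = defaultdict(set)
    for repo, package in sbom_rows:
        project = repo_to_project.get(repo)
        if project is None:
            continue  # repo is not attached to any known project
        # A set deduplicates packages shared by several repos of one project.
        project_deps[project].add(package)
    return dict(project_deps)
```

Using a set per project means a package pulled in by several repos of the same project still counts as a single dependent when tallying popularity.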
What is it?
Jonas is asking for a preliminary list of top packages in the Optimism collective.
Just looking to sanity check the initial dataset we have; this can just be a CSV dump.