Debian packages that may need closer examination for bootstrappability
This repo contains scripts and lists of packages that have some kind of build self-dependency, derived from Debian's dependency information.
First, there are 98 "essential" packages built from 67 source packages. These packages need to be installed on every Debian system, so they cannot participate in dependency resolution.
Next, building a package always needs an additional 40 "build-essential" packages installed, which are built from 11 source packages. Therefore, these packages cannot participate in dependency resolution either and have to be manually checked.
In addition, there are 41 source packages that directly build-depend on a package that is built by the package itself, so these are the most direct dependencies that should be manually checked.
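The direct case above boils down to a set intersection per source package. Here is a minimal sketch, assuming the build-dependencies and the produced binary packages have already been extracted into two dictionaries; the function name and the toy package names are hypothetical and not taken from this repo's scripts.

```python
def direct_self_dependencies(build_depends, builds):
    """Return source packages that build-depend on one of their own binaries.

    build_depends: source name -> set of binary packages it build-depends on
    builds:        source name -> set of binary packages it produces
    """
    return sorted(
        src for src, deps in build_depends.items()
        if deps & builds.get(src, set())
    )


# Hypothetical toy data, not real Debian package names.
build_depends = {
    "compiler-a": {"compiler-a-bin", "make"},
    "lib-b": {"compiler-a-bin"},
}
builds = {
    "compiler-a": {"compiler-a-bin"},
    "lib-b": {"lib-b1"},
}
print(direct_self_dependencies(build_depends, builds))
```

This sketch ignores alternatives, version constraints and virtual packages, which real Build-Depends fields can contain.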
The rest gets hairy. The only "accurate" way I found to detect more dependency cycles is to resolve the build-dependencies for each package individually, translate the binary packages in those dependencies to their source packages and analyze the resulting graph. Debian's resolver is not the fastest, so I took a shortcut. I first used botch to create a graph that contains every package relationship (build-depends, is-built-from, as well as virtual package alternatives). This graph contains too many edges, but it can be used as an "upper limit": a package that is not part of a cycle in this graph cannot be part of a cycle in the accurate one. I repeatedly removed all sources and sinks (nodes without incoming or without outgoing edges), then created a list of the remaining source package nodes (about 3000) and ran the resolver for each of them.
213 source packages need a binary package that they build themselves to be installed at build time, without actually declaring that dependency directly.
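The pruning step can be sketched as follows. This is only an illustration, assuming the graph is available as a list of (from, to) edge pairs; it is not the script used in this repo, and the function name is made up. Nodes that are sources or sinks can never lie on a cycle, so removing them until none are left keeps every cycle intact.

```python
def prune_sources_and_sinks(edges):
    """Repeatedly drop nodes with no incoming or no outgoing edges.

    edges: iterable of (u, v) pairs. Returns the set of remaining nodes.
    """
    succ, pred = {}, {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
        succ.setdefault(v, set())
        pred.setdefault(v, set()).add(u)
        pred.setdefault(u, set())

    nodes = set(succ)
    # Start with everything that is currently a source or a sink.
    queue = [n for n in nodes if not succ[n] or not pred[n]]
    while queue:
        n = queue.pop()
        if n not in nodes:
            continue  # already removed via another path
        nodes.discard(n)
        # Removing n may turn its neighbours into new sources or sinks.
        for m in succ[n]:
            pred[m].discard(n)
            if m in nodes and not pred[m]:
                queue.append(m)
        for m in pred[n]:
            succ[m].discard(n)
            if m in nodes and not succ[m]:
                queue.append(m)
    return nodes


# Toy graph: a -> b <-> c -> d; only the b/c cycle survives.
edges = [("a", "b"), ("b", "c"), ("c", "b"), ("c", "d")]
print(prune_sources_and_sinks(edges))
```

What remains is a superset of the nodes on cycles (nodes on a path between two cycles also survive), which is why the expensive resolver run is still needed afterwards.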
There are 11 additional small strongly connected components (cycles) consisting of 34 source packages with cycle lengths of up to three.
But the rest of the resulting graph is a mess... It consists of only three strongly connected components, but their sizes are 16 (PHP-related), 31 (Ruby-related) and 749 (!).
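For readers who want to reproduce the component sizes from a source-package graph, strongly connected components can be computed with Kosaraju's algorithm. The following is a minimal iterative sketch on a toy graph, assuming the graph is a dictionary from node to successors; it is not necessarily how the numbers above were produced.

```python
def strongly_connected_components(graph):
    """Kosaraju's algorithm, iterative. graph: node -> iterable of successors."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    reverse = {n: [] for n in nodes}
    for u, targets in graph.items():
        for v in targets:
            reverse[v].append(u)

    # First pass: record nodes in order of DFS finishing time.
    order, seen = [], set()
    for root in nodes:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(graph.get(root, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:
                order.append(node)
                stack.pop()

    # Second pass: walk the reversed graph in reverse finishing order.
    components, assigned = [], set()
    for root in reversed(order):
        if root in assigned:
            continue
        assigned.add(root)
        component, todo = [], [root]
        while todo:
            node = todo.pop()
            component.append(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    todo.append(prev)
        components.append(sorted(component))
    return components


# Toy graph with two cycles of length two and one isolated node.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["d"], "d": ["c"], "e": []}
components = strongly_connected_components(graph)
print(sorted(len(c) for c in components if len(c) > 1))
```

The iterative form avoids Python's recursion limit, which a component with hundreds of nodes could otherwise hit.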
For the last one I was not even able to create an acceptable visualization (the best I could get is interactively exploring the graph with yEd), so I provide the raw data instead.
This text file contains all three big components, not only the largest one.
The author of botch reached out and pointed out that botch could in fact have done that kind of dependency resolution as well, and that they even provide a page with weekly dependency statistics.
More interesting details can be found in the issue itself.