use graph for build-sequence

shubhbapna commented 1 month ago

The graph gives us a more accurate recipe of what needs to be built and how. Currently we use the build-order.json file and build packages from the top down, and for each package we reconstruct the build dependencies using the prepare_build_environment function. If we use the graph, we already have this information and don't need to compute the set of build dependencies again. We also have the exact version to use for each requirement, instead of relying on the constraints file when installing these build dependencies. This will fix #391.

shubhbapna commented 1 month ago

Another advantage is that it would let us implement some parallelism (relates to #65) by picking packages smartly and building them in parallel.

rd4398 commented 1 month ago

+1 to using graph.json for build-sequence; it looks like it will improve how we build wheels and make the process more efficient. Regarding implementing parallelism, I am a bit unclear about how graph.json will help in picking packages smartly.

shubhbapna commented 1 month ago

> Regarding implementing parallelism, I am a bit unclear about how graph.json will help in picking packages smartly

We can use topological sorting to get an ordering of the packages, which will also help us identify packages that are independent of each other. Those packages can be built in parallel.
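To make the idea concrete, here is a minimal sketch using `graphlib.TopologicalSorter` from the Python standard library (3.9+). The `build_deps` mapping and the package names in it are made up for illustration; this is not the real graph.json schema, only a plain "package -> packages that must be built first" dict. Each batch it produces contains packages that do not depend on each other, so they are candidates for building in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: package -> packages that must be built first.
# The names and the dict shape are illustrative, not the graph.json format.
build_deps = {
    "setuptools": set(),
    "wheel": {"setuptools"},
    "flit-core": set(),
    "packaging": {"flit-core"},
    "my-app": {"wheel", "packaging"},
}


def parallel_batches(deps):
    """Group packages into batches whose members are independent of each other."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        # Everything returned here has all of its dependencies already built.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)  # mark the batch as built to unlock dependents
    return batches


batches = parallel_batches(build_deps)
for number, batch in enumerate(batches):
    print(number, batch)
```

In a real build, `sorter.done()` could be called as each individual package finishes instead of once per batch, which would let dependents start as soon as their own dependencies are ready.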

Additionally, certain heavy packages take a lot of resources to build, and we want to build those one at a time. We can store how long each package took to build in the graph and use that as an indicator of a resource-heavy package, so that such packages are never built concurrently.
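A rough sketch of how recorded build times could drive that decision, given a list of packages that are ready to build. Everything here is an assumption for illustration: the `HEAVY_SECONDS` threshold, the `build_seconds` mapping, the `build_one` callable, and the choice to treat packages with no recorded time as light.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed cutoff: a package whose previous build took at least this long
# is treated as resource-heavy. The value is arbitrary.
HEAVY_SECONDS = 600


def split_by_cost(ready, build_seconds, threshold=HEAVY_SECONDS):
    """Split ready packages into (light, heavy) using recorded build times.

    Packages with no recorded time are treated as light (an assumption).
    """
    light = [p for p in ready if build_seconds.get(p, 0) < threshold]
    heavy = [p for p in ready if build_seconds.get(p, 0) >= threshold]
    return light, heavy


def build_batch(ready, build_seconds, build_one, max_workers=4):
    """Build light packages concurrently, then heavy packages one at a time."""
    light, heavy = split_by_cost(ready, build_seconds)
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps the input order and re-raises any build exception.
        results.extend(pool.map(build_one, light))
    for package in heavy:
        results.append(build_one(package))  # heavy builds never overlap
    return results


# Hypothetical timings in seconds, as they might be stored per package.
build_seconds = {"numpy": 900.0, "idna": 4.0, "torch": 7200.0}
light, heavy = split_by_cost(["idna", "numpy", "six", "torch"], build_seconds)
print(light, heavy)
```

Build time is only a proxy for resource usage; a package can be slow without using much memory or many cores, so the threshold would likely need tuning.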

Alternatively, we could look for certain files in a package's sdist that indicate whether it is going to be resource-intensive to build (suggested by @tiran).

python-wheel-build / fromager

use graph for build-sequence #439