JuliaLang / PackageCompiler.jl

Compile your Julia Package
https://julialang.github.io/PackageCompiler.jl/dev/

RFC: Modularizing PackageCompiler.jl #858

Open sloede opened 9 months ago

sloede commented 9 months ago

PackageCompiler.jl is a great tool and IMHO a vital part of the journey towards making Julia more universally deployable. At the moment, there are three main entry points, serving the three main purposes of creating sysimages for reduced latency, standalone apps that can be deployed without a Julia installation, and standalone libraries (also independently deployable).

Over time, these three functions have grown tremendously in capability, which is reflected in the huge number of arguments they take. Besides making the functions somewhat unwieldy and not overly "Julian", this also makes it hard to integrate PackageCompiler.jl builds into more complex build workflows that use, e.g., CMake.

I've been pondering this for a while now, and I believe there might be a solution: by decomposing these three main functions into individual, independent parts, using Julia's type system, we could make the individual steps of the build process more composable. This would allow users to make their builds more flexible and would hopefully open up some potential for caching intermediate results.
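As a rough illustration of what "decomposing into independent parts using the type system" could look like, here is a minimal sketch. All type and function names in it (`BuildStep`, `BaseSysimage`, `SysimageObject`, `SharedLibrary`, `describe`, `plan`) are hypothetical and not part of PackageCompiler.jl, and the steps only describe their work instead of doing it.

```julia
# Hypothetical sketch: none of these names exist in PackageCompiler.jl.
abstract type BuildStep end

# Each step records only the inputs it needs, plus the step it depends on.
struct BaseSysimage <: BuildStep
    stdlibs::Vector{String}
end

struct SysimageObject <: BuildStep
    base::BaseSysimage
    packages::Vector{String}
end

struct SharedLibrary <: BuildStep
    object::SysimageObject
    c_sources::Vector{String}
end

# `describe` stands in for the real work; it only reports what a step would do.
describe(s::BaseSysimage) = "base sysimage with stdlibs: " * join(s.stdlibs, ", ")
describe(s::SysimageObject) = "object file for packages: " * join(s.packages, ", ")
describe(s::SharedLibrary) = "shared library linking: " * join(s.c_sources, ", ")

# Walk the dependency chain so that prerequisites come first.
plan(s::BaseSysimage) = [describe(s)]
plan(s::SysimageObject) = vcat(plan(s.base), describe(s))
plan(s::SharedLibrary) = vcat(plan(s.object), describe(s))

base = BaseSysimage(["Libdl"])
obj = SysimageObject(base, ["MyLib"])
lib = SharedLibrary(obj, ["init.c"])
foreach(println, plan(lib))
```

Because each step is an ordinary value, an external build system could in principle construct and run just one of them, which is the kind of composability the paragraph above is after.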

From an initial survey of the current code, I could imagine creating the following types, each representing one part of the build step (names TBD):

The idea would be that for, e.g., a library, I would

My goal is that, with such a more modular approach, we can then go ahead and think about caching intermediate results. For example, if we hashed the arguments to the current create_fresh_base_sysimage (essentially a list of strings) together with the Julia version, we could skip regenerating the base sysimage during each build. Similarly, it would let me avoid rebuilding sysimage_obj_file if I just want to add or modify the C files with the initialization functions.
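The hashing idea can be sketched roughly as follows. This is only an illustration under stated assumptions: `cache_key` and `cached` are hypothetical helpers, not PackageCompiler.jl functions, the arguments are assumed to be a plain vector of strings, and the "build" is a placeholder that writes a dummy file.

```julia
using SHA  # standard library

# Hypothetical helper: derive a cache key from the string arguments of a build
# step plus the Julia version. A NUL separator keeps ["ab", "c"] and
# ["a", "bc"] from hashing to the same key.
function cache_key(args::Vector{String}; julia_version::VersionNumber = VERSION)
    payload = join(vcat(string(julia_version), args), '\0')
    return bytes2hex(sha256(payload))
end

# Hypothetical helper: reuse a cached artifact if one exists for this key,
# otherwise call `build(path)` to create it.
function cached(build::Function, cache_dir::String, args::Vector{String})
    path = joinpath(cache_dir, cache_key(args))
    isfile(path) || build(path)
    return path
end

mktempdir() do dir
    builds = Ref(0)
    fake_build(path) = (builds[] += 1; write(path, "placeholder"))
    cached(fake_build, dir, ["Libdl", "Random"])
    cached(fake_build, dir, ["Libdl", "Random"])  # second call hits the cache
    println("builds performed: ", builds[])       # prints 1
end
```

A real implementation would likely need more in the key than this (for example compiler flags or the target CPU), so the sketch shows the mechanism rather than a complete invalidation scheme.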

I am probably missing something (e.g., maybe we need a Config or Context object to pass information around that is needed in multiple places, such as project paths and cache directories), but hopefully this can serve as a starting point for a discussion on whether a) such an approach is feasible, b) it is desirable, and c) ultimately whether there are maybe better ways to achieve the desired goals.
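For the Config/Context idea, a minimal sketch might look like the following. `BuildContext` and `output_path` are hypothetical names, and the fields are just the examples mentioned above (project path, cache directory) plus the Julia version.

```julia
# Hypothetical sketch; `BuildContext` is not a PackageCompiler.jl type.
Base.@kwdef struct BuildContext
    project::String = pwd()
    cache_dir::String = joinpath(tempdir(), "build_cache")
    julia_version::VersionNumber = VERSION
end

# A step receives the context instead of a long list of keyword arguments.
output_path(ctx::BuildContext, name::String) = joinpath(ctx.cache_dir, name)

ctx = BuildContext(project = "/path/to/MyLib")
println(output_path(ctx, "sysimage.o"))
```

Shared settings then live in one place, and each build step only takes the additional arguments specific to it.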

Comments/suggestions/hole poking welcome 🙂

KristofferC commented 7 months ago

The code restructuring sounds like a good idea to me, but I don't see how it helps with the fact that the entry points we have (create_XXX) take a lot of options. To me, this looks more like internal code refactoring, not something directly user-facing?