timholy / SnoopCompile.jl

Provide insights about latency (TTFX) for Julia packages
https://timholy.github.io/SnoopCompile.jl/dev/
Other
309 stars 48 forks source link

Rethink how we use Preferences #356

Closed oxinabox closed 1 year ago

oxinabox commented 1 year ago

doing set_preferences!(SnoopPrecompile, "skip_precompile" => ["MyTopLevelPackage",]) invalidates a ton of precompilation. About 40 deps on my current project. Even though it seems like it shouldn't since i only disabled it for the top-level project, not for any of its dependencies. So it should have been 1 not 40. And the whole point was to spend less time precompiling

According to @KristofferC this is because we invalidate every package that uses that preference. And that basically the same as every package that depends on SnoopPrecompile. And since invalidation is transitive: right now that is: 3,828 packages out of the current total 9238 registered packages. If we exclude jlls which have no depencencies it is out of 8004 packages. So almost half of all packages.

This means basically if you touch that setting you are going to be redoing a lot of precompilation, half as bad as installing a new julia version.

A possibility might be that we can have a setting per package. So it would be set_preferences!(MyTopLevelPackage, "skip_snooped_precompile" => true

timholy commented 1 year ago

Hmm, you're right that as more and more packages use it, the "escape" I thought we had in https://discourse.julialang.org/t/psa-for-snoopprecompile-turning-off-extra-workload-for-specific-packages/92865/5?u=tim.holy doesn't really scale.

We should change this while also moving SnoopPrecompile to JuliaLang and possibly renaming it.

serenity4 commented 1 year ago

With a package preference, you would still need to re-precompile a package, even though the intention was to specify that you don't want it to run the precompilation workload in future invalidations. To avoid this, we could use an environment variable instead, e.g. JULIA_SKIP_SNOOPED_PRECOMPILE=PackageA,PackageB with a coma-separated list of packages. One downside however is that re-enabling the run of precompilation workloads for a package will not cause the package to re-precompile. Would we have a way to manually invalidate a package and retrigger precompilation of affected packages in a project?

timholy commented 1 year ago

The whole point of using Preferences is to avoid the myriad problems that come from using an environment variable. In the worst case you can get segfaults if the compilation status of the system is inconsistent.

serenity4 commented 1 year ago

I see the value of Preferences in the general case, but here might we not assume that a package's logic should be unaffected by whether the precompilation workload is run or not? From my perspective, the related code would only control the amount of things that end up in the package image. That is based on the assumption that no methods, types or globals are defined in the @precompile_all_calls block. Under this assumption, are the potential issues you mention still likely to happen?

Whether or not we'd like to make such an assumption would be another question.

In any case, having a preference per-package would be a clear improvement over the status quo.

timholy commented 1 year ago

Starting with 1.9, precompilation & package loading basically work like a (very flexible) C linker. That's how we handle the native code. If you expect to be able to link against things that are already precompiled (because they were available when you compiled the package), and then those objects disappear behind your back, bad things happen. Consistency across the entire set of dependencies is paramount, and there's no way to guarantee consistency if environment variables control what does and does not get precompiled.

serenity4 commented 1 year ago

I see, though I wonder if that situation would happen in practice. The only way I see is if the build of a whole project somehow relies on two independent builds of that same package, which produce different images; but otherwise, wouldn't the state of the compiled code for a given package be read directly from its package image after precompilation? Given that the effect of such an environment variable would always be tied to an invalidation of the whole chain of dependency downstream, I do not see how objects would be able to vanish without triggering recompilation of dependent code.

timholy commented 1 year ago

I'm reasonably sure it can happen (we did get segfaults before we introduced this feature when people were rolling their own). I think it can come from the following combination of factors:

I think this is all you need to have bad things happen in practice: say A gets used by B & C; B gets precompiled without A being precompiled, but C gets precompiled with A being precompiled. If you load B before C, boom.

timholy commented 1 year ago

Fixed by PrecompileTools