JuliaLang / PrecompileTools.jl

Reduce time-to-first-execution of Julia code
MIT License
204 stars 11 forks source link

Catch errors during precompilation #21

Closed mkitti closed 3 months ago

mkitti commented 1 year ago

Here we catch errors during precompilation so that the user still has the possibility of loading the package for diagnostics.

timholy commented 1 year ago

I'll link to https://discourse.julialang.org/t/vscode-julia-server-fails-precompilation/98865/7?u=tim.holy as a discussion-starter, but I'm happy to keep chatting about it here. Thanks for taking an interest in this even if we decide not to merge this!

timholy commented 1 year ago

I saw your response in discourse. Since that discussion has become a mix of user-level assistantance and design questions about PrecompileTools, I'd rather keep the latter here. Copying out a key portion of your reply:

Packages should not fail during precompilation or loading, especially if the failure is not critical to the function a package. The consequences for failing are severe since all packages which depend on the failing package will also fail to load. A package which does not load at all is also much more difficult to debug.

I do see your point here. However, I'm not sure I entirely agree, for two reasons:

Catching errors during @compile_workload, reporting them, and resolving them may require repetitive non-trivial code...

The vast majority of packages don't have errors in their precompile workloads. I'm a bit unsure about how effective a generic "respond to Preferences errors" would be (wouldn't it depend on the specific preference option?). Requiring that the developer add a try/catch also gives them a chance to respond specifically to problems.

All that said, I'm not saying I think this decides it; I'm still willing to be convinced, but I'm not there yet.

mkitti commented 1 year ago

No package should have errors during their precompilation process. Sometimes it happens though.

If I want my package to have a robust loading process, then this suggests I should execute the bare minimum for loading. However, if I want to lower compilation latency for my package ,then I need to execute more code at the risk that executing that workload could fail and prevent my package from loading. This is a difficult choice. We should make it easier for package developers to lower complilation latency without taking on the risk.

Say I'm a new package developer, and I take the risk of executing a workflow for precompilation. I throw a try / catch around my @compile_workload, then what? If I just report it to the terminal, precompilation succeeds. There are no subsequent warnings that something went wrong during precompilation. There is also no method for the user to force precompilation of the package again.

Thus I build a custom system for my package to determine if precompilation succeeded well. I decide to store the error in a global constant in my module that gets stored in the precompiled cache. I display a warning if precompilation failed, redisplaying the error. The warning provides instructions to delete the cache file to force precompilation again. Maybe the issue was just a network error that automatically resolves itself.

Then John Doe builds another custom system for their package to determine if precompilation succeeded. John decided he will take a different approach. He will print the error to a Julia file in the deps folder. On __init__ he includes that file in and displays the error. To force precompilation, John tells the user to delete the error file in the deps folder.

Jane Doe tries to use both my package and John's package. She's having problems with both packages. She sees two different ways to force precompilation again and two different reporting mechanisms. Jane wonders why there is not some common package with Tools to make the Precompilation debugging process consistent. Maybe that package should be PrecompileTools.jl.

mkitti commented 1 year ago

More succinctly and in code, what is the recommended try / catch pattern?

try
    @setup_workload begin
        # Setup
        @compile_workload begin
            try
            # main workload
            catch err
               # catch expected errors here
            end
        end
    end
catch err
    bt = catch_backtrace()
    @error "An unexpected error ocurred during precompilation" exception=(err,bt)

    # If we rethrow the error, we're more likely to catch the error during development.
    # This has the benefit that precompilation will be triggered anew when the user tries to import the package.
    # If the error persists though, they cannot load the package at all.
    if developing
        rethrow(err)
    end

    # Once released into the wild, we may want to just let the package attempt to continue
    # even if there is a precompilation error.
    # Perhaps if we allow the user to continue, they can modify the environment, or insert methods to use the package.
end

...

def __init__()
   # generate an annoying warning if precompilation failed
end
timholy commented 1 year ago

I would use the "inner" one, the try/catch inside @compile_workload and have it issue a warning in case of error. E.g.,

try
   run_some_code()
catch err
    @error "Errorred while running the workload, the package may work (or may not) but latency will be long" exeption=(err,catch_backtrace())
end

That seems pretty easy? I don't currently see the use case for the more complicated version, but it probably depends on what kinds of applications one is thinking about.

mkitti commented 1 year ago
  1. How would users detect if an error occurred during precompilation of a particular package after the initial precompilation?
  2. How do users force precompilation of a particular package after seeing that an error occurred?

An error message that occurs during parallel precompilation could be easily missed and ignored. If they then show in one of our community channels, do we have a way of diagnosing precompilation problems after the fact? Currently, my best recommendation would be to wipe out the DEPOT_PATH[1]/.compiled/v1.9 directory.

timholy commented 1 year ago

(On phone, traveling next few days) If you say precompiled in an env and it fails and you don't notice, then you say using pkgname and it hasn't precompiled, it will start recompiling at the failure point. Then you see the error and it fails to load. Seems pretty clear to me, but maybe we do things with different workflows?

Edit: I guess it won't fail if you make the error nonfatal. But we already report stderr msgs at the end of parallel precompilation.

On 2, I think Pkg.build("pkgname") works but not sure. Pkg.precompile may also take an arg now, can't remember.

mkitti commented 1 year ago

I created an issue about a force option for precompile.

https://github.com/JuliaLang/Pkg.jl/issues/3483

I did find one way to force precompilation:

force_precompile(package_name::AbstractString) = Base.compilecache(Base.identify_package(String(s)))

Pkg.build does not force recompilation.

mkitti commented 1 year ago

Edit: I guess it won't fail if you make the error nonfatal. But we already report stderr msgs at the end of parallel precompilation.

If one ignored the the nonfatal error (a warning?), how would one detect the occurence of a nonfatal error in subsequent Julia sessions when one notices that TTFX is less than desired?

codecov[bot] commented 1 year ago

Codecov Report

Attention: Patch coverage is 55.17241% with 13 lines in your changes are missing coverage. Please review.

Project coverage is 84.07%. Comparing base (b4debd8) to head (ab6ad7d). Report is 9 commits behind head on main.

Files Patch % Lines
src/workloads.jl 55.17% 13 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #21 +/- ## ========================================== - Coverage 93.18% 84.07% -9.12% ========================================== Files 3 3 Lines 88 113 +25 ========================================== + Hits 82 95 +13 - Misses 6 18 +12 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

timholy commented 1 year ago

We don't have a mechanism. If one really wanted to do this, presumably you'd have to write a flag to a file and then check that file during __init__. That would slow down package loading for all users of the package, so I doubt that's something we'd want to do.

We also don't warn people that the inferrability of their methods is terrible, or that they're using an O(N^2) algorithm instead of an O(N) algorithm, etc.

timholy commented 1 year ago

OK to close or would you rather continue the discussion?