Precompilation errors - Githubissues

palday commented 1 year ago

I occasionally encounter a weird error when running precompilation, which once (see edit history in linked comment) even caused my package registration to fail.

This is the error:

ERROR: AssertionError: i ∈ iallowed || iszero(A[i, j])
Stacktrace:
 [1] _check_locality(A::StaticArraysCore.SMatrix{4, 2, Float64, 8}, corner::Symbol, allowed_nonzeros_per_column::Int64)
   @ BSplineKit.Recombinations ~/Code/BSplineKit.jl/src/Recombinations/matrices.jl:141
 [2] _RecombineMatrix
   @ ~/Code/BSplineKit.jl/src/Recombinations/matrices.jl:103 [inlined]
 [3] _RecombineMatrix
   @ ~/Code/BSplineKit.jl/src/Recombinations/matrices.jl:81 [inlined]
 [4] RecombineMatrix
   @ ~/Code/BSplineKit.jl/src/Recombinations/matrices.jl:183 [inlined]
 [5] RecombineMatrix
   @ ~/Code/BSplineKit.jl/src/Recombinations/matrices.jl:172 [inlined]
 [6] RecombinedBSplineBasis
   @ ~/Code/BSplineKit.jl/src/Recombinations/bases.jl:222 [inlined]
 [7] RecombinedBSplineBasis
   @ ~/Code/BSplineKit.jl/src/Recombinations/bases.jl:229 [inlined]
 [8] interpolate(x::Vector{Float64}, y::Vector{Float64}, k::BSplineOrder{6}, bc::Natural)
   @ BSplineKit.SplineInterpolations ~/Code/BSplineKit.jl/src/SplineInterpolations/SplineInterpolations.jl:274
 [9] top-level scope
   @ ./REPL[14]:8

I'm guessing that I don't see it consistently because there is a call to rand without a set seed in the precompilation setup. Stepping through, I got it with these values:

julia> xdata = sort!(rand(10))
10-element Vector{Float64}:
 0.1825841576261027
 0.21745398548214934
 0.2574540006430478
 0.37073734805196445
 0.3949709205453995
 0.41683805923570205
 0.4669292800550542
 0.5484125029182643
 0.8956857209308366
 0.9088143561930421

julia> ydata = randn(10)
10-element Vector{Float64}:
 -0.8662273184639824
  1.4644999406051766
 -2.4972734631605986
 -0.579356630251197
  0.9412609863223902
  0.843508283506621
  0.798971854937239
  3.0575214498226995
 -0.05574466652240026
 -0.10031429900815753

With my next call to randn, I got lucky and didn't hit those values and everything worked.

So I think there are two things that should be done:

set up the calls to randn to have a set seed (I can do this in a PR if desired)
find out why the assertion is failing for some values

(1) will fix things for now, but given that the exact stream of floats produced by even explicitly using say MersenneTwister(42) can change between Julia releases, (2) is necessary for long-term guarantees.

For completeness, this is the setup I did this test on:

julia> versioninfo()
Julia Version 1.9.1
Commit 147bdf428cd (2023-06-07 08:27 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 8 × Apple M1
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 4 on 4 virtual cores
Environment:
  JULIA_EDITOR = code

But I'm also pretty sure I've hit this problem on x86-64 Linux.

jipolanco commented 1 year ago

Thanks for catching that! I have to admit that I encountered the same issue at some point, but I couldn't reproduce it and didn't have the time to find an explanation.

It should be much easier to identify the actual issue using the values you provided. At least with these inputs, the error seems to come from the interpolation using Natural boundary conditions with 6-th order B-splines.

I completely agree that the precompilation code should set the random seed for robustness, even though the code is supposed to work for all random inputs... And I think it's OK to just use something like MersenneTwister(42). If the random number generations changes in a future Julia version, this will be easily detected in the tests, and the seed can be changed accordingly. Alternatively, there is StableRNGs.jl, but I don't think it's worth it to add it as a dependency.

So, as long as we don't have a solution for the actual problem, I agree that we should at least fix the precompilation issues by setting the random seed as you suggested. Feel free to make a PR for this.

jipolanco commented 1 year ago

For completeness, here is a minimal example reproducing the error:

using BSplineKit

xdata = [
    0.1825841576261027
    0.21745398548214934
    0.2574540006430478
    0.37073734805196445
    0.3949709205453995
    0.41683805923570205
    0.4669292800550542
    0.5484125029182643
    0.8956857209308366
    0.9088143561930421
]

ydata = [
   -0.8662273184639824
    1.4644999406051766
   -2.4972734631605986
   -0.579356630251197
    0.9412609863223902
    0.843508283506621
    0.798971854937239
    3.0575214498226995
   -0.05574466652240026
   -0.10031429900815753
]

interpolate(xdata, ydata, BSplineOrder(6), Natural())

palday commented 1 year ago

Great, I'll open a PR soon!

palday commented 1 year ago

Great, I'll open a PR soon!

jipolanco commented 1 year ago

So I think I identified the source of the issue. In short, when constructing the recombination matrix for representing natural boundary conditions, the following submatrix is obtained from the solution of a small linear system:

4×2 StaticArraysCore.SMatrix{4, 2, Float64, 8} with indices SOneTo(4)×SOneTo(2):                                                                                                                                  
 2.03502    -5.67941e-14                                                                                                                                                                                          
 0.932215    0.643651                                                                                                                                                                                             
 0.0327649   1.16861                                                                                                                                                                                              
 0.0         1.18774

The upper right value should be actually zero, and the assertion fails because here that's not the case. In fact, the implementation already allows some tolerance for near-zero values (see _remove_near_zeros), but the current tolerance is too strict. By setting rtol = 1000 * eps(eltype(A)) (instead of 100), the upper right value is replaced by a zero and the assertion doesn't fail any more.

palday commented 1 year ago

(Isn't floating point fun? :wink:)

jipolanco / BSplineKit.jl

Precompilation errors #71