m3g / CellListMap.jl

Flexible implementation of cell lists to map the calculations of particle-pair dependent functions, such as forces, energies, neighbor lists, etc.
https://m3g.github.io/CellListMap.jl/
MIT License
87 stars 4 forks source link

Better handling when `NaN` occurs in a position #105

Closed KristofferC closed 1 month ago

KristofferC commented 1 month ago

Consider for example

julia> x = [[1.0, 2.0], [2.0, NaN]];

julia> system = CellListMap.Box(limits(x), 0.2)
ERROR: InexactError: trunc(Int64, NaN)
Stacktrace:
  [1] trunc
    @ ./float.jl:905 [inlined]
  [2] floor
    @ ./float.jl:383 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/StaticArrays/EHHaF/src/broadcast.jl:135 [inlined]
  [4] __broadcast
    @ ~/.julia/packages/StaticArrays/EHHaF/src/broadcast.jl:123 [inlined]
  [5] _broadcast
    @ ~/.julia/packages/StaticArrays/EHHaF/src/broadcast.jl:119 [inlined]
  [6] copy
    @ ~/.julia/packages/StaticArrays/EHHaF/src/broadcast.jl:60 [inlined]
  [7] materialize
    @ ./broadcast.jl:903 [inlined]
  [8] _compute_nc_and_cell_size(::Type{…}, xmin::SVector{…}, xmax::SVector{…}, cutoff::Float64, lcell::Int64)
    @ CellListMap ~/.julia/packages/CellListMap/jSOxW/src/Box.jl:213
  [9] _construct_box(input_unit_cell::CellListMap.UnitCell{NonPeriodicCell, 2, Float64, 4}, lcell::Int64, cutoff::Float64)
    @ CellListMap ~/.julia/packages/CellListMap/jSOxW/src/Box.jl:254

Since (due to e.g. some bug in the updates for a particle system) it isn't too implausible to end up with a NaN it might be worth handling this a bit more gracefully (by giving a better error message)

jgreener64 commented 1 month ago

This would be quite useful as I see this error a lot (my own fault).

lmiq commented 1 month ago

Is there any fancier way to check that than a just looping over the coordinates before doing anything else? I'm thinking of providing a validate_coordinates kw option, which is nothing or a function, where the default is a function like:

function _validate_coordinates(x::Vector{<:SVector})
    for (i,v) in enumerate(x), c in v
        if any(isnan, c) || any(ismissing, c)
            throw(ArgumentError("Invalid coordinates for particle $i: $c"))
        end
    end
    return nothing
end

That way the user could provide custom validation functions as well.

Any better idea?

jgreener64 commented 1 month ago

Seems sensible, unless I misunderstand the example I think you are looping one too many times though.

lmiq commented 1 month ago

Yes, of course, I added the use of any and forgot to remove the rest:

function _validate_coordinates(x::Vector{<:SVector})
    for (i,v) in enumerate(x)
        if any(isnan, v) || any(ismissing, v)
            throw(ArgumentError("Invalid coordinates for particle $i: $v"))
        end
    end
    return nothing
end

But the point is if we need absolutely to check the coordinates in a separate loop, or if there is a way to overload the current error message without additional cost.

jgreener64 commented 1 month ago

Knowing which coordinate is NaN is useful but not essential, so another option would be to catch and rethrow the current error with a different message or catch and run the above loop.

lmiq commented 1 month ago

catch with a try - catch I think is out of question. That's too expensive. I don't know if there are alternatives.

jgreener64 commented 1 month ago

Isn't try very fast in the success case? And performance doesn't matter in the error case. I haven't actually tried it I should add.

lmiq commented 1 month ago

Well, it doubles the time for this:

julia> using StaticArrays

julia> function test!(x,y)
           for i in eachindex(x)
               y[i] = round.(Int,x[i])
           end
       end
test! (generic function with 2 methods)

julia> function test_try_catch!(x,y)
           for i in eachindex(x)
               try 
                   y[i] = round.(Int,x[i])
               catch
                   println("error")
               end
           end
       end
test_try_catch! (generic function with 1 method)

julia> x = rand(SVector{3,Float64}, 10^4);

julia> y = zeros(SVector{3,Int}, 10^4);

julia> @btime test!($x, $y)
  28.468 μs (0 allocations: 0 bytes)

julia> @btime test_try_catch!($x, $y)
  56.655 μs (0 allocations: 0 bytes)
lmiq commented 1 month ago

Validate the coordinates is much faster:

julia> function _validate_coordinates(x)
           for (i,v) in enumerate(x)
               if any(isnan, v) || any(ismissing, v)
                   error()
               end
           end
       end
_validate_coordinates (generic function with 1 method)

julia> @btime _validate_coordinates($x)
  5.508 μs (0 allocations: 0 bytes)
jgreener64 commented 1 month ago

Fair enough, makes sense to do it before then.

lmiq commented 1 month ago

In 0.9.6 a custom error message will the thrown:

julia> x = [[1.0, 2.0], [2.0, NaN]];

julia> system = CellListMap.Box(limits(x), 0.2)
ERROR: ArgumentError: 

    Invalid coordinates found: [2.0, NaN] for particle of index 2.

Stacktrace:
 [1] _validate_coordinates(x::Vector{Vector{Float64}})
   @ CellListMap ~/.julia/dev/CellListMap/src/CellOperations.jl:10
 [2] limits(x::Vector{Vector{Float64}}; validate_coordinates::typeof(CellListMap._validate_coordinates))
   @ CellListMap ~/.julia/dev/CellListMap/src/CellOperations.jl:383
 [3] limits(x::Vector{Vector{Float64}})
   @ CellListMap ~/.julia/dev/CellListMap/src/CellOperations.jl:382
 [4] top-level scope
   @ REPL[4]:1

julia> 

And the same error is thrown in a lists are updated, or generated with improper coordinates.