InPlaceNeighborList() result is incorrect

ArrogantGao commented 1 year ago

I found that after updating the InPlaceNeighborList() for serval steps, the result of neighbor!(system) can be different from that of
neighbor(x_new)

using CellListMap, Test
x = rand(SVector{3,Float64}, 10^3);
system = InPlaceNeighborList(x=x, cutoff=0.1, unitcell=[1,1,1], parallel=false)

list = neighborlist!(system)

x_new = rand(SVector{3,Float64}, 10^3);
for _=1:1000
    x_new = rand(SVector{3,Float64}, 10^3);
    update!(system, x_new)
end

@test neighborlist!(system) == neighborlist(x_new, 0.1; unitcell = [1, 1, 1])

I think that may because the change in size of the neighborlist.

lmiq commented 1 year ago

It may be only the order and/or some floating point accuracy detail that is adding or removing some elements from the list, but the issue is related to the last list being computed in parallel vs. the others in serial.

This test, after your code, passes:

julia> @test neighborlist!(system) == neighborlist(x_new, 0.1; unitcell = [1, 1, 1], parallel=false)
Test Passed

I will be further investigating this.

lmiq commented 1 year ago

I think that, effectivelly, it is just a floating accuracy detail:


julia> x_new = rand(SVector{3,Float64}, 10^3);

julia> for _=1:1000
           global x_new = rand(SVector{3,Float64}, 10^3);
           update!(system, x_new)
       end

julia> list1 = copy(neighborlist!(system));

julia> list2 = copy(neighborlist(x_new, 0.1; unitcell = [1, 1, 1]));

julia> @test neighborlist!(system) == neighborlist(x_new, 0.1; unitcell = [1, 1, 1], parallel=false)
Test Passed

julia> list1 == list2
false

julia> sum(el -> el[3], list1)
159.37434414155484

julia> sum(el -> el[3], list2)
159.37434414155481

lmiq commented 1 year ago

Ok. I think there is no error, effectively.

Yet, the equality does not hold for the possible reasons: 1) the lists are in different order; 2) there are pairs for which the distances are slightly different because of floating point rounding; 3) the same as 2, but when the distance between points is close to the cutoff, which may cause one list to contain a pair and the other don't.

All these differences do not occur if both runs are executed in serial. In parallel you can get all those differences, or when comparing a parallel to a serial run.

Anyway, I wrote a function to check if the lists are equivalent, or not. It will be included as a testing function in the package, but here it is:

# Test function to check if a pair in one list is unique in other list
function _list_match_unique(lists_match, pair1, index_pair2, list2, cutoff; io, atol, first = "first", second = "second")
    # check if pair is unique
    if length(index_pair2) > 1
        println(io, "Non-unique pair: ", join((pair1, list2[index_pair2]), " "))
        println(io, "    On $first list: ", pair1)
        println(io, "    On $second list: ", list2[index_pair2])
        lists_match = false
    end
    # check if missing pair is at the cutof distance
    if length(index_pair2) == 0
        if !(isapprox(pair1[3] - cutoff, 0, atol=atol))
            println(io, "Pair of $first list not found in $second list: ", pair1)
            lists_match = false
        else
            println(io, "Warning: pair of $first list not found in $second list with d ≈ r: ", pair1)
        end
    end
    return lists_match
end
function lists_match(
    list1::Vector{Tuple{Int,Int,T1}},
    list2::Vector{Tuple{Int,Int,T2}},
    cutoff::Real;
    verbose::Bool=false,
    atol::Real=1e-10
) where {T1 <: Real, T2 <: Real}
    io = verbose ? stdout : devnull
    lists_match = true
    for pair1 in list1
        index_pair2 = findall(
            p -> ((p[1],p[2]) == (pair1[1],pair1[2])) | ((p[1],p[2]) == (pair1[2],pair1[1])), list2
        )
        # Check for uniqueness of pairs
        lists_match = _list_match_unique(lists_match, pair1, index_pair2, list2, cutoff; io, atol)
        # If pair is unique, check if distances match
        if length(index_pair2) == 1
            pair2 = list2[index_pair2[1]]
            # check if distances match
            if !isapprox(pair1[3]-pair2[3], 0, atol=atol)
                println(io, "Warning: distances do not match: ", pair1, pair2)
                lists_match = false
            end
        end
    end
    for pair2 in list2
        index_pair1 = findall(
            p -> ((p[1],p[2]) == (pair2[1],pair2[2])) | ((p[1],p[2]) == (pair2[2],pair2[1])), list1
        )
        # Check for uniqueness of pairs
        lists_match = _list_match_unique(lists_match, pair2, index_pair1, list1, cutoff; io, atol, first="second", second="first")
    end
    return lists_match
end

Then you can run your example with:

using CellListMap, Test, StaticArrays

x = rand(SVector{3,Float64}, 10^3);
system = InPlaceNeighborList(x=x, cutoff=0.1, unitcell=[1,1,1], parallel=false)

list = neighborlist!(system)

x_new = rand(SVector{3,Float64}, 10^3);
for _=1:1000
    x_new = rand(SVector{3,Float64}, 10^3);
    update!(system, x_new)
end

# copy the function above here 

# check equivalence of the lists:
@test lists_match(neighborlist!(system),neighborlist(x_new, 0.1; unitcell = [1, 1, 1], parallel=false), 0.1; verbose=true)

The above test should pass.

Please let me know if there is anything you still think is wrong. I'll close the issue for now.

The function above will be available as an internal testing function with name

CellListMap.TestingNeighborLists.lists_match

ArrogantGao commented 1 year ago

Hello! Thank you very much for your reply, I made some further tests using the function you supplied, and I found that the problem still exists, here is the test code I used:

using CellListMap, Test, StaticArrays

x = rand(SVector{3,Float64}, 10^3);
system = InPlaceNeighborList(x=x, cutoff=0.1, unitcell=[1,1,1], parallel=false)

list = neighborlist!(system)

x_new = rand(SVector{3,Float64}, 10^3);
for _=1:10000
    x_new = rand(SVector{3,Float64}, 10^3);
    update!(system, x_new)
end

list_1 = neighborlist!(system)
list_2 = neighborlist(x_new, 0.1; unitcell = [1, 1, 1], parallel=false)

@test CellListMap.TestingNeighborLists.lists_match(list_1, list_2, 0.1; verbose=true)

result of this test is not stable, a failed case is shown in the figure below:

It seems that when updating the system, the list buffer can only grow larger, which is incorrect in some cases if the size of the buffer is larger than that of the actual neighbor list. I understand the aim of this strategy is to avoid allocations so that we should not decrease the buffer size, but I think a possible way to solve the problem is to set all these empty buffers to (0, 0, 0.0).

lmiq commented 1 year ago

Ok, this is a MWE:

julia> x = [ SVector{3,Float64}(0,0,0), SVector{3,Float64}(0,0,0.05) ];

julia> system = InPlaceNeighborList(x=x, cutoff=0.1, unitcell=[1,1,1], parallel=false)
InPlaceNeighborList with types: 
CellList{3, Float64}
Box{OrthorhombicCell, 3, Float64, Float64, 9, Float64}
Current list buffer size: 0

julia> list0 = neighborlist!(system) # correct
1-element Vector{Tuple{Int64, Int64, Float64}}:
 (1, 2, 0.05)

julia> xnew = [ SVector{3,Float64}(0,0,0), SVector{3,Float64}(0,0,0.2) ];

julia> update!(system, xnew)
InPlaceNeighborList with types: 
CellList{3, Float64}
Box{OrthorhombicCell, 3, Float64, Float64, 9, Float64}
Current list buffer size: 1

julia> list1 = neighborlist!(system) # wrong
1-element Vector{Tuple{Int64, Int64, Float64}}:
 (1, 2, 0.05)

The issue is only with non-parallel runs. For parallel runs I was already resizing the lists in the reduction function. I'll fix that now.

lmiq commented 1 year ago

I released the easy fix now (will be available in 0.8.20 at any moment). The fix involves resizing the list at the end of each calculation, which may be sub-optimal for successive in-place computations, although resizing of arrays in Julia is smart in that sense, and the buffers won't get cleared all the time.

This is the fix: https://github.com/m3g/CellListMap.jl/commit/9f10f7c44bcc3810e123f4a48f42ec962d0e8810

ArrogantGao commented 11 months ago

Hi, I found the problem seems not fully fixed for 2D cases. I think the reason should be similar, and here is a MWE:

julia> using CellListMap

julia> using StaticArrays

julia> x = [ SVector{2,Float64}(0,0), SVector{2,Float64}(0,0.05) ];

julia> system = InPlaceNeighborList(x=x, cutoff=0.1, unitcell=[1,1], parallel=false)
InPlaceNeighborList with types: 
CellList{2, Float64}
Box{OrthorhombicCell, 2, Float64, Float64, 4, Float64}
Current list buffer size: 0

julia> list0 = neighborlist!(system)
1-element Vector{Tuple{Int64, Int64, Float64}}:
 (1, 2, 0.05)

julia> xnew = [ SVector{2,Float64}(0,0), SVector{2,Float64}(0,2.0)];

julia> update!(system, xnew)
InPlaceNeighborList with types: 
CellList{2, Float64}
Box{OrthorhombicCell, 2, Float64, Float64, 4, Float64}
Current list buffer size: 1

julia> list1 = neighborlist!(system) # wrong
1-element Vector{Tuple{Int64, Int64, Float64}}:
 (1, 2, 0.0)

lmiq commented 11 months ago

That specific example seems correct to me. If the sides are [1,1], the particles with coordinates [0,0] and [0,2] share the same position in the lattice.

ArrogantGao commented 11 months ago

Thank you very much for your reply, sorry for the bothering.

lmiq commented 11 months ago

Not at all. I really appreciate any feedback. Please let me know when you find any difficulties.

m3g / CellListMap.jl

InPlaceNeighborList() result is incorrect #86