Open lmiq opened 4 months ago
I'm not entirely surprised.
I was unable to run now some comparison to matscipy
. Do you think it is on the same ballpark there?
In this case CellListMap.neighborlist
also returns the list of pairs in memory (it isn't lazy either). The difference here is that it returns a list of Tuple{Int,Int,Float64}
with the indices and distances between the non-redundant pairs (that its, only for i > j
). Concerning the cell shape, there aren't restrictions on the cell shape here either (except, in fact, some related to the size of the cell relative to the cutoff, the cutoff must not be to large relative to the cell sizes).
oh, but now I see what probably mean: you can compute pairlists for periodic systems at multiple images (such for instance for crystals many cells away). Indeed, that CellListMap does not do, it is mostly thought for amorphous systems. That's different indeed.
Single-threaded this is what I get:
julia> Threads.nthreads()
1
julia> x, box = CellListMap.xatomic(10^4); # sample coordinates with liquid water atomic density
julia> using Chairmarks
julia> @b PairList($x, $(box.cutoff), $(box.input_unit_cell.matrix), (true, true, true))
1.453 s (80091 allocs: 635.597 MiB, 5.99% gc time, without a warmup)
julia> @b CellListMap.neighborlist($x, $(box.cutoff); unitcell=$(box.input_unit_cell.matrix))
208.961 ms (721 allocs: 197.067 MiB, 36.59% gc time)
Let's keep it open. I think that there is scope to improve things at my end, and I think there is scope to share code or at least an interface that selects your cell-list if min-image and mine if not. And maybe that can be merged into a single package. I have no strong views on any of this.
My list is slightly slower than the C++ implementation James Kermode.
Teemu and some others are talking about re-implementing neighbourlists from scratch. I don't have a strong view on this either.
My only perspective is that I hate code-duplication and maintaining multiple packages doing the same thing - especially when the community is as tiny as JuliaMolSim. Apart from that I'm open to anything.
Teemu and some others are talking about re-implementing neighbourlists from scratch. I don't have a strong view on this either.
Yes, exactly because of that I ended checking this. I'm interested in knowing what they are about (I guessed this was the implementation they were starting to develop). I have spent quite some time optimizing what's done in CellListMap because I need that for ComplexMixtures.jl, Packmol.jl, etc. CellListMap is more or less on the same ballpark as what NAMD does (but not scaling as effectively) on CPUs. I did not do anything in the sense of trying to compute these on GPUs, though. Anything that accelerates even further these calculations would be useful for me.
I probably won't be there at the meeting on 7 June, so here are a few comment:
NeighbourLists.jl
. @tjjarvinen and others are writing a new code from scratch that is supposedly GPU-friendly. I believe the gist is to only do the Cell list and then compute the actual neighbours lazily. EDIT : I don't think it would be too hard to multi-thread it. Somebody who knows what they are doing could probably do it in 1-2 days.
There is a need for GPU neighbourlist, by Molly team, CESMIX and we in Vancouver could use it too. This combined to that non of the current neighbourlist implementations work with GPUs, nor have their design taken into account GPU limitations (and strengths).
Also different potentials need to access neighbourlist differently and this causes some issue. We cannot e.g. use CellListMap to implement ACE potentials, as it is now, because of this.
The plan of the meeting is to go trough on what we would need for the neighbourlist, what limitations GPU bring to neighbourlist implementation and finally present an idea on how to implement performant GPU neighbourlist.
In practice it is an opportunity discuss how to collaborate on neighbourlist implementations and how to get started with GPU neighbourlist.
Also different potentials need to access neighbourlist differently and this causes some issue. We cannot e.g. use CellListMap to implement ACE potentials, as it is now, because of this.
I'll have to look into the code though and I suspect this is more than fixable...
how to collaborate on neighbourlist implementations
I think that part is important. Let's not go from maintaining 3 to 10 neighbourlists..
When implementing CellListMap I did some research but find close to none information on how they are implemented in the MD packages. I think one needs to somewhat go through their codes to understand what do. Amber has a particularly fast GPU implementation. I hope you have better luck and talent than me.
There is even less information on GPU neighbourlist implementations. This is the main one I have been looking https://doi.org/10.1063/5.0018516
To be true, upon reading that paper, it didn't become clear to me that they build the pair lists on the GPU.
That is true. The article (and the referenced articles) only mention some ideas about how to implement GPU neighbourlist. No true description on how to actually implement it. Thats why I said "even less information". But, atleast there are something.
My impression is that they don't build the pair lists on the GPU, but discuss the way to update the lists on the GPU. The initial pair list, I think, is only built once, on the CPU.
That approach (which is a sort of sophisticated implementation of Verlet lists) is useful in the MD context because the positions of the particles do not change much from step to step. This is one kind of application, but not the most general one - for instance to analyze trajectories, where steps not very correlated structurally, that may not be useful, and the pair lists have to be reconstructed from scratch each time.
(I would take a look on what OpenMM does, some insights here, maybe: http://docs.openmm.org/latest/developerguide/06_opencl_platform.html#reordering-of-particles).
I am not suprised at all, if they do the intial cell list on CPU, as it is very problemtic with GPU. But, I have very good idea on how to do the initial cell list on GPU. Updating the cell list for consecutive MD steps is easy. But, if this is done then MD program needs to adopt velocities and forces to the cell list structure also, which is doable.
In any case I have an idea on how this all can be pulled together, and we can then discuss shall we go for it or will we do something else.
I'm opening this issue to perhaps provide some information concerning the discussion that we might have on Jul 7.
I've tested quickly here the computation of a nb list with
NeighbourLists
andCellListMap
, and this is what I get:Am I'm doing something wrong?
Here, we have:
The coordinates of the example are random coordinates within that box, and the density is the atomic density of liquid water.