JuliaMolSim / Molly.jl

Molecular simulation in Julia
Other
372 stars 51 forks source link

Roadmap and ideas for Molly.jl development #2

Open jgreener64 opened 5 years ago

jgreener64 commented 5 years ago

This issue is a roadmap for Molly.jl development. Feel free to discuss things here or submit a PR. Bear in mind that significant refactoring will probably occur as the package develops.

Low hanging fruit

Want to get involved? These issues might be the place to start:

Bigger things to tackle

Projects that will require more planning or will cause significant changes

Okay, so this is less of a roadmap and more of a list of ideas to work on. But most or all of the above would be required to make this a package suitable for general use, so in a sense this is the roadmap.

Zarasthustra commented 5 years ago

To do: Add module for calculating Free Energy via Alchemical change ;)

jgreener64 commented 5 years ago

"Abstracting the forcefield from the integration to allow different forcefields and integrators to be combined" and "Allow definition of custom forcefields" are now done.

jarvist commented 4 years ago

It may be interesting to see whether you can either directly use https://github.com/libAtoms/JuLIP.jl as a library (also https://github.com/libAtoms/NeighbourLists.jl), or at least be inspired by some of the code therein.

jgreener64 commented 4 years ago

Thanks Jarvist, I have had a brief look at the code there but will dive a bit deeper.

Zarasthustra commented 4 years ago

Awhile ago I wrote some Julia code for EWALDS. It is verified for energy calculations against a NIST database.

https://www.nist.gov/mml/csd/chemical-informatics-research-group/spce-water-reference-calculations-10%C3%A5-cutoff [https://www.nist.gov/sites/default/files/styles/480_x_480_limit/public/images/mml/csd/informatics_research/spce_schematic_3.jpg?itok=OTb11Rb_]https://www.nist.gov/mml/csd/chemical-informatics-research-group/spce-water-reference-calculations-10%C3%A5-cutoff SPC/E Water Reference Calculations - 10Å cutoff | NISThttps://www.nist.gov/mml/csd/chemical-informatics-research-group/spce-water-reference-calculations-10%C3%A5-cutoff In this section, we provide sample configurations of SPC/E Water molecules[1] and report the various energy and force calculations for those configurations, where the coulombic contributions are computed using the traditional Ewald Summation Method[2]. These sample configurations and reference ... www.nist.gov

Would you be interested in it? I have not taken its derivative yet to calculate force. It has been on my list of things to do, but I just don't seem to have time :S

I have also added Rattle to my code for bond constraints... it is fairly straight forward too. Once your code has EWALD and Rattle, it will be a "full" code in my opinion. You would probably want to upgrade it to particle mesh ewald for MD, but that is purely performance.

Braden


From: Joe Greener notifications@github.com Sent: Wednesday, August 14, 2019 9:00 AM To: jgreener64/Molly.jl Molly.jl@noreply.github.com Cc: Braden Kelly bkelly08@uoguelph.ca; Comment comment@noreply.github.com Subject: Re: [jgreener64/Molly.jl] Roadmap and ideas for Molly.jl development (#2)

Thanks Jarvist, I have had a brief look at the code there but will dive a bit deeper.

— You are receiving this because you commented. Reply to this email directly, view it on GitHubhttps://github.com/jgreener64/Molly.jl/issues/2?email_source=notifications&email_token=AECO7UJ4R725SDIJE6HZFD3QEP6XBA5CNFSM4GADVQT2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4IWZUA#issuecomment-521235664, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AECO7UOWNMZMPDMOU6A5VMDQEP6XBANCNFSM4GADVQTQ.

jgreener64 commented 4 years ago

Would you be interested in it?

Yes, definitely. If you could open a PR when you get a chance that would be great. I understand how it's hard to find time though, I only seem to get a few hours a month on this package these days.

jgreener64 commented 4 years ago

There is a Julia interface to Chemfiles - building Molly on top of those types would allow easy reading and writing.

hypnopump commented 4 years ago

what are your thoughts on GPU acceleration or at least multi-core in CPU? In python it's really easy with joblib, cupy, jax... Don't know about the julia set of tools for that but might be worth an exploration

jgreener64 commented 4 years ago

GPU acceleration is feasible using CuArrays.jl - it would probably involve refactoring into a style that uses broadcasting rather than for loops, but that's no bad thing anyway.

Multi-core CPU support has improved greatly in the recent Julia v1.3 release so this is another avenue to explore, though I personally don't have much experience there.

Another point to add here is the ability to differentiate through molecular simulations using Zygote.jl, which is an exciting use case and something that Julia is well-suited for.

cortner commented 4 years ago

I was unfortunately unaware of this package until now. (Or maybe I saw it and have forgotten...) Anyhow, I wonder whether there is any chance to combine Molly and JuLIP, or at least make them compatible.

With JuLIP I deliberately stayed quite close to ASE, to make sure that I can have a relatively thin interface. But this has led to some design decision that I‘ve since regretted. So in principle at least I‘m quite open to restructuring JuLIP to make it work with Molly.

Another possibility would be to make JuLIP really library for potentials, but let Molly focus on managing configurations, MD, geometry optimisation, etc ...

But maybe it is easiest to leave them separate and not worry too much about this. I‘m curious though what people think.

cortner commented 4 years ago

Regarding the neighbourlist - this is surely something that we can share. I believe my list is fairly fast, it is based on matscipy. The main downside for Molly is that it uses the same cutoff for all atoms. It would need to be generalised so that different atoms can have a different cutoff.

I‘ve also started playing around with implementing various iterators over the neighbourlist to have a more functional style in implementing potentials. I‘m less sure how well that worked.

jgreener64 commented 4 years ago

Thanks for commenting @cortner. Once I get some time for this package, diving into JuLIP is definitely on the list of priorities to see what can be combined. Certainly with things you have optimised, like the neighbour list, there is room for Molly to use or build on that.

cortner commented 4 years ago

or extract shared functionality so that JuLIP and Molly become effectively frontends...

AlfTetzlaff commented 4 years ago

Hi! I observed some things in your code which could be optimized or written the "julia way". 1) Use StaticArrays to store positions and velocities, instead of your self-defined mutable structs. This might speed things up quite a lot due to allowing SIMD. 2) Implement a dist(a,b,box) function specialized for static arrays which takes the boundary conditions into account. 3) r2 = dx*dx+... can then be written as r^2. The pow function has specializations for 1,2 and 3, so this produces no overhead compared to C. 4) If you're using Atom/Juno you can use the Juno.@profiler to find bottlenecks 5) There is a discourse thread about "struct of arrays vs array of structs". I dont remember who won, but maybe it's instructive to read. 6) @inline the update_accelerations! functions 7) Use structs with parametric types, so that you can easily switch to e.g Float32 for GPUs. (SVectors are parametric.) 8) In theory, just an idea, one could leave the time-integration to another package like DifferentialEquations. They provide a whole lot of explicit or implicit solvers (but velocity verlet is missing I think) with adaptable or fixed time steps.

Kind regards, Max

cortner commented 4 years ago

Some of this I do in JuLIP, though not yet as well as I’d like, it needs more cleanup to get proper type-agnostic code eg. - Another reason maybe to join forces at some pointZ

jgreener64 commented 4 years ago

Thanks for the feedback @AlfTetzlaff, I would definitely like to get some of this in. I have tried a few of the suggestions previously, for example the positions and velocities are currently mutable static vectors (sub-types of FieldVector) which should hopefully be giving some of the speedups you talk about.

The types should be made parametric and it would help with GPU compatibility, though at the minute the code is not written in a GPU-friendly way so that would need to change first. I would like to use the DifferentialEquations integrators in the long-term since they make so many available, but for the proof-of-concept here I just implemented a simple velocity Verlet.

I have opened a Discourse post with a specific MWE based on the current non-bonded force code to solicit feedback - please feel free to comment more there.

Zarasthustra commented 4 years ago

I did a whole bunch of timing on many different possible LJ calculations for force and at the end of it, using SVector was the fastest.

Also, simple things like image separation could be done in a vectorized for loop rather than sequentially, and there was some time saving there too. Not enough to restructure your code just for that, but none the less, some.

I have put my Molecular Dynamics coding on hold and have started doing Monte Carlo (I just like MC better, and find statistical mechanics more interesting).

So far my MC code can do rigid SPC/E water. So, somewhat of a toy system as well, but, It can do Wolf summation or Ewalds for the electrostatic. There is nothing toy about Ewalds. Once I have it cleaned up, I will upload it and share with you guys. The most important thing is that it uses OffsetArrays.jl which is a wonderful package. It allows indices to take on any value, not just 1 and up. This just makes the fourier piece nice since the lattice vectors range from -kmax:kmax.

Julia does Ewalds really well. The fourier part is bloody fast and the "hard" part if there is one. The real part is simple and similar to a bare coulomb calculation.

I am also planning on making some ewald unit tests for energy. Force would also be nice, but that is in the future.

So, now that I am back part time on the Julia wagon, I will try the ewald contributions to you guys soon.

NOTE: -------------------------------------- Here is some fortran code that does force ewald force routines. It will take a good day to go over what they did and figure out how to make it work, which is why I have been putting off the force calculation and stuck with the energy.

http://zeolites.cqe.northwestern.edu/Music/electrostatic/ewald.htm zeolites.cqe.northwestern.eduhttp://zeolites.cqe.northwestern.edu/Music/electrostatic/ewald.htm This modules contains routines for those nasty ewald summations.! Good reference for ewald: Allen and Tildesley, Computer Simulation of ! Liquids, p 155-162; Amit's notes for doing part of the calculation in the zeolites.cqe.northwestern.edu

Braden


From: Joe Greener notifications@github.com Sent: Friday, May 8, 2020 5:47 PM To: jgreener64/Molly.jl Molly.jl@noreply.github.com Cc: Braden Kelly bkelly08@uoguelph.ca; Comment comment@noreply.github.com Subject: Re: [jgreener64/Molly.jl] Roadmap and ideas for Molly.jl development (#2)

CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca

Thanks for the feedback @AlfTetzlaffhttps://github.com/AlfTetzlaff, I would definitely like to get some of this in. I have tried a few of the suggestions previously, for example the positions and velocities are currently mutable static vectors (sub-types of FieldVector) which should hopefully be giving some of the speedups you talk about.

The types should be made parametric and it would help with GPU compatibility, though at the minute the code is not written in a GPU-friendly way so that would need to change first. I would like to use the DifferentialEquations integrators in the long-term since they make so many available, but for the proof-of-concept here I just implemented a simple velocity Verlet.

I have opened a Discourse posthttps://discourse.julialang.org/t/help-make-my-ideal-gas-simulation-ideally-fast/39141 with a specific MWE based on the current non-bonded force code to solicit feedback - please feel free to comment more there.

— You are receiving this because you commented. Reply to this email directly, view it on GitHubhttps://github.com/jgreener64/Molly.jl/issues/2#issuecomment-626035133, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AECO7UOAPLRP4FM36MI4VDLRQR4WBANCNFSM4GADVQTQ.

jgreener64 commented 4 years ago

Great, thanks for the input.

jgreener64 commented 4 years ago

Here's a sneak preview of some non-bonded force optimisations I've been working on: faster single-CPU code, a multithreaded CPU implementation and a GPU implementation.

Hoping to refactor the package to allow these from the same API within a couple of weeks.

image

AlfTetzlaff commented 4 years ago

Could it be that you confused us with ms in the above graph by accident?

jgreener64 commented 4 years ago

It's ms but for a 1000-step simulation, so it would be us per step. The blue dot for 100 atoms is the simulation benchmark time from the Discourse post.

jgreener64 commented 4 years ago

Some changes I added this weekend:

AlfTetzlaff commented 4 years ago

Cool! Have you noticed any single threaded speedup due to SVectors?

jgreener64 commented 4 years ago

Yes, the Discourse thread benchmark went from 45 ms to 36 ms using SVectors, rather than the mutable FieldVectors used before.

Making Simulation immutable also sped things up.

The cell neighbour list would be a big algorithmic speedup (as discussed above).

jgreener64 commented 4 years ago

My rough feature list for a v0.1 release is:

Chenghao-Wu commented 2 years ago

May I have your thoughts on the parallelization via domain decomposition in Julia?

jgreener64 commented 2 years ago

It's not really my area of expertise, but I guess it can be done and it looks like it could be a useful algorithm. I don't know how well that type of simulation would work in the Molly framework currently though.

cortner commented 2 years ago

It would be an interesting proposition and a small step towards replacing lammps

jgreener64 commented 2 years ago

I thought I'd just update this issue with the state of the project.

In the last few months Molly has been under heavy development and has seen many new features including Unitful.jl support, more interactions, an OpenMM XML force field reader, AtomsBase.jl integration, differentiable simulation with Zygote.jl on CPU and GPU, a new general interaction type, implicit solvation models, more simulators, energy minimisation, more analysis functions and better test coverage. See the release notes for more details.

Future development directions include:

A few groups are interested in using Molly for a diverse range of simulations, so the intention is that it will stay general enough to be widely useful whilst also reaching high performance. My personal research uses Molly for differentiable simulations of proteins, so this will continue to dictate the features I work on.

uliano commented 1 year ago

May be of interest a native Julia (as in no-external-dependencies) .xtc reader and writer? It is twice as fast than c implementation used by mdtraj.

jgreener64 commented 1 year ago

That would be great. Currently we are using Chemfiles for that kind of thing but a fast native Julia approach would be welcome. It could even be its own package?

uliano commented 1 year ago

At the moment it is just a file with a bunch of functions.

It started as a proof of concept, learning Julia, while showing its potential to both speed and low level pesky tasks with the far future eventual goal to assemble something resembling the bad copy of MDAnalysis or MDtraj.

I'm, at the moment, too ignorant of Julia packaging system, and of git and github machinery, to be able to provide it. However with some help I'd be glad to contribute.

My original goal was to be able to natively read and write trajectory in the formats I use, namely gromacs .xtc, charmm namd .dcd, amber .nc and desmond .dtr. After the unexpected (at least speedwise) success with .xtc I got quickly stranded by netcdf3 underlying .nc and it all went on temporary hold.

The the functions are pretty solid as I tested low level ones against the c implementation while I was able to read and write real .xtc file with no diff from original and to compare the read coordinates with MDtraj's. It would be a pity not to use them, also given the effort done in understanding and optimizing the original c code (c implementation slowness is obviously due to implementation itself and not to the language)

I'm writing too much here is the thing

https://github.com/uliano/MDTools.jl

I'm willing to help in my spare time but keep in mind that at the moment I'm not proficient enough in both Julia and github packages machinery to be autonomous.

jarvist commented 1 year ago

I think it looks pretty good! It wouldn't be too much effort to rename your repo 'xtcreader.jl' or similar & then add the boilerplate to make it an independent package.

uliano commented 1 year ago

It actually writes, also!

uliano commented 1 year ago

So I did my homework and now we have a Julia package with xtc read and write functionality

https://github.com/uliano/MDTrajectoryFiles.jl

I added minimal docstrings on the exported functions and a very basic test: reads and writes then compares. I will eventually implement other file formats so I choose a more generic name than XTC.

uliano commented 1 year ago

This is the last comment for a while, I promise! I just wanted to say that MDTrajectoryFiles.jl has been accepted in the Julia registry.

Foreseeable changes in the next few weeks:

CoryKornowicz commented 1 year ago

Using Metal.jl, I was able to swap all the CuArrays to MtlArrays for better performance on Apple M-series chips. Might be something to look into as a 'low hanging fruit.'

jgreener64 commented 1 year ago

That's interesting, thanks. Once the kernels branch is merged (with >10x speedups on GPU) it would be worth allowing and documenting this.

See https://github.com/JuliaMolSim/Molly.jl/pull/99 for work towards generic array types.

exAClior commented 1 year ago

Hi, I notice that Molly uses Zygote for AD. Not sure if it's causing issue performance-wise. If so, a question may be: Will there be any plans to switching to alternatives like Enzyme.jl? Are there any significant road-blocks that will require an extensive work on this subject?

jgreener64 commented 1 year ago

Yes, see the kernels branch for a lot of progress on this. It uses Enzyme.jl for AD in the slow parts and writes GPU kernels, increasing performance and memory utilisation dramatically.

Hoping to merge and release soon, just waiting on some docs cleanup and a new Enzyme.jl release.

jgreener64 commented 1 year ago

v0.15.0 is out with many improvements including GPU speed and memory usage.

Priorities for future development are:

Hopefully there will be a GSoC student(s) working on Molly this summer too.

jgreener64 commented 1 year ago

Pressure coupling added via the Monte Carlo barostat.

@JaydevSR will be looking at GPU performance this summer as part of JSoC.