lorenzo-rovigatti / oxDNA

A new version of the code to simulate the oxDNA/oxRNA models, now equipped with Python bindings
https://dna.physics.ox.ac.uk/
GNU General Public License v3.0

Benchmarking performance: hilbert sorting, lists, force, thermostat etc #74

Closed (swapnilbaral-git closed this issue 8 months ago)

swapnilbaral-git commented 8 months ago

What is the documentation missing?

The meaning of the terms used to report performance, such as Hilbert sorting, Lists, Force, First step, Thermostat, etc.

Additional context

While benchmarking the performance (time/step) of DNA systems with various numbers of nucleotides (nt), I noticed that the time/step is higher for 3200 nt than for 4800 nt. I wanted to understand the reason for this, so I began looking deeper. I saw that, together with the time/step, the log files report more information, such as Hilbert sorting, Lists, Force, First step, Thermostat, etc. In my case it takes more time to compute Lists for 3200 nt than for 4800 nt (while the computation time of every other entry is lower for 3200 nt), but I could not tell why exactly this was happening. So I started wondering what is actually being done in the parts of the code behind Hilbert sorting, Lists, etc. Please let me know where I can read about this.

ErikPoppleton commented 8 months ago

I don't think there is a comprehensive guide to performance optimization anywhere, unfortunately. Something to add to the list of things to add to the documentation. That being said, Lorenzo wrote a paper on the optimization of the GPU code, which you can find here.

In your specific case I would guess that the way to optimize the performance would be to play with the verlet_skin parameter in the input file. This controls the balance between the size of each particle's neighbor list and how frequently the list needs to be updated (a larger list means a longer force calculation, but updating the list is really slow). You probably already know this, but the list of input parameters can be found in the documentation; verlet_skin is listed under "Core Options".
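To make the trade-off above concrete, here is a minimal, generic sketch of the Verlet-list idea in Python. It is not oxDNA's implementation: the function names, the brute-force O(N^2) search, the absence of periodic boundaries, and the "moved more than skin / 2" rebuild criterion (a common textbook choice) are all assumptions made for illustration.

```python
import math

def build_neighbor_list(positions, cutoff, skin):
    """Return all pairs (i, j) closer than cutoff + skin.

    Hypothetical helper: brute force, no periodic boundaries.
    """
    r_list = cutoff + skin
    pairs = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) < r_list:
                pairs.append((i, j))
    return pairs

def needs_rebuild(positions, positions_at_build, skin):
    """True once any particle has moved more than skin / 2 since the last build.

    Assumed criterion: two particles each moving skin / 2 towards one another
    can close a gap of at most `skin`, so the list stays valid until then.
    """
    return any(
        math.dist(p, q) > skin / 2
        for p, q in zip(positions, positions_at_build)
    )
```

A larger skin puts more pairs in the list (more work in every force evaluation) but lets particles move further before `needs_rebuild` fires, so the expensive list update happens less often.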

swapnilbaral-git commented 8 months ago

Thanks for the comment. I saw that I used a larger box size for 3200 nt than for 4800 nt. I have not used any systematic method to decide the box size. Can you please tell me the best way to determine the required box size for DNA of different lengths?

Also, how should I optimize the value of verlet_skin in the input file? I am not sure what the possible values of this parameter are, or which metric I should compare across different values of verlet_skin.

ErikPoppleton commented 8 months ago

The box needs to be large enough that the DNA cannot interact with itself through the periodic boundary. As a rule of thumb, we usually use a box side 1.5x the largest dimension of the structure. That is slightly overkill, so you can go a bit smaller, but watch out for weird interactions.
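The rule of thumb above can be sketched in a few lines of Python. This is only an illustration: `suggest_box_side` is a hypothetical helper, and it assumes that "largest dimension" means the largest extent of the axis-aligned bounding box of the nucleotide positions, in the same length units as the coordinates.

```python
def suggest_box_side(positions, factor=1.5):
    """Return factor * (largest bounding-box extent) for a list of (x, y, z) points.

    Hypothetical helper; the bounding-box reading of "largest dimension"
    is an assumption, not something stated in the thread.
    """
    extents = [
        max(p[k] for p in positions) - min(p[k] for p in positions)
        for k in range(3)
    ]
    return factor * max(extents)

# Example with made-up coordinates: extents are 10, 8 and 3.
example = [(0.0, 0.0, 0.0), (10.0, 2.0, 1.0), (4.0, 8.0, 3.0)]
print(suggest_box_side(example))
```

A flexible structure can stretch beyond its initial extent during a simulation, which is one reason to be careful when going below the 1.5x factor.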

The default input file that I use has verlet_skin=0.5 which I have found to generally be pretty good. If I am trying to optimize it, I usually vary it from 0.1 to 1.5.
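One way to run such a scan is to generate one input file per verlet_skin value and compare the time/step reported in each run's log. The sketch below only rewrites the text of an input file, assuming the usual `key = value` line format; `set_option` and `skin_scan` are hypothetical helpers, not part of oxDNA or its Python bindings, and launching the simulations is left out.

```python
import re

def set_option(input_text, key, value):
    """Return input_text with `key = value` replaced, or appended if absent."""
    pattern = re.compile(rf"^[ \t]*{re.escape(key)}[ \t]*=.*$", re.MULTILINE)
    line = f"{key} = {value}"
    if pattern.search(input_text):
        # A callable replacement avoids backslash interpretation in `line`.
        return pattern.sub(lambda match: line, input_text)
    return input_text.rstrip("\n") + "\n" + line + "\n"

def skin_scan(input_text, skins=(0.1, 0.2, 0.5, 1.0, 1.5)):
    """Map each verlet_skin value to a modified copy of the input text."""
    return {skin: set_option(input_text, "verlet_skin", skin) for skin in skins}
```

Each returned string can be written to its own file and run for the same number of steps, so that the time/step values are directly comparable.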

lorenzo-rovigatti commented 8 months ago

The box size has a non-trivial effect on performance, and its interaction with other parameters (salt concentration, Verlet skin, and possibly others) makes it hard to come up with a way of automatically setting the best options for a given case. In general, to avoid crashing simulations, oxDNA sets the size of the cells used to build interaction lists so that the memory consumption is not too high. However, "not too high" depends on many things. If you want to optimise performance, it is sometimes worth setting cells_auto_optimisation = false so that oxDNA uses the smallest possible cells (at the cost of memory consumption). If the resulting memory footprint can be handled by your GPU, you'll probably see some (possibly large) performance gains.

As for verlet_skin, a value between 0.1 and 0.2 is usually best.

lorenzo-rovigatti commented 8 months ago

I have added this page to the docs. I'll try to add more info in the future.