m3g / packmol

Packmol - Initial configurations for molecular dynamics simulations
http://m3g.github.io/packmol
MIT License

parallel optimization using OPENMP #62

alphataubio opened this issue 5 months ago

alphataubio commented 5 months ago

Where are the 2-3 most computationally intensive loops in the Fortran code that could be parallelized using OpenMP?

lmiq commented 5 months ago

Thanks for the interest.

The most expensive loops are the "Minimum distance function" computation and its gradient. These are on lines:

160 of computef.f90

160 of computeg.f90

For what it's worth, we are working on a new Packmol version with support for parallelization (https://github.com/m3g/Packmol.jl) and other features, so we are not putting much effort into improving this code beyond possible bug fixes and simple feature additions.

I would be interested in knowing what kind of system you have, to understand the need for such increased performance.

alphataubio commented 5 months ago

For what it's worth, we are working on a new Packmol version with support for parallelization (https://github.com/m3g/Packmol.jl) and other features, so we are not putting much effort into improving this code beyond possible bug fixes and simple feature additions.

OK, then I won't put any effort into parallelizing the Fortran code. Often, computationally intensive algorithms are "trivially parallel": all you need is a #pragma omp for (or, in Fortran, an !$omp parallel do directive) around a few loops.
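For concreteness, here is a minimal Fortran sketch of that "just add a directive" idea (the subroutine name, arguments, and quadratic overlap penalty are made up for illustration; this is not the actual computef.f90 code):

```fortran
! Hypothetical sketch, not the actual Packmol computef.f90 loop:
! a pairwise short-range penalty accumulated with an OpenMP reduction.
subroutine compute_f_sketch(n, x, cutoff2, f)
  implicit none
  integer, intent(in) :: n
  double precision, intent(in) :: x(3, n), cutoff2
  double precision, intent(out) :: f
  integer :: i, j
  double precision :: d2

  f = 0.d0
  ! Each thread accumulates into its own private copy of f,
  ! which OpenMP combines at the end of the loop.
  !$omp parallel do private(j, d2) reduction(+:f)
  do i = 1, n - 1
    do j = i + 1, n
      d2 = (x(1,i) - x(1,j))**2 + (x(2,i) - x(2,j))**2 + (x(3,i) - x(3,j))**2
      if (d2 < cutoff2) f = f + (d2 - cutoff2)**2
    end do
  end do
  !$omp end parallel do
end subroutine compute_f_sketch
```

The real code is more involved than an all-pairs loop, so this only shows where a directive and a reduction would go, not how Packmol actually computes the function.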

I'm surprised you're using Julia instead of C or C++ to modernize Packmol. I haven't heard much about Julia, except that it's a scripted, interpreted language like Python, not usually associated with HPC performance, but I could be wrong.

I would be interested in knowing what kind of system you have, to understand the need for such increased performance.

Spherical lipid bilayer, 400 nm diameter, surface area ~5e7 Å², {753982 cholesterol, 23935 DPPC, 8107 POPG, 181791 SSM, 58833 cardiolipin}, ~1000 transmembrane proteins.

lmiq commented 5 months ago

OK, then I won't put any effort into parallelizing the Fortran code. Often, computationally intensive algorithms are "trivially parallel": all you need is a #pragma omp for (or, in Fortran, an !$omp parallel do directive) around a few loops.

Oh, that's not the case here. One needs to be careful to distribute the tasks properly, otherwise the scaling is extremely poor.
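A hypothetical counter-example (again, not the actual code; the names and the penalty gradient mirror the sketch above and are illustrative assumptions): the gradient loop updates the gradient of both atoms in each pair, so a bare parallel-do directive would race on the gradient array, and the triangular loop also distributes work very unevenly across threads. One thread-safe arrangement looks like this:

```fortran
! Hypothetical sketch, not the actual Packmol computeg.f90 loop.
! Each pair updates g for BOTH atoms, so threads would race on g with a
! bare "parallel do"; here an array reduction gives each thread its own
! private copy of g. schedule(dynamic) rebalances the triangular loop,
! whose outer iterations do progressively less work.
subroutine compute_g_sketch(n, x, cutoff2, g)
  implicit none
  integer, intent(in) :: n
  double precision, intent(in) :: x(3, n), cutoff2
  double precision, intent(out) :: g(3, n)
  integer :: i, j
  double precision :: d2, dv(3), w

  g = 0.d0
  !$omp parallel do private(j, d2, dv, w) reduction(+:g) schedule(dynamic)
  do i = 1, n - 1
    do j = i + 1, n
      dv = x(:, i) - x(:, j)
      d2 = dv(1)**2 + dv(2)**2 + dv(3)**2
      if (d2 < cutoff2) then
        w = 4.d0 * (d2 - cutoff2)   ! derivative of (d2 - cutoff2)**2
        g(:, i) = g(:, i) + w * dv
        g(:, j) = g(:, j) - w * dv
      end if
    end do
  end do
  !$omp end parallel do
end subroutine compute_g_sketch
```

Even this sketch is not free: the array reduction keeps one private copy of g per thread, which for millions of atoms is a real cost, and the best strategy depends on how the pair search partitions the work. That kind of trade-off is why simply dropping a directive on the loop does not scale well.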

I'm surprised you're using Julia instead of C or C++ to modernize Packmol. I haven't heard much about Julia, except that it's a scripted, interpreted language like Python, not usually associated with HPC performance, but I could be wrong.

Julia is not really interpreted; it is compiled just ahead of use (just-in-time). It "feels" like an interpreted language, which makes it very convenient and lets it be used much like Python, but it is designed for high performance. It also has built-in parallelization directives.

Spherical lipid bilayer, 400 nm diameter, surface area ~5e7 Å², {753982 cholesterol, 23935 DPPC, 8107 POPG, 181791 SSM, 58833 cardiolipin}, ~1000 transmembrane proteins.

That's indeed a big system, and also hard to pack because of the complexity of the molecules. I hope we can make the construction of such systems easier with the next Packmol versions.

lmiq commented 5 months ago

As a side comment, for such a big system I would probably try to construct a coarse-grained model and then obtain the all-atom representation.

alphataubio commented 5 months ago

As a side comment, for such a big system I would probably try to construct a coarse-grained model and then obtain the all-atom representation.

That's actually what I'm doing: I'm preparing a CG simulation for the SPICA force field. I'm not doing all-atom.

lmiq commented 5 months ago

In this paper you may find many tips for improving the use of Packmol for such very large systems:

https://pubs.acs.org/doi/full/10.1021/acs.jcim.0c01205