Added new multicolour features from hack the tree

gtribello commented 4 months ago

Description

This PR transfers a lot of functionality from hack-the-tree into plumed2. I have described the features that I have added here:

https://plumed-school.github.io/lessons/23/001/data/NAVIGATION.html

Essentially the new code allows you to pass scalars, vectors, matrices and grids between the actions as opposed to just scalars. You can also pass these vectors, matrices and grids from plumed to python.

The changes made are mostly to the way multicolvar works (although the old syntax remains). I did make some changes to the way function works and some colvars like distance. The reason for these changes is that I have rewritten multicolvar in a way that allows one to reuse the implementations of the colvars. For example, I can calculate multiple distances and store them in a vector like this:

d1: DISTANCE ATOMS1=1,2 ATOMS2=3,4

To apply a function to each of the distances calculated by d1 I do the following;

fd: CUSTOM ARG=d1 FUNC=x*x PERIODIC=NO

Obviously, as d1 is a vector rather than a scalar some changes are required in function too.

I have done the benchmarks that we used for the last of these PRs and got the results shown below:

timings

I think you can conclude from this that the interface is not being slowed down by these changes.

There is a problem with the tests on intel on GitHub actions. I think the problem here is that the drive on the virtual machine gets filled up. I am not sure what to do about this problem. I did a google search and the suggestion was to delete stuff that GitHub installs for you that is taking up a lot of space. I hope we can discuss this problem soon as looking through some of the other action runs it seems to be affecting some of the other workflow runs.

Target release

2.10 release at the end of February.

Type of contribution

[x] changes to code or doc authored by PLUMED developers, or additions of code in the core or within the default modules
[ ] changes to a module not authored by you
[x] new module contribution or edit of a module authored by you

Copyright

[x] I agree to transfer the copyright of the code I have written to the PLUMED developers or to the author of the code I am modifying.

[x] the module I added or modified contains a COPYRIGHT file with the correct license information. Code should be released under an open source license. I also used the command cd src && ./header.sh mymodulename in order to make sure the headers of the module are correct.

Tests

[x] I added a new regtest or modified an existing regtest to validate my changes.
[x] I verified that all regtests are passed successfully on GitHub Actions.

gtribello commented 4 months ago

They are multicolvar features, obviously. Stupid autocorrect.

codecov-commenter commented 4 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 84.06%. Comparing base (460974a) to head (6e67f52).

:exclamation: Current head 6e67f52 differs from pull request most recent head 67766c0. Consider uploading reports for the commit 67766c0 to get more accurate results

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

```diff @@ Coverage Diff @@ ## master #1020 +/- ## ========================================== - Coverage 84.09% 84.06% -0.04% ========================================== Files 614 623 +9 Lines 56767 58947 +2180 ========================================== + Hits 47741 49553 +1812 - Misses 9026 9394 +368 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

GiovanniBussi commented 4 months ago

@gtribello I fixed the intel problem on master already, so it should be fine once you merge.

I will check the rest but it's massive, I am not sure how long it will take

gtribello commented 4 months ago

@GiovanniBussi how did you fix the problem with intel and when did you do it? I merged master into this branch yesterday and I am still seeing the same problem.

GiovanniBussi commented 4 months ago

In master branch the failure was during the build phase, possibly because Intel compiler installation takes some space. I removed some unneeded file to free approx 10Gb. I think they changed the runner a recently and they have slightly less space in the disk

If your failure is in the test phase it is either one test that takes a lot of space or some subtle bug in the code or in Intel compiler

carlocamilloni commented 4 months ago

I made my usual performance tests and there is some impact for alanine dipeptide (12,000/11,300 ns/day) but not for the large SAXS system.

GiovanniBussi commented 4 months ago

I am a bit worried by the many changes also to "normal" colvars. Are they really necessary at this stage?

@gtribello can you also post the input files of the benchmarks you did?

I am trying to browse the tree locally, because the changes are too large to be displayed on GitHub. Here a few things that I noticed:

ActionAtomistic::retrieveAtoms has a new optional argument (force) that is only used in overrides. Which is the logic here? The same is true for clearDerivatives
Is RETURN_DERIV something to be set by the user? I think it's dangerous because the user might set it by mistake. I think we should use a C++ function to set an internal variable, and not this string
Are the new switching functions needed? Did you try to see if CUSTOM is fast enough? I suspect it should be basically identical for these expensive trigonometric functions, but I might be wrong
There's some breaking change, e.g. in TORSION VECTOR1 -> TORSION VECTORA. I am fine with this (it's consistent with GROUPA in COORDINATION) but we need to list all of them in the changes file.

I see that now many normal colvars are multicolvars, and the source code is much less readable. The function calculateCV is quite confusing:

static void calculateCV( const unsigned& mode, const std::vector<double>& masses, const std::vector<double>& charges,
                       const std::vector<Vector>& pos, std::vector<double>& vals, std::vector<std::vector<Vector> >& derivs,
                       std::vector<Tensor>& virial, const ActionAtomistic* aa );

Why is this not a virtual function but a static one? I don't see the sense. Then you use mode to pass information to the static function. I can see what it does, but the code seem more complicated than it was before. Is this necessary?

It looks like LeptonCall duplicates code from switching functions, is this needed?
Why has been DUMPATOMS changed?
Is DUMPVECTOR different from DUMPATOMS? Why does it accept a REPLICA argument?

GiovanniBussi commented 4 months ago

@gtribello I better checked how calculateCV works. If I understood correctly:

calculateCV contains the real implementation
calculate calls calculateCV passing a bunch of arguments. These arguments are standardized and are the some ones for all CVs.
You to this to be able to "steal" the implementation and use it in colvar/MultiColvarTemplate.h.
You do the same for other methods such parseAtomList.

The problem of this design is that it freezes the interface of calculateCV, parseAtomList etc. Our traditional approach has been to ask users to implement a calculate() function, and then have access to a number of features by using public or protected methods. Now, if a CV needs to access the timestep in calculateCV it cannot do it. And we cannot expand the calculateCV interface in the future, because it's frozen.

I see that you are kind of distinguishing between something like:

"elemental" CVs, that you want to reuse as modular components in MultiColvarTemplate, and thus have only access to limited information. If I understood correctly, these variables can only access to positions and similar, and to a locally specified "mode".
"complicated" CVs, which are old fashioned once that can access a lot of stuff.

This makes sense. However, I was wondering if this design is easy to maintain. If so, we should discuss about the interface of calculateCV.

What confuses me is that the ActionAtomistic* argument is always passed in the last position (which is counterintuitive, because this is very much like self in python to me). I think it's always this in the code. Isn't this just another virtual function instead of a static one?

gtribello commented 4 months ago

Hello @GiovanniBussi

I am going to try to answer the points here:

ActionAtomistic::retrieveDerivatives has the force action because if you have an input like this:

d: DISTANCE ATOMS1=1,2 ATOMS2=3,4
f: CUSTOM ARG=d1 FUNC=x*x PERIODIC=NO

The calculation of f is done at the same time as d. I provide an explanation of this here: https://plumed-school.github.io/lessons/23/001/data/Tasks.html. I envisage the force option only being used in the override methods for clearDerivatives and retrieveAtoms that are in ActionWithVector.

In the new version of KDE (i.e. histograms) I use the switchingFunction class to compute the Gaussian Kernel. The RETURN_DERIV would only be used in here in my imagination. The reason for this is that I have had people in the past asking me to do all sorts of things with the gaussian kernels in the histograms. I figure if I use SwitchingFunction I can let them do whatever they want from the input file as they can specify a CUSTOM kernel using SwitchingFunction.
The new switching function types are not necessary. I have removed them. I added them because you have to specifically set D_MAX=R_0+D_0 to make them work. You can do that in the input for CUSTOM though and you are right that it is not a good enough reason to have the extra code.
Happy to put those changes to VECTORA and VECTORB in the CHANGELOG. I had to make this change to make inputs like this work:

t: TORSION VECTORA1=1,2 AXIS1=3,4 VECTORB1=5,6 VECTORA2=7,8 AXIS2=9,10 VECTORB2=11,12

for calculating multiple torsions at the same time.

I'll come back to colvars and MultiColvars
LeptonCall doesn't duplicate code from switching function. It allows you to pass multiple arguments to Lepton and compute a custom function. In other words, it is the code that was in CUSTOM. I wanted to reuse custom functions for other things, which is why I wrote this helper class.
DUMPATOMS has changed because you can now do:

q: CHARGE ATOMS=1-10
DUMPATOMS ATOMS=1-10 ARG=q

The charges are then output in a fourth column of the xyz file. I can also use this to replace the old DUMPMULTICOLVAR command. In other words, it reduces the amount of code we have to maintain.

DUMPVECTOR would be used to do this:

q: CHARGE ATOMS=1-10
DUMPVECTOR ARG=q

It outputs the 10 elements in the vector q in a column as opposed to on a row. You can see an example where this command is used in src/analysis/WhamWeights.cpp. The idea with the replica command was to provide a way for the user to not output the same information from multiple replicas. I am not sure it is useful. I can remove it if you prefer?

On the Colvar thing you have understood what I was aiming to do. The function is static because that is required to make it work with MultiColvarTemplate. I don't want to have to create a object of colvar type inside MultiColvarTemplate because that would make a mess as the only constructor that we have for the action requires an ActionOptions as input and I don't have the input to construct an ActionOptions for a DISTANCE action in MultiColvarTemplate.

The reason for passing the pointer to ActionAtomistic can be seen if you look in src/colvar/SelectMassCharge:

void SelectMassCharge::calculateCV( const unsigned& mode, const std::vector<double>& masses, const std::vector<double>& charges,
                                    const std::vector<Vector>& pos, std::vector<double>& vals, std::vector<std::vector<Vector> >& derivs,
                                    std::vector<Tensor>& virial, const ActionAtomistic* aa ) {
  if( aa->getName().find("MASSES")!=std::string::npos ) vals[0]=masses[0];
  else if( aa->chargesWereSet ) vals[0]=charges[0];
}

Passing that pointer also kind of solves your problem with getting the time as you can get it from the pointer to the underlying ActionAtomistic object. The interface there is pretty general. It allows you to pass in positions masses and changes and return one or multiple values, derivatives thereof and viral components thereof. In general, I don't see many reasons for implementing more of these elemental CVs (incidentally I really like this term). I mean what other elemental CVs are there to implement?

GiovanniBussi commented 4 months ago

@gtribello I checked the log of the intel build on my workstation and it warns me about this line for(unsigned j=0; j<natoms; ++i) should likely be for(unsigned j=0; j<natoms; ++j)

I don't think it is related to the current problem with intel , but I guess it's an error anyway and should be fixed

gtribello commented 4 months ago

Hi @GiovanniBussi

So I merged master into this branch and updated the change log. I also removed the RETURN_DERIV option from switchingFunction, which is what I think you were objecting to and reworked the code to ensure that I no longer need it. Also added destructors to places where you might get dependencies as we discussed this morning. Any suggestions as to how we can test that this works OK?

If you remove the rt-fes-periodic test everything runs on intel and it no longer hangs. You still get an error though, which you can see here:

https://github.com/plumed/plumed2/actions/runs/8084516980/job/22089957387

You can see that the problem is differences like this:

<    1.1100      inf      nan
---
>    0.0000      inf     nan

So the nan is slightly offset in the different versions. So it is not really anything serious. Just something really annoying.

gtribello commented 4 months ago

Hi @GiovanniBussi

I had a look into the problem with the differences from my previous message. I can fix this by changing the script regtest/scripts/fixezeros.sh from:

sed "s/-nan/nan/g" $file.$$.tmp > $file.$$.tmp2

to:

sed "s/-nan/ nan/g" $file.$$.tmp > $file.$$.tmp2

Does making that change make sense to you? I am not sure I understand how signs work with format strings well enough to justify it myself.

GiovanniBussi commented 4 months ago

If this works without modifying other tests I think you can do it...

gtribello commented 4 months ago

ciao @GiovanniBussi

I worked out why the code was hanging on when it is run with the intel compiler. Basically std::isinf doesn't return 1 if it is passed the number infinity when you use the intel compiler. This is supposedly "expected behaviour" so I have found a workaround in the place that I was using std::isinf. You have to laugh otherwise you would cry.

There are still some failures but they are now due to the problem with -nan that I found yesterday. Unfortunately, if I make the change to fixerzeros.sh that I sent you yesterday I get some failures in other places. I hope that one should be an easier fix though.

gtribello commented 4 months ago

Hi @GiovanniBussi

How do you feel about changing the regtests:

basic/rt-lepton basic/rt-lepton-asmjit

So that the print line in these plumed inputs is:

PRINT ARG=(f.*) FMT=%9.6f FILE=colvar
PRINT ARG=(c.*) FMT=%9.6f FILE=colvarc

(they are currently free format)

I think if I did that I could then change the fixerzeros.sh script in the way I described previously and have all the tests passing. Would that be OK?

GiovanniBussi commented 4 months ago

Makes sense!

Sent from Gmail mobile

Il giorno gio 29 feb 2024 alle 13:25 Gareth Tribello < @.***> ha scritto:

Hi @GiovanniBussi https://github.com/GiovanniBussi

How do you feel about changing the regtests:

basic/rt-lepton basic/rt-lepton-asmjit

So that the print line in these plumed inputs is:

PRINT ARG=(f.) FMT=%9.6f FILE=colvar PRINT ARG=(c.) FMT=%9.6f FILE=colvarc

(they are currently free format)

I think if I did that I could then change the fixerzeros.sh script in the way I described previously and have all the tests passing. Would that be OK?

— Reply to this email directly, view it on GitHub https://github.com/plumed/plumed2/pull/1020#issuecomment-1971029788, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABBNCXQZEKO2DK2JZEXOV43YV4O3VAVCNFSM6AAAAABDC352YWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNZRGAZDSNZYHA . You are receiving this because you were mentioned.Message ID: @.***>

GiovanniBussi commented 4 months ago

@gtribello once you have all green can you merge and push also a tag named 2.10a? Then we can think about the announcement in the next few days. But we need the tag today. Thanks!

gtribello commented 4 months ago

OK @GiovanniBussi

To create the tag is the command just

git tag 2.10a

on my machine? How do I then transfer the tag to the remote. With a push?

GiovanniBussi commented 4 months ago

git tag v2.10a

git push origin v2.10a

On Thu, Feb 29, 2024 at 4:05 PM Gareth Tribello @.***> wrote:

OK @GiovanniBussi https://github.com/GiovanniBussi

To create the tag is the command just

git tag 2.10a

on my machine? How do I then transfer the tag to the remote. With a push?

— Reply to this email directly, view it on GitHub https://github.com/plumed/plumed2/pull/1020#issuecomment-1971337804, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABBNCXSSWAVNEBPOVYPFC33YV5BUVAVCNFSM6AAAAABDC352YWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNZRGMZTOOBQGQ . You are receiving this because you were mentioned.Message ID: @.***>

gtribello commented 4 months ago

OK @GiovanniBussi it is done. Hurrah!!

plumed / plumed2