Parameter sets - Githubissues

proteneer commented 6 years ago

I noticed that the parameters in

https://github.com/isayev/ASE_ANI/blob/master/ANI-c08f-ntwk/rHCNO-4.6A_16-3.1A_a4-8.params

differ from what's in the paper:

TM = 1
Rcr = 4.6000e+00
Rca = 3.1000e+00
EtaR = [1.6000000e+01]
ShfR = [5.0000000e-01,7.5625000e-01,1.0125000e+00,1.2687500e+00,1.5250000e+00,1.7812500e+00,2.0375000e+00,2.2937500e+00,2.5500000e+00,2.8062500e+00,3.0625000e+00,3.3187500e+00,3.5750000e+00,3.8312500e+00,4.0875000e+00,4.3437500e+00]
Zeta = [8.0000000e+00]
ShfZ = [0.0000000e+00,7.8539816e-01,1.5707963e+00,2.3561945e+00,3.1415927e+00,3.9269908e+00,4.7123890e+00,5.4977871e+00]
EtaA = [6.0000000e+00]
ShfA = [5.0000000e-01,1.1500000e+00,1.8000000e+00,2.4500000e+00]
Atyp = [H,C,N,O]

Namely, this produces a feature vector with 384 floats, as opposed to the ~700 floats mentioned in the paper. In addition, the NN itself seems to be 256x128x64x1, as opposed to 128x128x64x1:
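For anyone checking the 384 figure, here is a minimal sketch of the arithmetic. It assumes the usual ANI-style AEV layout: one radial sub-vector per neighbor species and one angular sub-vector per unordered pair of neighbor species. The function name `aev_length` is made up for illustration and is not part of any ANI code.

```python
def aev_length(n_species, n_eta_r, n_shf_r, n_eta_a, n_zeta, n_shf_a, n_shf_z):
    """Return (radial, angular, total) AEV element counts.

    Assumes one radial block per neighbor species and one angular block
    per unordered species pair (pairs of the same species included).
    """
    radial = n_species * n_eta_r * n_shf_r
    n_pairs = n_species * (n_species + 1) // 2
    angular = n_pairs * n_eta_a * n_zeta * n_shf_a * n_shf_z
    return radial, angular, radial + angular


# Counts read off the quoted .params file:
# Atyp has 4 species, EtaR 1 value, ShfR 16 values,
# EtaA 1 value, Zeta 1 value, ShfA 4 values, ShfZ 8 values.
radial, angular, total = aev_length(
    n_species=4, n_eta_r=1, n_shf_r=16,
    n_eta_a=1, n_zeta=1, n_shf_a=4, n_shf_z=8,
)
print(radial, angular, total)  # 64 320 384
```

Under the same assumed layout, 32 radial shifts with 8 x 8 angular shifts would give 128 + 640 = 768 elements. Those counts are only an illustration of how a larger AEV could arise, not values taken from this repo.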


!InputFile for Force Prediction Network
sflparamsfile=rHCNO-4.6A_16-3.1A_a4-8.params
ntwkStoreDir=networks/
atomEnergyFile=../../sae_6-31gd.dat
nmax=0! Maximum number of iterations (0 = inf)
tolr=100! Tolerance - early stopping
emult=0.25!Multiplier by eta after tol switch
eta=0.0001! Eta -- Learning rate
tcrit=1.0E-5! Eta termination criterion
tmax=0! Maximum time (0 = inf)
tbtchsz=1024
vbtchsz=1024
gpuid=2
seed=82131921
runtype=ANNP_CREATE_HDNN_AND_TRAIN!Create and train a HDN network
network_setup {
  inputsize=384;
  layer [
        nodes=256;
        activation=5;
        type=0;
        dropout=0;
        dropset=0.5;
        maskupdate=0.999;
        maxnorm=1;
        norm=3.0;
        normupdate=0.9999;
  ]
  layer [
        nodes=128;
        activation=5;
        type=0;
        dropout=0;
        dropset=0.5;
        maskupdate=0.999;
        maxnorm=1;
        norm=3.0;
        normupdate=0.9999;
  ]
  layer [
        nodes=64;
        activation=5;
        type=0;
        dropout=0;
        dropset=0.5;
        maskupdate=0.999;
        maxnorm=1;
        norm=3.0;
        normupdate=0.9999;
  ]
  layer [
        nodes=1;
        activation=6;
        type=0;
  ]
}
adptlrn=OFF ! Adaptive learning (OFF,RMSPROP)
decrate=0.9 !Decay rate of RMSPROP
moment=ADAM! Turn on momentum or nesterov momentum (OFF,CNSTTEMP,TMANNEAL,REGULAR,NESTEROV)
mu=0.99 ! Mu factor for momentum

Do you mind clarifying which canonical parameters are needed to reproduce the paper's results given the dataset?

proteneer commented 6 years ago

PS: the main issue is that we're having some difficulty reproducing the GDB-10 results (we've reproduced the val/test results).

Jussmith01 commented 6 years ago

Okay, a couple of things.

1) The network in this repo is not the same one as in the paper. This one was trained on the ANI-1 data set plus some amino acid and peptide data. Also, through hyperparameter searching we determined that the AEV parameters used here work just as well as the 768-element AEV on the ANI-1 + peptide data set.

2) In the paper, for the GDB-10 test, we trim conformers with energies more than 300 kcal/mol above the minimum of each set of conformers. This may not have been explicitly mentioned in the paper, but it is clear from the range in figure 4 that this is what we are comparing. The high energy GDB-10 stuff is VERY hard to fit if you are using the trimmed (@275 kcal/mol) version of the ANI-1 data set (which is what we used in the paper and recently published as the "low" energy part of the ANI-1 data set).
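The trimming described above can be sketched roughly as follows. This is only an illustration, not the code used for the paper: the function name `trim_high_energy` is hypothetical, and it assumes energies for one molecule's conformers are given in Hartree and converted with an approximate factor.

```python
import numpy as np

HARTREE_TO_KCALMOL = 627.509  # approximate conversion factor (assumed units)


def trim_high_energy(energies_hartree, cutoff_kcalmol=300.0):
    """Return a boolean mask that keeps conformers whose energy is within
    `cutoff_kcalmol` of the lowest-energy conformer in the same set."""
    energies = np.asarray(energies_hartree, dtype=float)
    relative = (energies - energies.min()) * HARTREE_TO_KCALMOL
    return relative <= cutoff_kcalmol


# Toy energies for three conformers of one molecule (made-up numbers).
energies = [-40.5, -40.4, -39.9]
mask = trim_high_energy(energies)
print(mask)  # [ True  True False]
```

The mask is computed per set of conformers, so each molecule is compared against its own minimum. Passing a different `cutoff_kcalmol` (for example 275 or 100) applies the other thresholds mentioned in this thread.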

As it turns out, I recently built an ensemble of original ANI-1 networks (5 of our models trained on a 5-fold cross-validation style split of the ANI-1 "low" energy data set) to compare on a new benchmark I have been developing. The new networks were developed with the same parameter file used in this repository. For the ensemble we get a prediction RMSE of 1.7 kcal/mol. You can view these results here (this notebook will also show you how we do the comparison):

https://github.com/Jussmith01/ANI-Tools/blob/master/notebooks/eval_testset.ipynb

If you'd like me to make the ANI-1 ensemble available on this repo for comparison I can do that.
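As a rough sketch of how an ensemble RMSE like the one quoted above could be computed, assuming the ensemble prediction is the plain mean of the member predictions (the linked notebook is the authoritative version; `ensemble_rmse` is a hypothetical helper):

```python
import numpy as np


def ensemble_rmse(member_predictions, reference):
    """RMSE of the mean ensemble prediction against reference energies.

    member_predictions: array-like of shape (n_models, n_conformers)
    reference:          array-like of shape (n_conformers,)
    Both are assumed to be in the same units (e.g. kcal/mol).
    """
    preds = np.asarray(member_predictions, dtype=float)
    ref = np.asarray(reference, dtype=float)
    mean_pred = preds.mean(axis=0)  # average over ensemble members
    return float(np.sqrt(np.mean((mean_pred - ref) ** 2)))


# Toy example with two models and two conformers (made-up numbers).
rmse = ensemble_rmse([[1.0, 2.0], [3.0, 4.0]], [2.0, 1.0])
print(rmse)
```

In the toy example the mean predictions are 2.0 and 3.0, so the errors are 0 and 2 and the RMSE is the square root of 2.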

proteneer commented 6 years ago

@Jussmith01 Thank you for the very detailed explanation and the notebook. We've confirmed this internally: our test scores become significantly better after pruning the high-energy conformations. For many of the applications we care about, we typically only consider conformations in the <100 kcal/mol range (you report using 300 kcal/mol).

We did some analysis on the training set as well. Of the 22 million conformations you provide, about 6 million have energies more than 100 kcal/mol above the minimum. It looks like this dataset has a fairly large number of outliers, some with rather interesting geometries (smaller C=O bonds, as an example).

Jussmith01 commented 6 years ago

6M > 100 kcal/mol of the 22M sounds about right. Regular normal mode sampling will by default bias conformations towards energy minima. We have since refined our methods and have a soon-to-be-submitted paper that covers this topic a little. As for weird geometries, they can happen when using a harmonic approximation to determine the structural perturbations. However, it is a very cheap way to generate non-equilibrium conformations, and from what we have seen it works well when you filter out high energy conformations (which tend to be the weird structures).

isayev / ASE_ANI

Parameter sets #14