qzhu2017 / CSP_BO

Crystal Structure Prediction with Bayesian Optimization

To try Pt-H2O data #17

Open qzhu2017 opened 4 years ago

qzhu2017 commented 4 years ago
qzhu@cms CSP_BO (master) $ python example_validate.py models/test.json database/PtHO.db 
------Gaussian Process Regression------
Kernel: 0.925**2 *Dot(length=4.256) 1 energy (0.002) 104 forces (0.024)

load the GP model from  models/test.json
Train Energy [   1]: R2 0.9975 MAE  0.000 RMSE  0.000
Train Forces [ 312]: R2 0.9645 MAE  0.013 RMSE  0.018
   1 E: -5.265 -> -5.265  F_MSE:  0.033 
   2 E: -5.265 -> -5.265  F_MSE:  0.033 
   3 E: -5.265 -> -5.265  F_MSE:  0.035 
   4 E: -5.265 -> -5.265  F_MSE:  0.034 
   5 E: -5.265 -> -5.265  F_MSE:  0.034 
   6 E: -5.265 -> -5.265  F_MSE:  0.034 
   7 E: -5.265 -> -5.265  F_MSE:  0.033 
   8 E: -5.265 -> -5.265  F_MSE:  0.033 
   9 E: -5.265 -> -5.265  F_MSE:  0.034 
  10 E: -5.265 -> -5.265  F_MSE:  0.034 
  11 E: -5.265 -> -5.265  F_MSE:  0.034 
  12 E: -5.265 -> -5.265  F_MSE:  0.034 
  13 E: -5.265 -> -5.265  F_MSE:  0.033 
  14 E: -5.265 -> -5.265  F_MSE:  0.033 
  15 E: -5.265 -> -5.265  F_MSE:  0.034 
  16 E: -5.265 -> -5.265  F_MSE:  0.034 
  17 E: -5.265 -> -5.265  F_MSE:  0.034 
  18 E: -5.265 -> -5.265  F_MSE:  0.034 
  19 E: -5.265 -> -5.265  F_MSE:  0.033 
  20 E: -5.265 -> -5.265  F_MSE:  0.033 
  21 E: -5.265 -> -5.265  F_MSE:  0.034 
  22 E: -5.265 -> -5.265  F_MSE:  0.034 
  23 E: -5.265 -> -5.265  F_MSE:  0.034 
  24 E: -5.265 -> -5.265  F_MSE:  0.034 
  25 E: -5.265 -> -5.265  F_MSE:  0.033 
  26 E: -5.265 -> -5.265  F_MSE:  0.033 
  27 E: -5.265 -> -5.265  F_MSE:  0.034 
  28 E: -5.265 -> -5.265  F_MSE:  0.034 
  29 E: -5.265 -> -5.265  F_MSE:  0.034 
  30 E: -5.265 -> -5.265  F_MSE:  0.034 
  31 E: -5.265 -> -5.265  F_MSE:  0.032 
  32 E: -5.265 -> -5.265  F_MSE:  0.033 
  33 E: -5.265 -> -5.265  F_MSE:  0.034 
  34 E: -5.265 -> -5.265  F_MSE:  0.033 
  35 E: -5.265 -> -5.265  F_MSE:  0.035 
  36 E: -5.265 -> -5.265  F_MSE:  0.034 
  37 E: -5.265 -> -5.265  F_MSE:  0.033 
  38 E: -5.265 -> -5.265  F_MSE:  0.034 
  39 E: -5.265 -> -5.265  F_MSE:  0.034 
  40 E: -5.265 -> -5.265  F_MSE:  0.034 
  41 E: -5.265 -> -5.265  F_MSE:  0.034 
  42 E: -5.265 -> -5.265  F_MSE:  0.034 
  43 E: -5.265 -> -5.265  F_MSE:  0.032 
  44 E: -5.265 -> -5.265  F_MSE:  0.033 
  45 E: -5.265 -> -5.265  F_MSE:  0.035 
  46 E: -5.265 -> -5.265  F_MSE:  0.033 
  47 E: -5.265 -> -5.265  F_MSE:  0.035 
  48 E: -5.265 -> -5.265  F_MSE:  0.034 
  49 E: -5.265 -> -5.265  F_MSE:  0.033 
  50 E: -5.265 -> -5.265  F_MSE:  0.034 
  51 E: -5.265 -> -5.265  F_MSE:  0.034 
  52 E: -5.265 -> -5.265  F_MSE:  0.033 
  53 E: -5.265 -> -5.265  F_MSE:  0.035 
  54 E: -5.265 -> -5.265  F_MSE:  0.034 
  55 E: -5.265 -> -5.265  F_MSE:  0.033 
  56 E: -5.265 -> -5.265  F_MSE:  0.033 
  57 E: -5.265 -> -5.265  F_MSE:  0.035 
  58 E: -5.265 -> -5.265  F_MSE:  0.033 
  59 E: -5.265 -> -5.265  F_MSE:  0.035 
  60 E: -5.265 -> -5.265  F_MSE:  0.034 
  61 E: -5.265 -> -5.265  F_MSE:  0.033 
  62 E: -5.265 -> -5.265  F_MSE:  0.034 
  63 E: -5.265 -> -5.265  F_MSE:  0.034 
  64 E: -5.265 -> -5.265  F_MSE:  0.033 
  65 E: -5.265 -> -5.265  F_MSE:  0.035 
  66 E: -5.265 -> -5.265  F_MSE:  0.034 
  67 E: -5.265 -> -5.265  F_MSE:  0.033 
  68 E: -5.265 -> -5.265  F_MSE:  0.033 
  69 E: -5.265 -> -5.265  F_MSE:  0.035 
  70 E: -5.265 -> -5.265  F_MSE:  0.033 
  71 E: -5.265 -> -5.265  F_MSE:  0.035 
  72 E: -5.265 -> -5.265  F_MSE:  0.034 
  73 E: -5.265 -> -5.265  F_MSE:  0.033 
  74 E: -5.265 -> -5.265  F_MSE:  0.034 
  75 E: -5.265 -> -5.265  F_MSE:  0.034 
  76 E: -5.265 -> -5.265  F_MSE:  0.033 
  77 E: -5.265 -> -5.265  F_MSE:  0.034 
  78 E: -5.265 -> -5.265  F_MSE:  0.034 
  79 E: -5.265 -> -5.265  F_MSE:  0.033 
  80 E: -5.265 -> -5.265  F_MSE:  0.033 
  81 E: -5.265 -> -5.265  F_MSE:  0.034 
  82 E: -5.265 -> -5.265  F_MSE:  0.033 
  83 E: -5.265 -> -5.265  F_MSE:  0.035 
  84 E: -5.265 -> -5.265  F_MSE:  0.034 
  85 E: -5.265 -> -5.265  F_MSE:  0.033 
  86 E: -5.265 -> -5.265  F_MSE:  0.033 
  87 E: -5.265 -> -5.265  F_MSE:  0.035 
  88 E: -5.265 -> -5.265  F_MSE:  0.033 
  89 E: -5.265 -> -5.265  F_MSE:  0.034 
  90 E: -5.265 -> -5.265  F_MSE:  0.034 
  91 E: -5.265 -> -5.265  F_MSE:  0.033 
  92 E: -5.265 -> -5.265  F_MSE:  0.033 
  93 E: -5.265 -> -5.265  F_MSE:  0.034 
  94 E: -5.265 -> -5.265  F_MSE:  0.033 
  95 E: -5.265 -> -5.265  F_MSE:  0.034 
  96 E: -5.265 -> -5.265  F_MSE:  0.034 
  97 E: -5.265 -> -5.265  F_MSE:  0.033 
  98 E: -5.265 -> -5.265  F_MSE:  0.034 
  99 E: -5.265 -> -5.265  F_MSE:  0.034 
 100 E: -5.265 -> -5.265  F_MSE:  0.034 
Test Energy [ 100]: R2 0.5746 MAE  0.000 RMSE  0.000
Test Forces [57600]: R2 0.9087 MAE  0.020 RMSE  0.034
5326.568 seconds elapsed
save the figure to  E.png
save the figure to  F.png


The results are not bad, just too slow. Need to fix #10 before getting back to this issue.

qzhu2017 commented 4 years ago

10/30/2020: The CuPy version (35 s/structure) is faster than 24 CPUs (53 s/structure).
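The per-structure figures appear to follow from the elapsed times in the logs divided by the number of test structures. A small sketch of that arithmetic, assuming the 100-structure run at the top of this thread is the 24-CPU one (the helper name is only illustrative):

```python
def seconds_per_structure(elapsed_seconds, n_structures):
    """Average wall time per validated structure."""
    return elapsed_seconds / n_structures

# Elapsed times and structure counts copied from the logs in this thread.
cpu = seconds_per_structure(5326.568, 100)  # first run, 100 test structures
gpu = seconds_per_structure(176.038, 5)     # CuPy run, 5 test structures

print(f"CPU: {cpu:.1f} s/structure, GPU: {gpu:.1f} s/structure")
```

Note that the elapsed time also includes loading the model and evaluating the training set, so these are rough averages rather than pure prediction times.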

qzhu@cms CSP_BO (master) $ python example_validate.py models/PtHO.json database/PtHO.db
------Gaussian Process Regression------
Kernel: 0.925**2 *Dot(length=4.256) 1 energy (0.002) 104 forces (0.024)

load the GP model from  models/PtHO.json
gpu
Train Energy [   1]: R2 0.9975 MAE  0.000 RMSE  0.000
Train Forces [ 312]: R2 0.9645 MAE  0.013 RMSE  0.018
False
   1 E: -5.265 -> -5.265  F_MSE:  0.033 
   2 E: -5.265 -> -5.265  F_MSE:  0.033 
   3 E: -5.265 -> -5.265  F_MSE:  0.035 
   4 E: -5.265 -> -5.265  F_MSE:  0.034 
   5 E: -5.265 -> -5.265  F_MSE:  0.034 
Test Energy [   5]: R2 0.9825 MAE  0.000 RMSE  0.000
Test Forces [2880]: R2 0.9087 MAE  0.020 RMSE  0.034
176.038 seconds elapsed
save the figure to  E.png
save the figure to  F.png

qzhu2017 commented 4 years ago

11/08/2020

qzhu@cms CSP_BO (master) $ python example_validate.py models/PtHO.json database/PtHO.db 
------Gaussian Process Regression------
Kernel: 0.925**2 *Dot(length=4.256) 1 energy (0.002) 104 forces (0.024)

load the GP model from  models/PtHO.json
gpu
Train Energy [   1]: R2 0.9975 MAE  0.000 RMSE  0.000
Train Forces [ 312]: R2 0.9645 MAE  0.013 RMSE  0.018
False
   1 E: -5.265 -> -5.265  F_MSE:  0.033 
   2 E: -5.265 -> -5.265  F_MSE:  0.033 
   3 E: -5.265 -> -5.265  F_MSE:  0.035 
   4 E: -5.265 -> -5.265  F_MSE:  0.034 
   5 E: -5.265 -> -5.265  F_MSE:  0.034 
Test Energy [   5]: R2 0.9825 MAE  0.000 RMSE  0.000
Test Forces [2880]: R2 0.9087 MAE  0.020 RMSE  0.034
66.691 seconds elapsed
save the figure to  E.png
save the figure to  F.png

qzhu2017 commented 4 years ago

@yanxon Can you update the code and run the following command?

$ python example_sampling.py database/PtHO.db > log-PtHO &

This script constructs the GPR force model for the PtHO data. It will probably take a couple of hours. We will discuss the results tomorrow.

yanxon commented 4 years ago

@qzhu2017

I am running this now. I will update the results after it's done.

qzhu2017 commented 4 years ago

At some point, it will complain that CUDA is running out of memory:

  File "example_sampling.py", line 55, in <module>
    model.fit()
  File "/scratch/qzhu/github/CSP_BO/cspbo/gaussianprocess_ef.py", line 85, in fit
    params, loss = self.optimize(obj_func, hyper_params, hyper_bounds)
  File "/scratch/qzhu/github/CSP_BO/cspbo/gaussianprocess_ef.py", line 448, in optimize
    jac=True, options={'maxiter': 10, 'ftol': 1e-3})
  File "/scratch/qzhu/anaconda3/lib/python3.7/site-packages/scipy/optimize/_minimize.py", line 618, in minimize
    callback=callback, **options)
  File "/scratch/qzhu/anaconda3/lib/python3.7/site-packages/scipy/optimize/lbfgsb.py", line 308, in _minimize_lbfgsb
    finite_diff_rel_step=finite_diff_rel_step)
  File "/scratch/qzhu/anaconda3/lib/python3.7/site-packages/scipy/optimize/optimize.py", line 262, in _prepare_scalar_function
    finite_diff_rel_step, bounds, epsilon=epsilon)
  File "/scratch/qzhu/anaconda3/lib/python3.7/site-packages/scipy/optimize/_differentiable_functions.py", line 76, in __init__
    self._update_fun()
  File "/scratch/qzhu/anaconda3/lib/python3.7/site-packages/scipy/optimize/_differentiable_functions.py", line 166, in _update_fun
    self._update_fun_impl()
  File "/scratch/qzhu/anaconda3/lib/python3.7/site-packages/scipy/optimize/_differentiable_functions.py", line 73, in update_fun
    self.f = fun_wrapped(self.x)
  File "/scratch/qzhu/anaconda3/lib/python3.7/site-packages/scipy/optimize/_differentiable_functions.py", line 70, in fun_wrapped
    return fun(x, *args)
  File "/scratch/qzhu/anaconda3/lib/python3.7/site-packages/scipy/optimize/optimize.py", line 74, in __call__
    self._compute_if_needed(x, *args)
  File "/scratch/qzhu/anaconda3/lib/python3.7/site-packages/scipy/optimize/optimize.py", line 68, in _compute_if_needed
    fg = self.fun(x, *args)
  File "/scratch/qzhu/github/CSP_BO/cspbo/gaussianprocess_ef.py", line 68, in obj_func
    params, eval_gradient=True, clone_kernel=False)
  File "/scratch/qzhu/github/CSP_BO/cspbo/gaussianprocess_ef.py", line 406, in log_marginal_likelihood
    K, K_gradient = kernel.k_total_with_grad(self.train_x)
  File "/scratch/qzhu/github/CSP_BO/cspbo/Dot_mb.py", line 123, in k_total_with_grad
    C_ff, C_ff_s, C_ff_l = self.kff_many(data1[key1], data2[key2], True, True)
  File "/scratch/qzhu/github/CSP_BO/cspbo/Dot_mb.py", line 250, in kff_many
    C[i] = K_ff(x1, x_all, dx1dr, dxdr_all, sigma2, sigma02, zeta, grad, mask, device=self.device)
  File "/scratch/qzhu/github/CSP_BO/cspbo/Dot_mb.py", line 446, in K_ff
    tmp = (dx1dr[:,None,:,None,:] * d2D_dx1dx2[:,:,:,:,None]).sum(axis=(2)) #ijlm
  File "cupy/core/core.pyx", line 940, in cupy.core.core.ndarray.__mul__
  File "cupy/core/_kernel.pyx", line 836, in cupy.core._kernel.ufunc.__call__
  File "cupy/core/_kernel.pyx", line 340, in cupy.core._kernel._get_out_args
  File "cupy/core/core.pyx", line 134, in cupy.core.core.ndarray.__init__
  File "cupy/cuda/memory.pyx", line 518, in cupy.cuda.memory.alloc
  File "cupy/cuda/memory.pyx", line 1085, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1106, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 934, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 949, in cupy.cuda.memory.SingleDeviceMemoryPool._malloc
  File "cupy/cuda/memory.pyx", line 697, in cupy.cuda.memory._try_malloc
cupy.cuda.memory.OutOfMemoryError: out of memory to allocate 1775955456 bytes (total 9663106048 bytes)

A possible fix is to split the training data into a few parts.
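One way to read this suggestion is to evaluate the failing broadcast from `K_ff` in slices, so the 5-D temporary never has to fit in memory at once. The sketch below is not the code in `Dot_mb.py`; it only reproduces the expression from the traceback, with array shapes inferred from the indexing and the `#ijlm` comment. The function name and `chunk` parameter are hypothetical. It uses NumPy so it can run anywhere; since CuPy mirrors the NumPy array API, swapping `np` for `cupy` is expected to work, but that is an assumption and has not been tested here.

```python
import numpy as np

def contract_in_chunks(dx1dr, d2D_dx1dx2, chunk=64):
    """Chunked version of
        (dx1dr[:, None, :, None, :] * d2D_dx1dx2[:, :, :, :, None]).sum(axis=2)
    Assumed shapes: dx1dr is (i, k, m), d2D_dx1dx2 is (i, j, k, l).
    The result has shape (i, j, l, m).
    """
    n_i, n_j, n_k, n_l = d2D_dx1dx2.shape
    n_m = dx1dr.shape[-1]
    out = np.empty((n_i, n_j, n_l, n_m),
                   dtype=np.result_type(dx1dr, d2D_dx1dx2))
    # Slice along axis 1 (j) so the temporary is (i, chunk, k, l, m)
    # instead of (i, j, k, l, m).
    for start in range(0, n_j, chunk):
        stop = min(start + chunk, n_j)
        block = d2D_dx1dx2[:, start:stop]
        out[:, start:stop] = (
            dx1dr[:, None, :, None, :] * block[:, :, :, :, None]
        ).sum(axis=2)
    return out

# Small self-check against the unchunked expression.
rng = np.random.default_rng(0)
a = rng.standard_normal((3, 4, 2))     # (i, k, m)
b = rng.standard_normal((3, 7, 4, 5))  # (i, j, k, l)
full = (a[:, None, :, None, :] * b[:, :, :, :, None]).sum(axis=2)
print(np.allclose(contract_in_chunks(a, b, chunk=3), full))
```

The same contraction can also be written as `np.einsum("ikm,ijkl->ijlm", dx1dr, d2D_dx1dx2)`, which avoids building the 5-D product explicitly. Whether either variant is enough to stay under the GPU memory limit for the full PtHO training set would need to be checked.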

yanxon commented 4 years ago

@qzhu2017 The results just came in. For me, it seems like the calculation stopped at step 1154 with: Kernel: 50.000**2 *Dot(length=4.118) 4 energy (0.005) 154 forces (0.050)

I believe the computation exited because CUDA ran out of memory as well.

yanxon commented 4 years ago

Hi @qzhu2017

If I want to continue the PtH2O calculation, do I just use this command?

python3 example_sampling.py models/test.json database/PtHO.db > log

qzhu2017 commented 4 years ago

@yanxon You can also modify the script to make sure that you don't start with structure 0.

yanxon commented 4 years ago

> @yanxon You can also modify the script to make sure that you don't start with structure 0.

I see. This is just modifying the range of the for loop.
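A minimal sketch of what changing the loop range could look like. The contents of `example_sampling.py` are not shown in this thread, so `iter_from` and `start` are hypothetical names, and the list below only stands in for the structures read from the database:

```python
def iter_from(structures, start=0):
    """Yield (index, structure) pairs, skipping everything before `start`."""
    for idx in range(start, len(structures)):
        yield idx, structures[idx]

# Stand-in for the structures loaded from the database.
structures = ["s0", "s1", "s2", "s3", "s4"]

# Resume at index 3 instead of starting from structure 0.
for idx, struc in iter_from(structures, start=3):
    print(idx, struc)
```

When resuming, `start` would be set to the first structure that the previous run did not process, assuming the step numbers in the log correspond to structure indices.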