libAtoms / QUIP

libAtoms/QUIP molecular dynamics framework: https://libatoms.github.io

using universal SOAP and parse_string error #387

Open jungsdao opened 2 years ago

jungsdao commented 2 years ago

Hello! Thank you for developing the GAP workflow (and code), which is indeed useful. I want to incorporate universal SOAP into my GAP training workflow. I have four elements in my system (organic molecule adsorption on an Rh metal surface). Using universal SOAP, I got the following suggested SOAP hyperparameters for my system.

{1: [{'cutoff': 1.2, 'cutoff_transition_width': 0.3, 'atom_gaussian_width': 0.15}, {'cutoff': 1.8, 'cutoff_transition_width': 0.45, 'atom_gaussian_width': 0.23}, {'cutoff': 2.7, 'cutoff_transition_width': 0.68, 'atom_gaussian_width': 0.34}], 6: [{'cutoff': 2.1, 'cutoff_transition_width': 0.53, 'atom_gaussian_width': 0.26}, {'cutoff': 3.2, 'cutoff_transition_width': 0.79, 'atom_gaussian_width': 0.39}], 8: [{'cutoff': 1.7, 'cutoff_transition_width': 0.43, 'atom_gaussian_width': 0.21}, {'cutoff': 2.6, 'cutoff_transition_width': 0.64, 'atom_gaussian_width': 0.32}, {'cutoff': 3.8, 'cutoff_transition_width': 0.96, 'atom_gaussian_width': 0.48}], 45: [{'cutoff': 2.7, 'cutoff_transition_width': 0.68, 'atom_gaussian_width': 0.34}, {'cutoff': 4.1, 'cutoff_transition_width': 1.0, 'atom_gaussian_width': 0.51}, {'cutoff': 6.1, 'cutoff_transition_width': 1.5, 'atom_gaussian_width': 0.76}]}

To implement these double or triple SOAPs on the command line, I used custom Python code to print the corresponding universal SOAP descriptors as follows (this also includes a 2-body potential):

gap_fit default_sigma={0.001 0.05 0 0} do_copy_at_file=F sparse_separate_file=F gp_file=GAP_2b_soap_iter_0.xml e0={H:-13.7212718188775:C:-1030.41241330161:O:-2045.24380597435:Rh:-131618.235069883} core_param_file=/u/hjung/1_ML/1_GAP/glue_baseline.xml core_ip_args={IP Glue} at_file=input_training_data_iter_0.xyz energy_parameter_name=F_el force_parameter_name=forces_dft gap={distance_Nb order=2 cutoff=5.0 delta=0.034 covariance_type=ard_se n_sparse=15 theta_uniform=1.0 sparse_method=uniform add_species=F Z={{1 1}} : distance_Nb order=2 cutoff=5.0 delta=0.034 covariance_type=ard_se n_sparse=15 theta_uniform=1.0 sparse_method=uniform add_species=F Z={{1 8}} : distance_Nb order=2 cutoff=5.0 delta=0.034 covariance_type=ard_se n_sparse=15 theta_uniform=1.0 sparse_method=uniform add_species=F Z={{6 6}} : distance_Nb order=2 cutoff=5.0 delta=0.034 covariance_type=ard_se n_sparse=15 theta_uniform=1.0 sparse_method=uniform add_species=F Z={{6 8}} : distance_Nb order=2 cutoff=5.0 delta=0.034 covariance_type=ard_se n_sparse=15 theta_uniform=1.0 sparse_method=uniform add_species=F Z={{8 45}} : distance_Nb order=2 cutoff=5.0 delta=0.034 covariance_type=ard_se n_sparse=15 theta_uniform=1.0 sparse_method=uniform add_species=F Z={{45 45}} : distance_Nb order=2 cutoff=5.0 delta=0.034 covariance_type=ard_se n_sparse=15 theta_uniform=1.0 sparse_method=uniform add_species=F Z={{6 45}} : distance_Nb order=2 cutoff=5.0 delta=0.034 covariance_type=ard_se n_sparse=15 theta_uniform=1.0 sparse_method=uniform add_species=F Z={{1 45}} : distance_Nb order=2 cutoff=5.0 delta=0.034 covariance_type=ard_se n_sparse=15 theta_uniform=1.0 sparse_method=uniform add_species=F Z={{1 6}} : soap cutoff=1.2 l_max=3 n_max=9 atom_sigma=0.15 cutoff_transition_width=0.3 covariance_type=dot_product delta=0.034 zeta=4 add_species=F n_species=4 Z=1 species_Z={{1 6 8 45}} n_sparse=2000 sparse_method=cur_points : soap cutoff=1.8 l_max=3 n_max=9 atom_sigma=0.23 cutoff_transition_width=0.45 covariance_type=dot_product 
delta=0.034 zeta=4 add_species=F n_species=4 Z=1 species_Z={{1 6 8 45}} n_sparse=2000 sparse_method=cur_points : soap cutoff=2.7 l_max=3 n_max=9 atom_sigma=0.34 cutoff_transition_width=0.68 covariance_type=dot_product delta=0.034 zeta=4 add_species=F n_species=4 Z=1 species_Z={{1 6 8 45}} n_sparse=2000 sparse_method=cur_points : soap cutoff=2.1 l_max=3 n_max=9 atom_sigma=0.26 cutoff_transition_width=0.53 covariance_type=dot_product delta=0.034 zeta=4 add_species=F n_species=4 Z=6 species_Z={{1 6 8 45}} n_sparse=2000 sparse_method=cur_points : soap cutoff=3.2 l_max=3 n_max=9 atom_sigma=0.39 cutoff_transition_width=0.79 covariance_type=dot_product delta=0.034 zeta=4 add_species=F n_species=4 Z=6 species_Z={{1 6 8 45}} n_sparse=2000 sparse_method=cur_points : soap cutoff=1.7 l_max=3 n_max=9 atom_sigma=0.21 cutoff_transition_width=0.43 covariance_type=dot_product delta=0.034 zeta=4 add_species=F n_species=4 Z=8 species_Z={{1 6 8 45}} n_sparse=2000 sparse_method=cur_points : soap cutoff=2.6 l_max=3 n_max=9 atom_sigma=0.32 cutoff_transition_width=0.64 covariance_type=dot_product delta=0.034 zeta=4 add_species=F n_species=4 Z=8 species_Z={{1 6 8 45}} n_sparse=2000 sparse_method=cur_points : soap cutoff=3.8 l_max=3 n_max=9 atom_sigma=0.48 cutoff_transition_width=0.96 covariance_type=dot_product delta=0.034 zeta=4 add_species=F n_species=4 Z=8 species_Z={{1 6 8 45}} n_sparse=2000 sparse_method=cur_points : soap cutoff=2.7 l_max=3 n_max=9 atom_sigma=0.34 cutoff_transition_width=0.68 covariance_type=dot_product delta=0.034 zeta=4 add_species=F n_species=4 Z=45 species_Z={{1 6 8 45}} n_sparse=2000 sparse_method=cur_points : soap cutoff=4.1 l_max=3 n_max=9 atom_sigma=0.51 cutoff_transition_width=1.0 covariance_type=dot_product delta=0.034 zeta=4 add_species=F n_species=4 Z=45 species_Z={{1 6 8 45}} n_sparse=2000 sparse_method=cur_points : soap cutoff=6.1 l_max=3 n_max=9 atom_sigma=0.76 cutoff_transition_width=1.5 covariance_type=dot_product delta=0.034 zeta=4 add_species=F 
n_species=4 Z=45 species_Z={{1 6 8 45}} n_sparse=2000 sparse_method=cur_points}
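For readers who want to reproduce the SOAP part of a command like the one above, here is a minimal sketch of how such descriptor strings could be generated from the hyperparameter dictionary. It is not the poster's actual script: the function name `soap_descriptors` and its default arguments are hypothetical, and the fixed values (`l_max`, `n_max`, `delta`, `zeta`, `n_sparse`) are simply copied from the command in this post.

```python
# Universal SOAP hyperparameters, copied from the dictionary quoted above.
hypers = {
    1: [
        {"cutoff": 1.2, "cutoff_transition_width": 0.3, "atom_gaussian_width": 0.15},
        {"cutoff": 1.8, "cutoff_transition_width": 0.45, "atom_gaussian_width": 0.23},
        {"cutoff": 2.7, "cutoff_transition_width": 0.68, "atom_gaussian_width": 0.34},
    ],
    6: [
        {"cutoff": 2.1, "cutoff_transition_width": 0.53, "atom_gaussian_width": 0.26},
        {"cutoff": 3.2, "cutoff_transition_width": 0.79, "atom_gaussian_width": 0.39},
    ],
    8: [
        {"cutoff": 1.7, "cutoff_transition_width": 0.43, "atom_gaussian_width": 0.21},
        {"cutoff": 2.6, "cutoff_transition_width": 0.64, "atom_gaussian_width": 0.32},
        {"cutoff": 3.8, "cutoff_transition_width": 0.96, "atom_gaussian_width": 0.48},
    ],
    45: [
        {"cutoff": 2.7, "cutoff_transition_width": 0.68, "atom_gaussian_width": 0.34},
        {"cutoff": 4.1, "cutoff_transition_width": 1.0, "atom_gaussian_width": 0.51},
        {"cutoff": 6.1, "cutoff_transition_width": 1.5, "atom_gaussian_width": 0.76},
    ],
}


def soap_descriptors(hypers, l_max=3, n_max=9, delta=0.034, zeta=4, n_sparse=2000):
    """Return one 'soap ...' descriptor string per (element, hyperparameter set).

    Hypothetical helper; the keyword layout mirrors the command in this post.
    """
    species = sorted(hypers)
    # gap_fit expects the species list in double braces, e.g. {{1 6 8 45}}.
    species_field = "{{" + " ".join(str(z) for z in species) + "}}"
    descriptors = []
    for z in species:
        for h in hypers[z]:
            descriptors.append(
                f"soap cutoff={h['cutoff']} l_max={l_max} n_max={n_max} "
                f"atom_sigma={h['atom_gaussian_width']} "
                f"cutoff_transition_width={h['cutoff_transition_width']} "
                f"covariance_type=dot_product delta={delta} zeta={zeta} "
                f"add_species=F n_species={len(species)} Z={z} "
                f"species_Z={species_field} "
                f"n_sparse={n_sparse} sparse_method=cur_points"
            )
    return descriptors


descriptors = soap_descriptors(hypers)
# Descriptors are joined with " : " inside gap={...}; the distance_Nb terms
# from the command above would be prepended to this list in the same way.
gap_arg = "gap={" + " : ".join(descriptors) + "}"
print(len(descriptors), "SOAP descriptors")
print(gap_arg)
```

Each hyperparameter set becomes its own `soap` descriptor, so the argument grows with the number of elements times the number of sets per element.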

However, after running this command, I encounter the following unusual error:

libAtoms::Hello World: 21/01/2022 08:53:47
libAtoms::Hello World: git version https://github.com/libAtoms/QUIP.git,156fe9109-dirty
libAtoms::Hello World: QUIP_ARCH linux_x86_64_gfortran_openmp
libAtoms::Hello World: compiled on Aug 4 2021 at 17:49:35
libAtoms::Hello World: OpenMP parallelisation with 1 threads
libAtoms::Hello World: OMP_STACKSIZE=256M
libAtoms::Hello World: Random Seed = 32027292
libAtoms::Hello World: global verbosity = 0

Calls to system_timer will do nothing by default

SYSTEM ABORT: Traceback (most recent call last)
File "/u/hjung/Softwares/QUIP/src/libAtoms/System.f95", line 1231 kind unspecified
parse_string ran out of space for fields

I assume my command line is too long to be digested by the gap_fit code. Is this correct? My goal is to use universal SOAP correctly for my system, but I'm not sure this is the way to go. I would be grateful if you could suggest how to solve this issue or point me to the correct way of using universal SOAP in this case. Many thanks in advance!!

jungsdao commented 2 years ago

I'm pretty sure that the reason for this error is the length of the command line, because when I arbitrarily reduced the triple SOAP to a double SOAP for one element, it ran without error. However, for hydrogen, oxygen, and rhodium, universal SOAP suggests three hyperparameter sets. When generating the universal SOAP hyperparameters, setting no_extra_inner=True seems to give only two hyperparameter sets for each element. Does it make sense to use this setting?

gabor1 commented 2 years ago

We don't yet have much experience with universal SOAP strings across a diverse array of systems. Looking at your actual string, I would certainly try cutting out the innermost SOAP (the one with the 1.2 cutoff).

bernstei commented 2 years ago

I agree with that recommendation on physics grounds, but it would also be easy to read some (just the descriptors?) or all of the command line from a file to avoid such string length limits. We could, without changing any top-level program, add a "from_file" argument in params_read_args: it would be parsed first, then params_read_file would be called, and then the actual command line args would be parsed.
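The precedence described here (file first, then the command line on top) can be illustrated with a small Python sketch. This is not QUIP code: the function `read_args`, the one-`key=value`-per-line file layout, and the `#` comment convention are all assumptions made for illustration, and only the argument name "from_file" comes from the proposal above.

```python
import os
import tempfile


def read_args(argv):
    """Merge key=value arguments, letting the command line override a file.

    Hypothetical illustration of the proposed 'from_file' behaviour.
    """
    cli = dict(arg.split("=", 1) for arg in argv)
    params = {}
    path = cli.pop("from_file", None)
    if path is not None:
        # Assumed file layout: one key=value pair per line, '#' for comments.
        with open(path) as fh:
            for line in fh:
                line = line.strip()
                if line and not line.startswith("#"):
                    key, value = line.split("=", 1)
                    params[key] = value
    # Arguments given directly on the command line take precedence.
    params.update(cli)
    return params


# Usage: a long descriptor string lives in the file, a short override on the CLI.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
    fh.write("# descriptors kept out of the command line\n")
    fh.write("at_file=input_training_data_iter_0.xyz\n")
    fh.write("gap={distance_Nb order=2 cutoff=5.0}\n")
    args_path = fh.name

merged = read_args([f"from_file={args_path}", "at_file=other.xyz"])
os.remove(args_path)
print(merged)
```

Because the file is read line by line, a long `gap={...}` value never has to pass through the shell or the command-line field splitter.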

jameskermode commented 2 years ago

Looking at the error, I think it's the number of fields, not the string length, that is being exceeded, so the limit could perhaps just be bumped up a bit.

jungsdao commented 2 years ago

Thank you all for the comments :) Setting no_extra_inner=True in the SOAP_hypers function call gives only double SOAPs, eliminating those with the shortest cutoffs.

{1: [{'cutoff': 1.8, 'cutoff_transition_width': 0.45, 'atom_gaussian_width': 0.23}, {'cutoff': 2.7, 'cutoff_transition_width': 0.68, 'atom_gaussian_width': 0.34}], 6: [{'cutoff': 2.1, 'cutoff_transition_width': 0.53, 'atom_gaussian_width': 0.26}, {'cutoff': 3.2, 'cutoff_transition_width': 0.79, 'atom_gaussian_width': 0.39}], 8: [{'cutoff': 2.6, 'cutoff_transition_width': 0.64, 'atom_gaussian_width': 0.32}, {'cutoff': 3.8, 'cutoff_transition_width': 0.96, 'atom_gaussian_width': 0.48}], 45: [{'cutoff': 4.1, 'cutoff_transition_width': 1.0, 'atom_gaussian_width': 0.51}, {'cutoff': 6.1, 'cutoff_transition_width': 1.5, 'atom_gaussian_width': 0.76}]}

I believe this is quite an easy workaround for this problem that doesn't sacrifice the physics.

I'm not sure whether I can write a file with all the keywords for GAP fitting in the current implementation of the code. If the code could be adapted to read and parse an input file, that would be great.

Also, regarding the other comment, what does it mean that I have exceeded the number of fields? I'm not sure how I can avoid such a problem.
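For a dictionary that has already been generated, the same reduction can be mimicked by keeping only the two longest-cutoff sets per element. The sketch below is an assumption-laden illustration, not how no_extra_inner is implemented in universal SOAP: the helper name `drop_extra_inner` is hypothetical, and it is only checked against the hydrogen and carbon entries quoted in this thread.

```python
# Hydrogen and carbon entries from the original (triple/double) suggestion.
hypers = {
    1: [
        {"cutoff": 1.2, "cutoff_transition_width": 0.3, "atom_gaussian_width": 0.15},
        {"cutoff": 1.8, "cutoff_transition_width": 0.45, "atom_gaussian_width": 0.23},
        {"cutoff": 2.7, "cutoff_transition_width": 0.68, "atom_gaussian_width": 0.34},
    ],
    6: [
        {"cutoff": 2.1, "cutoff_transition_width": 0.53, "atom_gaussian_width": 0.26},
        {"cutoff": 3.2, "cutoff_transition_width": 0.79, "atom_gaussian_width": 0.39},
    ],
}


def drop_extra_inner(hypers, keep=2):
    """Keep only the `keep` longest-cutoff SOAP sets for each element.

    Hypothetical post-processing helper; elements that already have
    `keep` or fewer sets are returned unchanged (sorted by cutoff).
    """
    return {
        z: sorted(sets, key=lambda s: s["cutoff"])[-keep:]
        for z, sets in hypers.items()
    }


reduced = drop_extra_inner(hypers)
for z, sets in reduced.items():
    print(z, [s["cutoff"] for s in sets])
```

With the four-element system in this thread, this reduces eleven SOAP descriptors to eight, which shortens the gap_fit argument accordingly.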

jameskermode commented 2 years ago

What I meant was that by looking at this error:

SYSTEM ABORT: Traceback (most recent call last)
File "/u/hjung/Softwares/QUIP/src/libAtoms/System.f95", line 1231 kind unspecified
parse_string ran out of space for fields

you can see exactly where to look in the code: https://github.com/libAtoms/QUIP/blob/public/src/libAtoms/System.f95#L1244

and then with a bit of detective work you could note that the maximum number of fields for parameters is set to 300 here: https://github.com/libAtoms/QUIP/blob/public/src/libAtoms/ParamReader.f95#L56

If you compiled QUIP+GAP yourself, you could increase that value and recompile, but only if you really need to; your proposed solution sounds fine.