QuantumLab-ZY / HamGNN

An E(3) equivariant Graph Neural Network for predicting electronic Hamiltonian matrix
GNU General Public License v3.0

After replacing the demonstration structure graph file with my own structure graph file, 'band_cal' fails to function properly. #6

Open newplay opened 6 months ago

newplay commented 6 months ago

Dear Yang Zhong:

Below is my configuration file of graph_data_gen:

=================================================================
nao_max: 19
graph_data_save_path: '/home/zjlin/ML_work/HamGNN/Bilayer_MoS2_Demo/database/work_dir/dataset/graph/MoS2_21_79/AA/'
read_openmx_path: '/home/zjlin/ML_work/HamGNN/Bilayer_MoS2_Demo/database/openmx/MoS2_21_79/AA'
max_SCF_skip: 200
scfout_paths: '/home/zjlin/ML_work/HamGNN/Bilayer_MoS2_Demo/database/openmx/MoS2_21_79/AA' # Directory containing the .scfout file calculated by openmx/openmx_postprocess, or a wildcard directory name to match multiple directories
dat_file_name: 'openmx_in.dat'
std_file_name: null # None if no openmx computation is performed
scfout_file_name: 'overlap.scfout' # If the openmx self-consistent Hamiltonian is not required as the target, "overlap.scfout" can be used instead.
soc_switch: False # Generate graph_data.npz for SOC (True) or Non-SOC (False) Hamiltonian
=================================================================

For the read_openmx_path, I'm not sure exactly what it means, but I assume it is the path where the 'overlap.scfout' file is generated by openmx.

I observed a difference between this configuration file and the default one:

=================================================================
nao_max: 19
graph_data_save_path: '/data/home/yzhong/EPC_test/Hg/alpha/Training'
read_openmx_path: '/data/home/yzhong/NPJ/HamGNN/utils_openmx/read_openmx'
max_SCF_skip: 200
scfout_paths: '/data/home/yzhong/EPC_test/Hg/alpha/Training/perturb_calculated/alpha-Hg_*' # Directory containing the .scfout file calculated by openmx/openmx_postprocess, or a wildcard directory name to match multiple directories
dat_file_name: 'openmx.dat'
std_file_name: 'openmx.std' # None if no openmx computation is performed
scfout_file_name: 'Hg.scfout' # If the openmx self-consistent Hamiltonian is not required as the target, "overlap.scfout" can be used instead.
soc_switch: False # Generate graph_data.npz for SOC (True) or Non-SOC (False) Hamiltonian
=================================================================

The differences are in read_openmx_path and std_file_name. I set the second one as "null" because I did not perform the standard openmx calculation.

When I use the graph file generated by my own process, band_cal exits immediately without printing anything:

(HamGNN) [zjlin@newmaster band]$ band_cal --config band.yaml 
(HamGNN) [zjlin@newmaster band]$

However, when using the demo graph, the expected output is:

(HamGNN) [zjlin@newmaster band]$ band_cal --config band.yaml 
----- k_path report begin ----------
real-space lattice vectors
 [[ 99.23713   0.        0.     ]
 [-49.61856  85.94187   0.     ]
 [  0.        0.       56.69178]]
k-space metric tensor
 [[1.35391e-04 6.76956e-05 0.00000e+00]
 [6.76956e-05 1.35391e-04 0.00000e+00]
 [0.00000e+00 0.00000e+00 3.11143e-04]]
...

What could be the issue with my process?

Best regards, TzuChing

newplay commented 6 months ago

I also tried setting std_file_name in the configuration, and got the same result.

QuantumLab-ZY commented 6 months ago

Dear TzuChing,

read_openmx_path is the path to the read_openmx executable, which reads the binary .scfout files produced by openmx or openmx_postprocess. graph_data_gen calls read_openmx when it generates graph_data.npz.

std_file_name is the name of the openmx standard output file, which records the information openmx normally prints to the screen during a run, including the energy at each iteration step. graph_data_gen reads the total number of SCF iterations from the .std file to determine whether the SCF calculation has converged.
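As a rough illustration of the points above, the following Python sketch checks a graph_data_gen configuration before running it. It is not part of HamGNN: the function name check_graph_data_gen_config is hypothetical, and the configuration is assumed to be already loaded into a plain dict that uses the same keys as the YAML file. It only tests that read_openmx_path is an executable file and that every directory matched by scfout_paths contains the files named in the configuration.

```python
import glob
import os


def check_graph_data_gen_config(cfg):
    """Return a list of human-readable problems found in a config dict.

    Hypothetical helper, not part of HamGNN. `cfg` uses the same keys as
    the graph_data_gen YAML file shown in this thread.
    """
    problems = []

    # read_openmx_path should be the read_openmx executable, not a data directory.
    exe = cfg.get("read_openmx_path")
    if not exe or not os.path.isfile(exe):
        problems.append(f"read_openmx_path is not a file: {exe!r}")
    elif not os.access(exe, os.X_OK):
        problems.append(f"read_openmx_path is not executable: {exe!r}")

    # scfout_paths may be a single directory or a wildcard pattern.
    pattern = cfg.get("scfout_paths") or ""
    calc_dirs = sorted(d for d in glob.glob(pattern) if os.path.isdir(d))
    if not calc_dirs:
        problems.append(f"scfout_paths matches no directory: {pattern!r}")

    # The input and .scfout files are always needed; the .std file only if named.
    required = [cfg.get("dat_file_name"), cfg.get("scfout_file_name")]
    if cfg.get("std_file_name"):
        required.append(cfg["std_file_name"])
    for d in calc_dirs:
        for name in required:
            if not name or not os.path.isfile(os.path.join(d, name)):
                problems.append(f"missing {name!r} in {d}")

    return problems
```

With the configuration from the first post, where read_openmx_path points at the calculation directory, this check would report that read_openmx_path is not a file. An empty list means that only these basic path checks passed; it says nothing about the content of the files.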

Best wishes, Yang Zhong

newplay commented 6 months ago

Dear Yang Zhong,

Thank you for your prompt response. After further investigation, I traced the failure to my use of an OpenMX input file in a different format. The precision of the floating-point numbers in Atoms.UnitVectors may be a contributing factor, although I am not entirely certain; I suspect it because printing the lattice matrix gave no output. I identified the problem by reviewing the source code of your scripts, particularly graph_data_gen.py and util.py. To fix it, I switched to the input format produced by the poscar2openmx.py tool.

As a result, I now get an output, although an incorrect one. I plan to open another issue to discuss that problem in detail.
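The empty lattice matrix described above can be checked independently of HamGNN with a small sketch like the one below. It is not the parser used in graph_data_gen.py or util.py, and the function name read_unit_vectors is hypothetical. It extracts the `<Atoms.UnitVectors ... Atoms.UnitVectors>` block from the text of an OpenMX input file with a deliberately tolerant number pattern, and returns None when no complete 3x3 block is found.

```python
import re

# Integers, decimals, and exponents written with e/E or Fortran-style d/D.
_NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eEdD][-+]?\d+)?"


def read_unit_vectors(dat_text):
    """Return the lattice as three rows of three floats, or None if not found.

    Hypothetical diagnostic helper, not HamGNN's own parser.
    """
    block = re.search(
        r"<Atoms\.UnitVectors\s*\n(.*?)\n\s*Atoms\.UnitVectors>",
        dat_text,
        flags=re.DOTALL | re.IGNORECASE,  # tolerate case differences
    )
    if block is None:
        return None

    rows = []
    for line in block.group(1).splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        values = re.findall(_NUMBER, line)
        if len(values) == 3:
            rows.append([float(v.replace("d", "e").replace("D", "e")) for v in values])

    # Anything other than exactly three vectors is treated as a parse failure.
    return rows if len(rows) == 3 else None
```

If this returns a matrix for an input file while the HamGNN tools print nothing for the same file, the difference probably lies in how strictly the number format or layout is matched, which would be consistent with the suspicion above but does not confirm it.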

Best regards, Tzuching