QuantumLab-ZY / HamGNN

An E(3) equivariant Graph Neural Network for predicting electronic Hamiltonian matrix
GNU General Public License v3.0

Regarding the issue of mismatched shapes encountered during the processing of loading the Hamiltonian matrix. #5

Open newplay opened 11 months ago

newplay commented 11 months ago

Dear Developers,

While running the 'twist bilayer MoS2' demo, I encountered an error during the 'band_cal' step. Below are my '.yaml' files for the test and band steps:

dataset_params:
  batch_size: 1
  split_file: null
  test_ratio: 0.1
  train_ratio: 0.8
  val_ratio: 0.1
  graph_data_path: ./Examples/Moire_twisted_bilayer_MoS2 # Directory where graph_data.npz is located
#......
profiler_params:
  progress_bar_refresh_rat: 1
  train_dir: ./Example/result #The folder for saving training information and prediction results. This directory can be read by tensorboard to monitor the training process.
#......
setup:
  GNN_Net: HamGNN_pre
  accelerator: null
  ignore_warnings: true
  checkpoint_path: ./network_weights_bilayer_MoS2.ckpt # Path to the model weights file
  load_from_checkpoint: false
  resume: false
  num_gpus: null # null: use cpu; [i]: use the ith GPU device
  precision: 32
  property: hamiltonian
  stage: test # fit: training; test: inference

and band:

nao_max: 14
graph_data_path: '/home/zjlin/work/ML_work/HamGNN/tw_MoS2/bilayer_MoS2/Examples/Moire_twisted_bilayer_MoS2/graph_data.npz'
hamiltonian_path: '/home/zjlin/work/ML_work/HamGNN/tw_MoS2/bilayer_MoS2/Examples/Moire_twisted_bilayer_MoS2/version_0/target_hamiltonian.npy'
nk: 60          # the number of k points
save_dir: '/home/zjlin/work/ML_work/HamGNN/tw_MoS2/bilayer_MoS2/Example/result/target' # The directory to save the results
strcture_name: 'Si'  # The name of each cif file saved is strcture_name_idx.cif after band calculation
soc_switch: False
auto_mode: True # If the auto_mode is used, users can omit providing k_path and label, as the program will automatically generate them based on the crystal symmetry.
k_path: [[0.,0.,-0.5],[0.,0.,0.0],[0.,0.,0.5]]
label: ['$Mbar$','$G$','$M$'] # The lable for each k points in K_path

Here is the error message:

band_cal --config band.yaml 
Traceback (most recent call last):
  File "/home/zjlin/work/anaconda3/envs/HamGNN/bin/band_cal", line 33, in <module>
    sys.exit(load_entry_point('HamGNN==0.1.0', 'console_scripts', 'band_cal')())
  File "/home/zjlin/work/anaconda3/envs/HamGNN/lib/python3.9/site-packages/HamGNN-0.1.0-py3.9.egg/utils_openmx/band_cal.py", line 248, in main
    H = np.load(hamiltonian_path).reshape(-1, nao_max, nao_max)
ValueError: cannot reshape array of size 42642764 into shape (14,14)

Could you please clarify the meaning of nao_max in the code?

By the way, my partner and I noticed that the twist angle in the demo you open-sourced should be $3.48^\circ$ instead of $5.09^\circ$. Thank you. Best regards, Tzuching

QuantumLab-ZY commented 11 months ago

Dear Tzuching,

Thanks for the reminder. The angle in the demo should be $3.48^\circ$. Regarding your question about the nao_max parameter: nao_max stands for the maximum number of atomic orbitals (AOs) per atom, that is, the size of the largest atomic-orbital basis set among the elements in the system. For MoS2, nao_max should be 19 because the basis set for Mo is Mo-s3p2d2, which consists of 19 orbitals. A detailed explanation follows.

The real-space Hamiltonian matrix fitted by HamGNN is expressed in a basis of numerical atomic orbitals. The number and type of atomic orbitals required in a calculation vary with the element. For example, in OpenMX the standard basis set for a hydrogen atom includes only two s-type orbitals and one p-type orbital, and the p-type orbital has three suborbitals (px, py, pz). A hydrogen atom, represented as H-s2p1, therefore has only five orbitals in total. Similarly, the standard basis set for a carbon atom is C-s2p2d1, with two s-type orbitals, two p-type orbitals, and one d-type orbital, which gives 13 orbitals in total. Sodium (Na) has the standard basis set Na-s3p2d1 and requires 14 atomic orbitals. In OpenMX, short-period elements typically require at most 14 atomic orbitals in their standard basis set; however, some transition-metal elements may require up to 19 (e.g., Mo-s3p2d2). The atomic orbitals in OpenMX are generally arranged in the following order:

s1, s2, s3, px1, py1, pz1, px2, py2, pz2, d3z^2-r^2, dx^2-y^2, dxy, dxz, dyz, d3z^2-r^2, dx^2-y^2, dxy, dxz, dyz
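The orbital counts quoted above follow from the shell multiplicities (1 for s, 3 for p, 5 for d). A minimal sketch of that arithmetic is given here; `count_orbitals` is a hypothetical helper written for illustration, not a function from HamGNN or OpenMX.

```python
import re

# Number of sub-orbitals per shell type (s: 1, p: 3, d: 5, f: 7).
MULTIPLICITY = {"s": 1, "p": 3, "d": 5, "f": 7}


def count_orbitals(basis_label: str) -> int:
    """Count atomic orbitals in an OpenMX-style label such as 'Mo-s3p2d2'.

    Hypothetical helper for illustration only.
    """
    # Keep only the part after the element symbol, e.g. 's3p2d2'.
    shells = basis_label.split("-", 1)[1]
    # Each (letter, digit) pair contributes multiplicity * number of shells.
    return sum(
        MULTIPLICITY[shell] * int(count)
        for shell, count in re.findall(r"([spdf])(\d+)", shells)
    )


for label in ["H-s2p1", "C-s2p2d1", "Na-s3p2d1", "Mo-s3p2d2"]:
    print(label, count_orbitals(label))
# H-s2p1 5
# C-s2p2d1 13
# Na-s3p2d1 14
# Mo-s3p2d2 19
```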

basis_def uses this ordering to define the basis set for each element. For example, H corresponds to [0,1,3,4,5], which means s1, s2, px1, py1, pz1. The basis_def list [0,1,3,4,5,6,7,8,9,10,11,12,13] for carbon means that its basis set includes s1, s2, px1, py1, pz1, px2, py2, pz2, d3z^2-r^2, dx^2-y^2, dxy, dxz, dyz. The same applies to other elements. HamGNN always pads the basis set of every atom in a system with zeros to a common length of 14 or 19. This ensures that the Hamiltonian matrix block for any atom pair ij has shape 14 x 14 or 19 x 19. This common length is the parameter nao_max (maximum number of atomic orbitals) in HamGNN. The basis sets for each element in OpenMX can be found on pages 55 and 56 of the OpenMX manual.
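The zero-padding described above can be sketched as follows. This is only an illustration under stated assumptions: the `basis_def` dictionary holds just the two index lists quoted in this reply, and `pad_block` is a hypothetical helper, not HamGNN's actual implementation.

```python
import numpy as np

NAO_MAX = 14  # common padded length for this example

# Index lists quoted in the reply above (positions in the 14-orbital ordering).
basis_def = {
    "H": [0, 1, 3, 4, 5],
    "C": [0, 1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13],
}


def pad_block(block, elem_i, elem_j, nao_max=NAO_MAX):
    """Embed an (n_i, n_j) pair block into a zero-filled (nao_max, nao_max) block.

    Hypothetical helper for illustration only.
    """
    rows, cols = basis_def[elem_i], basis_def[elem_j]
    padded = np.zeros((nao_max, nao_max), dtype=block.dtype)
    # np.ix_ builds the row/column index grid for the real orbitals.
    padded[np.ix_(rows, cols)] = block
    return padded


# A made-up 13 x 5 block for a C-H pair, filled with nonzero values 1..65.
h_ch = np.arange(1, 13 * 5 + 1, dtype=float).reshape(13, 5)
padded = pad_block(h_ch, "C", "H")
print(padded.shape)              # (14, 14)
print(np.count_nonzero(padded))  # 65
```

Rows and columns that do not belong to an element's basis (for example index 2, s3, for both C and H) stay zero, so every pair block has the same shape regardless of the elements involved.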

Best regards, Yang Zhong

openmx3.9_basis.pdf
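As a quick check of the suggested value against the traceback in the question, the array size reported there can be divided by the candidate block sizes. This sketch uses only the number from the error message and plain integer arithmetic; it does not load the actual file.

```python
size = 42642764  # array size reported in the ValueError above

# For each candidate nao_max, compute (number of blocks, leftover elements).
results = {nao_max: divmod(size, nao_max * nao_max) for nao_max in (14, 19)}

for nao_max, (blocks, remainder) in results.items():
    print(nao_max, blocks, remainder)
# 14 217565 24
# 19 118124 0
```

The size is not a multiple of 14 x 14, which is why the reshape fails, while it divides evenly into 118124 blocks of 19 x 19.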

newplay commented 11 months ago

Dear Yang Zhong, Thank you for your response. I have understood and resolved the issue. However, this approach seems to consume a significant amount of memory space, especially for small-angle twisted 2D materials. Perhaps I will try to optimize this issue. Best regards, TzuChing