Xiaoxun-Gong / DeepH-E3

MIT License

The wrong result at eval term in "Bilayer Graphene demo". #7

Open newplay opened 11 months ago

newplay commented 11 months ago

Dear developers,

I encountered an issue while using the DeepH-E3 model. I am using the dataset and config settings from the demo and have confirmed that there are no differences (except for file locations). The training process went smoothly, and here are my final results:

Epoch #2499   | Time: 04d 04h 37m  | LR: 1.88e-04  | Epoch time: 142.84  | Train loss: 4.94e-07  | Val loss: 4.95e-07
Target 000 has the maximum loss of 1.31e-06; Target 004 has the minimum loss of 2.50e-07

Training finished.

------- Testing network on the test set -------
Using the best model at epoch 2497 with a val_loss of 4.81407614792638e-07
Testing...
Test finished, taking 13.97 seconds.
Test loss: 4.9187e-07
Test results saved to "test_result.h5". They can be analyzed using "deephe3-analyze.py".
Test report written to: "test_report.txt".

The results indicate no overfitting, and the performance seems promising. However, when predicting the Hamiltonian (python deephe3-eval.py eval.ini), the results differ significantly from the openmx_DFT calculations.

Example 1: Below is an example involving the calculation of rotated graphene at 21.79 degrees:

[image]

As seen, while there is some overlap, the E3 model's predictions exhibit ghost bands.

Example 2: Next is another example, again involving rotated graphene, this time at 13.17 degrees:

[image]

This is entirely incorrect; the Dirac cone has even vanished!

In another issue, I noticed your response mentioning that the inference function from deeph-pack can perform predictions more efficiently, so I used the inference.ini from deeph-pack. Here is the content of my inference.ini file:

[basic]
dense_calc = False
OLP_dir = /home/zjlin/tbg_Deep_ex/work_dir/olp/TBG_13.17/olp
work_dir = /home/zjlin/tbg_Deep_ex/work_dir/Deep-E3/TBG_13.17/inference/
structure_file_name = POSCAR
interface = openmx
task = [5]
sparse_calc_config = /home/zjlin/tbg_Deep_ex/work_dir/Deep-E3/TBG_13.17/inference/band.json
trained_model_dir = /home/zjlin/DeepH-E3_model/TBG_pytorch_191_cu111/
restore_blocks_py = False
[interpreter]
julia_interpreter = /home/zjlin/julia-1.8.5/bin/julia

[graph]
radius = 7.2
create_from_DFT = True

Do you have any suggestions about my execution? I would greatly appreciate it. Thank you for your assistance!

TzuChing

newplay commented 11 months ago

By the way, I'm aware that the difference in Example 1 comes from differing Fermi levels. However, even after aligning the Dirac point, the predicted bands are still not entirely accurate.

newplay commented 11 months ago

Based on my testing, I found that changing the coordinate format in openmx_in.dat to fractional coordinates for the overlap matrix calculation (olp stage) produces the correct results.
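
For anyone who needs to do the same conversion by hand, here is a minimal sketch of turning Cartesian positions into fractional ones. It is not part of DeepH-E3 or OpenMX; the function names, the lattice constant, and the example atom are hypothetical, and the lattice is assumed to be given as three row vectors in the same length unit as the positions.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def cart_to_frac(cart, lattice):
    """Solve cart = f1*a1 + f2*a2 + f3*a3 for (f1, f2, f3) via Cramer's rule.

    `lattice` is a sequence of the three lattice vectors (a1, a2, a3).
    """
    a1, a2, a3 = lattice
    volume = dot(a1, cross(a2, a3))
    if abs(volume) < 1e-12:
        raise ValueError("lattice vectors are (nearly) coplanar")
    return (dot(cart, cross(a2, a3)) / volume,
            dot(cart, cross(a3, a1)) / volume,
            dot(cart, cross(a1, a2)) / volume)


# Hypothetical hexagonal cell (values chosen for illustration only).
a = 2.46
lattice = [(a, 0.0, 0.0),
           (-a / 2, a * math.sqrt(3) / 2, 0.0),
           (0.0, 0.0, 20.0)]
cart = (0.0, a / math.sqrt(3), 10.0)

# The result is close to (1/3, 2/3, 1/2) for this cell.
print(cart_to_frac(cart, lattice))
```

The result is not automatically inside [0, 1); an atom outside the cell gives components below 0 or at or above 1.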

Xiaoxun-Gong commented 11 months ago

Hi, are the fractional coordinates of the atoms in your systems within the interval [0, 1)? If not, this might be the problem. All the fractional coordinates output by OpenMX are converted to lie within [0, 1), and these are what DeepH-E3 reads. But the internal fractional coordinates that OpenMX uses in its calculations (e.g., when calculating overlap matrices) are not always converted to that range. More specifically, when the input is in Cartesian coordinates, they are not converted, but when the input is in fractional coordinates, they are. So, in the Cartesian case, the overlap matrices read by DeepH-E3 do not match the coordinates read by DeepH-E3, and thus the results are completely wrong.

In short, make sure all the fractional coordinates are within [0, 1).
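
As a rough illustration of that check, the sketch below finds atoms whose fractional coordinates fall outside [0, 1) and wraps them back into the cell. The helper names and the sample coordinates are hypothetical, not part of DeepH-E3 or OpenMX, and the wrapping would need to be applied to the structure before it is passed to OpenMX for the overlap calculation.

```python
import math


def wrap_fractional(x):
    """Map one fractional coordinate into [0, 1)."""
    wrapped = x - math.floor(x)
    # A tiny negative x (e.g. -1e-17) rounds to exactly 1.0 here, so fold it to 0.
    return 0.0 if wrapped >= 1.0 else wrapped


def out_of_range(frac_coords):
    """Return the indices of atoms with any component outside [0, 1)."""
    return [i for i, atom in enumerate(frac_coords)
            if any(not (0.0 <= c < 1.0) for c in atom)]


def wrap_structure(frac_coords):
    """Wrap every atom of a structure into the [0, 1) cell."""
    return [tuple(wrap_fractional(c) for c in atom) for atom in frac_coords]


# Made-up coordinates for illustration only.
coords = [(-0.25, 0.5, 1.5), (0.0, 1.0, 0.999), (0.1, 0.2, 0.3)]

print(out_of_range(coords))                   # atoms that need wrapping
print(out_of_range(wrap_structure(coords)))   # empty once everything is wrapped
```

Wrapping only shifts atoms by whole lattice vectors, so the periodic structure itself is unchanged.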