mzjb / DeepH-pack

Deep neural networks for density functional theory Hamiltonian.
GNU Lesser General Public License v3.0

Failed to preprocess #36

Closed: wangchengDon closed this issue 1 year ago

wangchengDon commented 1 year ago

To the respected developers,

When I run step 2 (preprocess) on TBB, I get the error "Failed to preprocess". While searching the source code for the cause, I found this piece of code at line 89 of preprocess.py:

if capture_output.returncode != 0:
    with open(os.path.join(os.path.abspath(relpath), 'error.log'), 'w') as f:
        f.write(f'[stdout of cmd "{cmd}"]:\n\n{capture_output.stdout}\n\n\n'
                f'[stderr of cmd "{cmd}"]:\n\n{capture_output.stderr}')
    print(f'\nFailed to preprocess: {abspath}, '
          f'log file was saved to {os.path.join(os.path.abspath(relpath), "error.log")}')

My preprocess.ini is:

[basic]
raw_dir = /public/wcd/twisted/example/work_dir/dataset/raw
processed_dir = /public/wcd/twisted/example/work_dir/dataset/processed
target = hamiltonian
interface = openmx
multiprocessing = 0
local_coordinate = True
get_S = False

[interpreter]
python_interpreter = /public/apps/miniconda3/bin/python3.9
julia_interpreter = /public/apps/julia-1.5.4/bin/juila

[graph]
radius = -1.0
create_from_DFT = True

I want to ask how I can best solve this problem.

Best regards.

mzjb commented 1 year ago

Hi,

Could you provide the output of deeph-preprocess? This should include the specific path to the error.log file, which contains information about the error.

Alternatively, you could try running the following command to see the contents of all error.log files:

cat /public/wcd/twisted/example/work_dir/dataset/processed/*/error.log

Once you've located the error.log file, please copy and paste its contents here so we can see what went wrong.

wangchengDon commented 1 year ago

I tried versions 1.5.4 and 1.6.6, but they did not run successfully. I checked that the file location for 1.5.4 was correct and verified it with the pwd command. What should I do to solve this error?

The error for 1.5.4 (error(1.5.4).log) is:

[stdout of cmd "/public/apps/julia-1.5.4/bin/juila /public/apps/miniconda3/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl --input_dir /public/wcd/twisted/example/work_dir/dataset/raw/0 --output_dir /public/wcd/twisted/example/work_dir/dataset/processed/0 --save_overlap false"]:

[stderr of cmd "/public/apps/julia-1.5.4/bin/juila /public/apps/miniconda3/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl --input_dir /public/wcd/twisted/example/work_dir/dataset/raw/0 --output_dir /public/wcd/twisted/example/work_dir/dataset/processed/0 --save_overlap false"]:

/bin/sh: /public/apps/julia-1.5.4/bin/juila: No such file or directory

The error for 1.6.6 is:

[stdout of cmd "/public/apps/julia-1.5.4/bin/julia /public/apps/miniconda3/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl --input_dir /public/wcd/twisted/example/work_dir/dataset/raw/0 --output_dir /public/wcd/twisted/example/work_dir/dataset/processed/0 --save_overlap false"]:

[stderr of cmd "/public/apps/julia-1.5.4/bin/julia /public/apps/miniconda3/lib/python3.9/site-packages/deeph/preprocess/openmx_get_data.jl --input_dir /public/wcd/twisted/example/work_dir/dataset/raw/0 --output_dir /public/wcd/twisted/example/work_dir/dataset/processed/0 --save_overlap false"]:

ERROR: LoadError: ArgumentError: Package StaticArrays not found in current path:

mzjb commented 1 year ago

It appears that you do not have the required Julia packages installed. Run

/public/apps/julia-1.5.4/bin/julia

and then execute the following commands:

import Pkg
Pkg.add("StaticArrays")
Pkg.add("Arpack")
Pkg.add("HDF5")
Pkg.add("ArgParse")
Pkg.add("JLD")
Pkg.add("JSON")
Pkg.add("IterativeSolvers")
Pkg.add("DelimitedFiles")
Pkg.add("LinearMaps")

See details in https://github.com/mzjb/DeepH-pack#julia
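
If it helps, the same packages can also be installed in a single call; this is just a sketch using Julia's standard Pkg API, equivalent to the individual Pkg.add calls above:

import Pkg
# Install all packages needed by the preprocess scripts in one go.
Pkg.add(["StaticArrays", "Arpack", "HDF5", "ArgParse", "JLD",
         "JSON", "IterativeSolvers", "DelimitedFiles", "LinearMaps"])

Afterwards, running /public/apps/julia-1.5.4/bin/julia -e 'using StaticArrays' should exit without an error if the package is visible to the interpreter that deeph-preprocess invokes.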

wangchengDon commented 1 year ago

Thank you very much for your answer. After installing the packages you mentioned, the program seems to run normally. However, I noticed that the list you gave is missing Pardiso.jl. I simply ran a plain add for Pardiso.jl and the program runs normally, but I did not follow its installation instructions. May I ask what impact this will have?

mzjb commented 1 year ago

@wangchengDon

Pardiso.jl is used for solving sparse linear equations; DeepH-pack uses it to compute a small number of eigenvalues of large sparse matrices. It is therefore not required for deeph-preprocess. It is only used when running deeph-inference with dense_calc set to False, which involves computing a small number of eigensolutions of a large sparse matrix.
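
To illustrate, whether Pardiso.jl is needed comes down to a single option in the inference configuration; a hypothetical fragment (only the key name dense_calc comes from the explanation above, its exact placement should be checked against the example inference config in the repository):

; dense_calc = True  -> dense diagonalization, Pardiso.jl is not needed
; dense_calc = False -> compute a few eigenvalues of the large sparse matrix, requires Pardiso.jl
dense_calc = True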

wangchengDon commented 1 year ago

Thank you! I am very grateful for your kind answer

wangchengDon commented 1 year ago

Hello dear developer, I have a question about training. How should I set the parameters to run the calculation in parallel? After setting device = cpu and num_threads = 48, only one core was used in the calculation, rather than one node with forty-eight cores.

mzjb commented 1 year ago

@wangchengDon

The parameter num_threads only affects the OpenMP threads used by PyTorch, and increasing this value may not lead to a significant improvement in efficiency. Therefore, I suggest not setting the thread count too high.
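
As a plain-PyTorch sketch (independent of DeepH-pack), this is the kind of setting it maps to, and how you can check how many threads are actually in effect:

import torch

# Intra-op parallelism: OpenMP/MKL threads used inside individual operators.
torch.set_num_threads(48)
print("intra-op threads:", torch.get_num_threads())

# Inter-op parallelism: parallelism across independent operators.
print("inter-op threads:", torch.get_num_interop_threads())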

The parallel training and inference for DeepH are still in development and will be released soon.

wangchengDon commented 1 year ago

Thank you very much for your patient and meticulous answer