block-hczhai / block2-preview

Efficient parallel quantum chemistry DMRG in MPO formalism
GNU General Public License v3.0

Segmentation fault with DMRG-SC-NEVPT2 #90

Closed: alarese closed this issue 6 months ago

alarese commented 6 months ago

Hi there, I am getting a segmentation fault with DMRG-SC-NEVPT2 when using the compress_approx approach, similar to issue #14. I get the following error message:

/bin/sh: line 1: 3522736 Segmentation fault      (core dumped) /storage/home/all6130/my_pyscf_env/bin/block2main /scratch/all6130/NEVPT2_test/DMRG/nevpt2_0/dmrg.conf > /scratch/all6130/NEVPT2_test/DMRG/nevpt2_0/dmrg.out
Traceback (most recent call last):
  File "/storage/home/all6130/my_pyscf_env/lib/python3.11/site-packages/pyscf/dmrgscf/nevpt_mpi.py", line 509, in <module>
    nevpt_integral_mpi(sys.argv[1],sys.argv[2],sys.argv[3],sys.argv[4])
  File "/storage/home/all6130/my_pyscf_env/lib/python3.11/site-packages/pyscf/dmrgscf/nevpt_mpi.py", line 306, in nevpt_integral_mpi
    f = open(os.path.join(nevpt_scratch_mpi, 'node0', 'Va_%d'%root), 'r')
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/scratch/all6130/NEVPT2_test/DMRG/nevpt2_0/node0/Va_0'
ERROR:  /storage/home/all6130/my_pyscf_env/lib/python3.11/site-packages/pyscf/dmrgscf/nevpt_mpi.py /scratch/all6130/NEVPT2_test/DMRG/nevpt_perturb_integral /storage/home/all6130/my_pyscf_env/bin/block2main /scratch/all6130/NEVPT2_test/DMRG /scratch/all6130/NEVPT2_test/DMRG
Traceback (most recent call last):
  File "/scratch/all6130/NEVPT2_test/DMRG/DMRG.py", line 53, in <module>
    nevpt_e1 = mrpt.NEVPT(mycas,root=0).compress_approx(maxM=3000).kernel()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/storage/home/all6130/my_pyscf_env/lib/python3.11/site-packages/pyscf/mrpt/nevpt2.py", line 785, in kernel
    perturb_file = DMRG_COMPRESS_NEVPT(self, maxM=self.maxM, root=self.root,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/storage/home/all6130/my_pyscf_env/lib/python3.11/site-packages/pyscf/dmrgscf/nevpt_mpi.py", line 250, in DMRG_COMPRESS_NEVPT
    raise err
  File "/storage/home/all6130/my_pyscf_env/lib/python3.11/site-packages/pyscf/dmrgscf/nevpt_mpi.py", line 247, in DMRG_COMPRESS_NEVPT
    subprocess.check_call(cmd, shell=True)
  File "/storage/home/all6130/my_pyscf_env/lib/python3.11/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command ' /storage/home/all6130/my_pyscf_env/lib/python3.11/site-packages/pyscf/dmrgscf/nevpt_mpi.py /scratch/all6130/NEVPT2_test/DMRG/nevpt_perturb_integral /storage/home/all6130/my_pyscf_env/bin/block2main /scratch/all6130/NEVPT2_test/DMRG /scratch/all6130/NEVPT2_test/DMRG' returned non-zero exit status 1.

Here is the DMRG section of my Python file:

dmrgscf.settings.BLOCKEXE = '/storage/home/all6130/my_pyscf_env/bin/block2main'
dmrgscf.settings.BLOCKEXE_COMPRESS_NEVPT = '/storage/home/all6130/my_pyscf_env/bin/block2main'
dmrgscf.settings.MPIPREFIX = ''

mycas = mcscf.CASCI(mf, norb, nelec)
mycas.fcisolver = dmrgscf.DMRGCI(mf.mol, maxM=2000, tol=1e-8)

mycas.fcisolver.runtimeDir = '/scratch/all6130/NEVPT2_test/DMRG'
mycas.fcisolver.scratchDirectory = '/scratch/all6130/NEVPT2_test/DMRG'
mycas.fcisolver.spin = 10
mycas.fcisolver.nroots = 1
mycas.fcisolver.threads = 40
mycas.fcisolver.conv_tol = 1e-8
mycas.fcisolver.memory = 900
mycas.verbose = 5
mycas.natorb = True
mycas.kernel(orbs)

nevpt_e1 = mrpt.NEVPT(mycas,root=0).compress_approx(maxM=3000).kernel()

When I run /storage/home/all6130/my_pyscf_env/bin/block2main /scratch/all6130/NEVPT2_test/DMRG/nevpt2_0/dmrg.conf > /scratch/all6130/NEVPT2_test/DMRG/nevpt2_0/dmrg.out by itself, the program does start (unlike in #14), but it ultimately stops during an 'Environment initialization' step with no error message. I can run DMRG-SC-NEVPT2 successfully for a small test system, but not for a larger one.
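The `/bin/sh` message above shows that `block2main` was killed by a segmentation fault, and the later `FileNotFoundError` for `Va_0` is presumably only a consequence of that crash. Below is a minimal sketch, assuming a POSIX system, of how the same command could be run from Python so that a signal-induced crash is distinguished from an ordinary non-zero exit. The helper names `describe_exit` and `run_and_describe` are hypothetical and are not part of PySCF or block2.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable description."""
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as return code -N.
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode


def run_and_describe(cmd, stdout_path):
    """Run cmd (a list of arguments) with stdout redirected to stdout_path."""
    with open(stdout_path, "w") as out:
        proc = subprocess.run(cmd, stdout=out)
    return describe_exit(proc.returncode)


# Example call with the paths quoted in this thread (not executed here):
# run_and_describe(
#     ["/storage/home/all6130/my_pyscf_env/bin/block2main",
#      "/scratch/all6130/NEVPT2_test/DMRG/nevpt2_0/dmrg.conf"],
#     "/scratch/all6130/NEVPT2_test/DMRG/nevpt2_0/dmrg.out",
# )
```

A result of `killed by SIGSEGV` would confirm that the executable itself crashed, rather than the PySCF wrapper script failing first.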

hczhai commented 6 months ago

Thanks for your interest in using the block2 package.

As you said, the problem occurs only for the larger system, but nothing in your script shows the specific system size. Please post the complete input script and the complete contents of /scratch/all6130/NEVPT2_test/DMRG/nevpt2_0/dmrg.out so that I have enough context to analyze the problem.

alarese commented 6 months ago

Sure, sorry about that. Here is the input script:

#!/usr/bin/env python
import numpy
from pyscf import gto, scf, dft, mcscf, mrpt, dmrgscf
from pyscf.mcscf import avas
import mpi4py

mol=gto.Mole()
mol.atom = 'test.xyz'
mol.basis = 'def2-TZVP'
mol.verbose = 4
mol.spin = 10
mol.charge = -2
mol.build()

mf = scf.ROKS(mol)
mf.max_cycle = 500
mf.conv_tol = 1e-8
mf.xc = 'BP86'
# mf.chkfile = 'test.chk'
# mf = scf.newton(mf)
# mf.kernel()
chkfile = 'test.chk'
mol = scf.chkfile.load_mol(chkfile)
mf.__dict__.update(scf.chkfile.load(chkfile, 'scf'))

ao_labels = ['Fe 3d', 'S:1 3p']
norb, nelec, orbs = avas.avas(mf, ao_labels)

dmrgscf.settings.BLOCKEXE = '/storage/home/all6130/my_pyscf_env/bin/block2main'
dmrgscf.settings.BLOCKEXE_COMPRESS_NEVPT = '/storage/home/all6130/my_pyscf_env/bin/block2main'
dmrgscf.settings.MPIPREFIX = ''

mycas = mcscf.CASCI(mf, norb, nelec)
mycas.fcisolver = dmrgscf.DMRGCI(mf.mol, maxM=2000, tol=1e-8)
mycas.fcisolver.runtimeDir = '/scratch/all6130/NEVPT2_test/'
mycas.fcisolver.scratchDirectory = '/scratch/all6130/NEVPT2_test/'
mycas.fcisolver.spin = 10
mycas.fcisolver.nroots = 1
mycas.fcisolver.threads = 40
mycas.fcisolver.conv_tol = 1e-8
mycas.fcisolver.memory = 500
mycas.verbose = 5
mycas.natorb = True
mycas.kernel(orbs)

nevpt_e1 = mrpt.NEVPT(mycas,root=0).compress_approx(maxM=3000).kernel()

The structure in test.xyz is taken from Supplementary Table 1 of https://doi.org/10.1038/nchem.2041.

Here is the output of dmrg.out from nevpt2_0:


********************************** INPUT START **********************************
nelec                                                           22
spin                                                            10
twodot_to_onedot                                                 4
orbitals                                                   FCIDUMP
maxiter                                                          6
sweep_tol                                               1.0000e-07
outputlevel                                                      2
hf_occ                                                    integral
num_thrds                                                       40
fullrestart                                                       
nevpt_state_num                                                  0
restart_mps_nevpt                                        16 82 410
mem                                                          500 g
schedule                  Sweep   0-   3 : Mmps =  3000 Noise =    0.0001 DavTol =    0.0001
                          Sweep   4-   5 : Mmps =  3000 Noise =         0 DavTol =     1e-07
irrep                                                            1
********************************** INPUT END   **********************************

SPIN ADAPTED - REAL DOMAIN - DOUBLE PREC
qc mpo type =  QCTypes.Conventional
 UseMainStack = 0 MinDiskUsage = 1 MinMemUsage = 0 IBuf = 0 OBuf = 0
 FPCompression: prec = 1.00e-16 chunk = 1024
 IMain = 0 B / 32.6 GB DMain = 0 B / 168 GB ISeco = 0 B / 14.0 GB DSeco = 0 B / 251 GB
 OpenMP = 1 TBB = 0 BLIS = 0 MKL = GNU 2021.0.4 SeqType = Tasked MKLIntLen = 4
 THREADING = 2 layers : Global | Operator BatchedGEMM 
 NUMBER : Global = 40 Operator = 40 Quanta = 0 MKL = 1
 COMPLEX = 1 SINGLE-PREC = 1 KSYMM = 0
loading reorder for restarting =  [12 13 14 15 11 10  7  8  4  3  5  0  6  9  1  2]
read integral finished 3.863091773004271
integral sym error =            0
MinMPOMemUsage =  False
full restart
MPS =  KRRRRRRRRRRRRRRR 0 2 < N=22 S=5 PG=0 >
GS INIT MPS BOND DIMS =       1     2     4     8    16    32    62   108   135   149   121    68    36    17     8     4     1
pre-mpo memory usage =  206 MB
build mpo start ...
build mpo finished ... Tread = 0.000 Twrite = 0.000 T = 0.036
simpl mpo start ...
simpl mpo finished ... Tread = 0.000 Twrite = 0.000 T = 0.040
GS MPO BOND DIMS =      21    28    39    54    73    96   123   154   123    96    73    54    39    28    21     1
build 1pdm mpo 3.976243476034142
build identity mpo 3.978148702008184
memory usage =  227 MB
para mpo finished 3.979164890013635
transform ref state to singlet embedding ...
transform ref state to singlet embedding finshed.

Build MPO | Nsites =    16 | Nterms =        256 | Algorithm = FastBIP | Cutoff = 1.00e-12
 Site =     0 /    16 .. Mmpo =     4 DW = 0.00e+00 NNZ =        4 SPT = 0.0000 Tmvc = 0.000 T = 0.000
 Site =     1 /    16 .. Mmpo =     6 DW = 0.00e+00 NNZ =        9 SPT = 0.6250 Tmvc = 0.000 T = 0.000
 Site =     2 /    16 .. Mmpo =     8 DW = 0.00e+00 NNZ =       13 SPT = 0.7292 Tmvc = 0.000 T = 0.000
 Site =     3 /    16 .. Mmpo =    10 DW = 0.00e+00 NNZ =       17 SPT = 0.7875 Tmvc = 0.000 T = 0.000
 Site =     4 /    16 .. Mmpo =    12 DW = 0.00e+00 NNZ =       21 SPT = 0.8250 Tmvc = 0.000 T = 0.000
 Site =     5 /    16 .. Mmpo =    14 DW = 0.00e+00 NNZ =       25 SPT = 0.8512 Tmvc = 0.000 T = 0.000
 Site =     6 /    16 .. Mmpo =    16 DW = 0.00e+00 NNZ =       29 SPT = 0.8705 Tmvc = 0.000 T = 0.000
 Site =     7 /    16 .. Mmpo =    18 DW = 0.00e+00 NNZ =       33 SPT = 0.8854 Tmvc = 0.000 T = 0.000
 Site =     8 /    16 .. Mmpo =    16 DW = 0.00e+00 NNZ =      145 SPT = 0.4965 Tmvc = 0.000 T = 0.000
 Site =     9 /    16 .. Mmpo =    14 DW = 0.00e+00 NNZ =       29 SPT = 0.8705 Tmvc = 0.000 T = 0.000
 Site =    10 /    16 .. Mmpo =    12 DW = 0.00e+00 NNZ =       25 SPT = 0.8512 Tmvc = 0.000 T = 0.000
 Site =    11 /    16 .. Mmpo =    10 DW = 0.00e+00 NNZ =       21 SPT = 0.8250 Tmvc = 0.000 T = 0.000
 Site =    12 /    16 .. Mmpo =     8 DW = 0.00e+00 NNZ =       17 SPT = 0.7875 Tmvc = 0.000 T = 0.000
 Site =    13 /    16 .. Mmpo =     6 DW = 0.00e+00 NNZ =       13 SPT = 0.7292 Tmvc = 0.000 T = 0.000
 Site =    14 /    16 .. Mmpo =     4 DW = 0.00e+00 NNZ =        9 SPT = 0.6250 Tmvc = 0.000 T = 0.000
 Site =    15 /    16 .. Mmpo =     1 DW = 0.00e+00 NNZ =        4 SPT = 0.0000 Tmvc = 0.000 T = 0.000
Ttotal =      0.001 Tmvc-total = 0.000 MPO bond dimension =    18 MaxDW = 0.00e+00
NNZ =          414 SIZE =         1912 SPT = 0.7835

=== nevpt compress core subspace 0 ===

Build MPO | Nsites =    16 | Nterms =       4352 | Algorithm = FastBIP | Cutoff = 1.00e-12
 Site =     0 /    16 .. Mmpo =    10 DW = 0.00e+00 NNZ =       10 SPT = 0.0000 Tmvc = 0.000 T = 0.001
 Site =     1 /    16 .. Mmpo =    24 DW = 0.00e+00 NNZ =       31 SPT = 0.8708 Tmvc = 0.000 T = 0.001
 Site =     2 /    16 .. Mmpo =    39 DW = 0.00e+00 NNZ =      267 SPT = 0.7147 Tmvc = 0.000 T = 0.001
 Site =     3 /    16 .. Mmpo =    42 DW = 0.00e+00 NNZ =      437 SPT = 0.7332 Tmvc = 0.000 T = 0.001
 Site =     4 /    16 .. Mmpo =    44 DW = 0.00e+00 NNZ =      345 SPT = 0.8133 Tmvc = 0.000 T = 0.001
 Site =     5 /    16 .. Mmpo =    46 DW = 0.00e+00 NNZ =      384 SPT = 0.8103 Tmvc = 0.000 T = 0.001
 Site =     6 /    16 .. Mmpo =    48 DW = 0.00e+00 NNZ =      411 SPT = 0.8139 Tmvc = 0.000 T = 0.001
 Site =     7 /    16 .. Mmpo =    50 DW = 0.00e+00 NNZ =      426 SPT = 0.8225 Tmvc = 0.000 T = 0.001
 Site =     8 /    16 .. Mmpo =    52 DW = 0.00e+00 NNZ =      429 SPT = 0.8350 Tmvc = 0.000 T = 0.001
 Site =     9 /    16 .. Mmpo =    54 DW = 0.00e+00 NNZ =      420 SPT = 0.8504 Tmvc = 0.000 T = 0.001
 Site =    10 /    16 .. Mmpo =    56 DW = 0.00e+00 NNZ =      399 SPT = 0.8681 Tmvc = 0.000 T = 0.001
 Site =    11 /    16 .. Mmpo =    54 DW = 0.00e+00 NNZ =      582 SPT = 0.8075 Tmvc = 0.000 T = 0.001
 Site =    12 /    16 .. Mmpo =    38 DW = 0.00e+00 NNZ =      450 SPT = 0.7807 Tmvc = 0.000 T = 0.001
 Site =    13 /    16 .. Mmpo =    20 DW = 0.00e+00 NNZ =       56 SPT = 0.9263 Tmvc = 0.000 T = 0.000
 Site =    14 /    16 .. Mmpo =     8 DW = 0.00e+00 NNZ =       26 SPT = 0.8375 Tmvc = 0.000 T = 0.000
 Site =    15 /    16 .. Mmpo =     1 DW = 0.00e+00 NNZ =        8 SPT = 0.0000 Tmvc = 0.000 T = 0.000
Ttotal =      0.012 Tmvc-total = 0.003 MPO bond dimension =    56 MaxDW = 0.00e+00
NNZ =         4681 SIZE =        25740 SPT = 0.8181

Environment initialization | Nsites =    16 | Center =     0
 INIT-R <-- Site =   13 ..  Bmem =    64 B Rmem =    64 B T = 0.01
 INIT-R <-- Site =   12 ..  Bmem =   272 B Rmem =   272 B T = 0.00
 INIT-R <-- Site =   11 ..  Bmem = 1.17 KB Rmem = 1.17 KB T = 0.00
 INIT-R <-- Site =   10 ..  Bmem = 5.14 KB Rmem = 5.14 KB T = 0.00
 INIT-R <-- Site =    9 ..  Bmem = 22.3 KB Rmem = 22.3 KB T = 0.00
 INIT-R <-- Site =    8 ..  Bmem = 95.9 KB Rmem = 95.9 KB T = 0.00
 INIT-R <-- Site =    7 ..  Bmem =  409 KB Rmem =  409 KB T = 0.00
 INIT-R <-- Site =    6 ..  Bmem = 1.69 MB Rmem = 1.00 MB T = 0.00
 INIT-R <-- Site =    5 ..  Bmem = 3.40 MB Rmem =  409 KB T = 0.00
 INIT-R <-- Site =    4 ..  Bmem = 1.34 MB Rmem = 95.9 KB T = 0.00
 INIT-R <-- Site =    3 ..  Bmem =  314 KB Rmem = 22.3 KB T = 0.00
 INIT-R <-- Site =    2 ..  Bmem = 70.4 KB Rmem = 5.14 KB T = 0.00
 INIT-R <-- Site =    1 ..  Bmem = 15.2 KB Rmem = 1.17 KB T = 0.00
 INIT-R <-- Site =    0 ..  Bmem = 3.09 KB Rmem =   272 B T = 0.00
Time init sweep =        0.034 | MaxBmem = 3.40 MB | MaxRmem =  409 KB
 | Tread = 0.000 | Twrite = 0.009 | Tfpread = 0.000 | Tfpwrite = 0.004 | data = 2.05 MB | cpsd = 1.54 MB | Tasync = 0.000

Environment initialization | Nsites =    16 | Center =     0
 INIT-R <-- Site =   13 ..  Bmem =    80 B Rmem =    80 B T = 0.00
 INIT-R <-- Site =   12 ..  Bmem =   616 B Rmem =   616 B T = 0.00
 INIT-R <-- Site =   11 ..  Bmem = 4.02 KB Rmem = 4.02 KB T = 0.00
 INIT-R <-- Site =   10 ..  Bmem = 19.4 KB Rmem = 19.4 KB T = 0.00
 INIT-R <-- Site =    9 ..  Bmem = 75.5 KB Rmem = 75.5 KB T = 0.00
 INIT-R <-- Site =    8 ..  Bmem =  278 KB Rmem =  278 KB T = 0.01
 INIT-R <-- Site =    7 ..  Bmem = 1.00 MB Rmem =  971 KB T = 0.01
 INIT-R <-- Site =    6 ..  Bmem = 3.50 MB Rmem = 2.04 MB T = 0.01
 INIT-R <-- Site =    5 ..  Bmem = 7.46 MB Rmem = 1.25 MB T = 0.00
 INIT-R <-- Site =    4 ..  Bmem = 4.62 MB Rmem =  394 KB T = 0.00
 INIT-R <-- Site =    3 ..  Bmem = 1.40 MB Rmem =  114 KB T = 0.00
 INIT-R <-- Site =    2 ..  Bmem =  408 KB Rmem = 31.9 KB T = 0.00
 INIT-R <-- Site =    1 ..  Bmem =  102 KB Rmem = 8.96 KB T = 0.00
 INIT-R <-- Site =    0 ..  Bmem = 18.2 KB Rmem = 1.01 KB T = 0.00
Time init sweep =        0.045 | MaxBmem = 7.46 MB | MaxRmem = 1.25 MB
 | Tread = 0.000 | Twrite = 0.015 | Tfpread = 0.000 | Tfpwrite = 0.007 | data = 5.14 MB | cpsd = 3.96 MB | Tasync = 0.000

Sweep =    0 | Direction =  forward | BRA bond dimension = 3000 | Noise =  1.00e-04 | Linear threshold =  1.00e-05
 --> Site =    0-   1 .. Mmps =    2 Nmult =    1 F =     5.0780600e-04 Error = 0.00e+00 FLOPS = 3.81e+05 Tmult = 0.00 T = 0.00
 --> Site =    1-   2 .. Mmps =    4 Nmult =    1 F =     8.4207738e-04 Error = 0.00e+00 FLOPS = 5.80e+06 Tmult = 0.00 T = 0.00
 --> Site =    2-   3 .. Mmps =    8 Nmult =    1 F =      0.0023838079 Error = 0.00e+00 FLOPS = 1.02e+08 Tmult = 0.00 T = 0.00
 --> Site =    3-   4 .. Mmps =   16 Nmult =    1 F =      0.0076145829 Error = 0.00e+00 FLOPS = 6.32e+08 Tmult = 0.00 T = 0.01
 --> Site =    4-   5 .. Mmps =   32 Nmult =    1 F =      0.0157304595 Error = 0.00e+00 FLOPS = 2.35e+09 Tmult = 0.00 T = 0.01
 --> Site =    5-   6 .. Mmps =   64 Nmult =    1 F =      0.0420352794 Error = 0.00e+00 FLOPS = 1.95e+09 Tmult = 0.01 T = 0.01
 --> Site =    6-   7 .. Mmps =  127 Nmult =    1 F =      0.0762589141 Error = 2.04e-15 FLOPS = 4.47e+09 Tmult = 0.01 T = 0.02
 --> Site =    7-   8 .. Mmps =  228 Nmult =    1 F =      0.2122567215 Error = 4.65e-14 FLOPS = 1.02e+10 Tmult = 0.00 T = 0.02
 --> Site =    8-   9 .. Mmps =  242 Nmult =    1 F =      0.2122567215 Error = 8.69e-14 FLOPS = 5.05e+10 Tmult = 0.00 T = 0.02
 --> Site =    9-  10 .. Mmps =  138 Nmult =    1 F =      0.2122567215 Error = 5.93e-14 FLOPS = 2.69e+10 Tmult = 0.00 T = 0.03
 --> Site =   10-  11 .. Mmps =   81 Nmult =    1 F =      0.2122567215 Error = 2.11e-14 FLOPS = 4.96e+09 Tmult = 0.00 T = 0.01
 --> Site =   11-  12 .. Mmps =   43 Nmult =    1 F =      0.2122567215 Error = 4.90e-15 FLOPS = 9.67e+08 Tmult = 0.00 T = 0.01
 --> Site =   12-  13 .. Mmps =   23 Nmult =    1 F =      0.2122567215 Error = 2.04e-19 FLOPS = 1.21e+08 Tmult = 0.00 T = 0.01
 --> Site =   13-  14 .. Mmps =   11 Nmult =    1 F =      0.2122567215 Error = 8.05e-19 FLOPS = 1.05e+07 Tmult = 0.00 T = 0.01
 --> Site =   14-  15 .. Mmps =    5 Nmult =    1 F =      0.2122567215 Error = 8.95e-18 FLOPS = 5.84e+05 Tmult = 0.00 T = 0.00
Time elapsed =      0.173 | F =   5.0780600e-04 | DW = 8.69e-14
Time sweep =        0.173 | 124 MFLOP/SWP
 | Dmem = 2.18 MB (6%) | Imem = 10.7 KB (93%) | Hmem = 0 B | Pmem = 252 KB
 | Tread = 0.042 | Twrite = 0.017 | Tfpread = 0.011 | Tfpwrite = 0.013 | data = 9.58 MB | cpsd = 7.28 MB | Tasync = 0.000
 | Trot = 0.002 | Tctr = 0.002 | Tint = 0.001 | Tmid = 0.000 | Tdctr = 0.000 | Tdiag = 0.000 | Tinfo = 0.001
 | Teff = 0.052 | Tprt = 0.011 | Tmult = 0.021 | Tblk = 0.126 | Tmve = 0.046 | Tdm = 0.001 | Tsplt = 0.006 | Tsvd = 0.000

Sweep =    1 | Direction = backward | BRA bond dimension = 3000 | Noise =  1.00e-04 | Linear threshold =  1.00e-05
 <-- Site =   14-  15 .. Mmps =    2 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 8.93e+05 Tmult = 0.00 T = 0.00
 <-- Site =   13-  14 .. Mmps =    4 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 1.00e+07 Tmult = 0.00 T = 0.00
 <-- Site =   12-  13 .. Mmps =    8 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 1.21e+08 Tmult = 0.00 T = 0.01
 <-- Site =   11-  12 .. Mmps =   16 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 8.73e+08 Tmult = 0.00 T = 0.03
 <-- Site =   10-  11 .. Mmps =   32 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 5.65e+09 Tmult = 0.00 T = 0.01
 <-- Site =    9-  10 .. Mmps =   64 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 2.80e+10 Tmult = 0.00 T = 0.01
 <-- Site =    8-   9 .. Mmps =  128 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 6.02e+10 Tmult = 0.00 T = 0.02
 <-- Site =    7-   8 .. Mmps =  250 Nmult =    1 F =      0.2122567215 Error = 2.26e-14 FLOPS = 4.49e+10 Tmult = 0.00 T = 0.02
 <-- Site =    6-   7 .. Mmps =  242 Nmult =    1 F =      0.2122567215 Error = 1.07e-13 FLOPS = 7.36e+09 Tmult = 0.00 T = 0.03
 <-- Site =    5-   6 .. Mmps =  137 Nmult =    1 F =      0.2122567215 Error = 5.85e-14 FLOPS = 1.67e+10 Tmult = 0.00 T = 0.01
 <-- Site =    4-   5 .. Mmps =   77 Nmult =    1 F =      0.2122567215 Error = 2.17e-14 FLOPS = 4.04e+09 Tmult = 0.00 T = 0.01
 <-- Site =    3-   4 .. Mmps =   44 Nmult =    1 F =      0.2122567215 Error = 9.95e-15 FLOPS = 8.57e+08 Tmult = 0.00 T = 0.01
 <-- Site =    2-   3 .. Mmps =   23 Nmult =    1 F =      0.2122567215 Error = 7.84e-18 FLOPS = 1.28e+08 Tmult = 0.00 T = 0.01
 <-- Site =    1-   2 .. Mmps =   11 Nmult =    1 F =      0.2122567215 Error = 5.37e-18 FLOPS = 1.30e+07 Tmult = 0.00 T = 0.00
 <-- Site =    0-   1 .. Mmps =    5 Nmult =    1 F =      0.2122567215 Error = 2.29e-18 FLOPS = 9.78e+05 Tmult = 0.00 T = 0.00
Time elapsed =      0.348 | F =    0.2122567215 | DF = 2.12e-01 | DW = 1.07e-13
Time sweep =        0.176 | 128 MFLOP/SWP
 | Dmem = 2.13 MB (6%) | Imem = 10.7 KB (93%) | Hmem = 0 B | Pmem = 252 KB
 | Tread = 0.036 | Twrite = 0.015 | Tfpread = 0.023 | Tfpwrite = 0.012 | data = 9.41 MB | cpsd = 7.01 MB | Tasync = 0.000
 | Trot = 0.002 | Tctr = 0.009 | Tint = 0.001 | Tmid = 0.000 | Tdctr = 0.000 | Tdiag = 0.000 | Tinfo = 0.001
 | Teff = 0.033 | Tprt = 0.011 | Tmult = 0.011 | Tblk = 0.094 | Tmve = 0.081 | Tdm = 0.001 | Tsplt = 0.006 | Tsvd = 0.000

Sweep =    2 | Direction =  forward | BRA bond dimension = 3000 | Noise =  1.00e-04 | Linear threshold =  1.00e-05
 --> Site =    0-   1 .. Mmps =    2 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 9.66e+05 Tmult = 0.00 T = 0.00
 --> Site =    1-   2 .. Mmps =    4 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 1.55e+07 Tmult = 0.00 T = 0.00
 --> Site =    2-   3 .. Mmps =    8 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 1.48e+08 Tmult = 0.00 T = 0.00
 --> Site =    3-   4 .. Mmps =   16 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 7.43e+08 Tmult = 0.00 T = 0.01
 --> Site =    4-   5 .. Mmps =   32 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 5.33e+09 Tmult = 0.00 T = 0.01
 --> Site =    5-   6 .. Mmps =   64 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 1.70e+10 Tmult = 0.00 T = 0.01
 --> Site =    6-   7 .. Mmps =  126 Nmult =    1 F =      0.2122567215 Error = 2.50e-15 FLOPS = 1.82e+10 Tmult = 0.00 T = 0.02
 --> Site =    7-   8 .. Mmps =  228 Nmult =    1 F =      0.2122567215 Error = 4.55e-14 FLOPS = 5.64e+10 Tmult = 0.00 T = 0.02
 --> Site =    8-   9 .. Mmps =  242 Nmult =    1 F =      0.2122567215 Error = 8.69e-14 FLOPS = 5.55e+10 Tmult = 0.00 T = 0.02
 --> Site =    9-  10 .. Mmps =  138 Nmult =    1 F =      0.2122567215 Error = 5.93e-14 FLOPS = 2.85e+10 Tmult = 0.00 T = 0.03
 --> Site =   10-  11 .. Mmps =   81 Nmult =    1 F =      0.2122567215 Error = 2.11e-14 FLOPS = 4.55e+09 Tmult = 0.00 T = 0.03
 --> Site =   11-  12 .. Mmps =   43 Nmult =    1 F =      0.2122567215 Error = 4.89e-15 FLOPS = 1.03e+09 Tmult = 0.00 T = 0.01
 --> Site =   12-  13 .. Mmps =   23 Nmult =    1 F =      0.2122567215 Error = 5.09e-20 FLOPS = 1.13e+08 Tmult = 0.00 T = 0.01
 --> Site =   13-  14 .. Mmps =   11 Nmult =    1 F =      0.2122567215 Error = 3.07e-19 FLOPS = 8.85e+06 Tmult = 0.00 T = 0.01
 --> Site =   14-  15 .. Mmps =    5 Nmult =    1 F =      0.2122567215 Error = 4.07e-23 FLOPS = 6.07e+05 Tmult = 0.00 T = 0.01
Time elapsed =      0.540 | F =    0.2122567215 | DF = 1.39e-16 | DW = 8.69e-14
Time sweep =        0.192 | 128 MFLOP/SWP
 | Dmem = 2.13 MB (6%) | Imem = 10.7 KB (93%) | Hmem = 0 B | Pmem = 252 KB
 | Tread = 0.036 | Twrite = 0.023 | Tfpread = 0.012 | Tfpwrite = 0.019 | data = 9.57 MB | cpsd = 7.26 MB | Tasync = 0.000
 | Trot = 0.002 | Tctr = 0.002 | Tint = 0.001 | Tmid = 0.000 | Tdctr = 0.000 | Tdiag = 0.000 | Tinfo = 0.001
 | Teff = 0.049 | Tprt = 0.008 | Tmult = 0.008 | Tblk = 0.122 | Tmve = 0.069 | Tdm = 0.001 | Tsplt = 0.006 | Tsvd = 0.000

Sweep =    3 | Direction = backward | BRA bond dimension = 3000 | Noise =  1.00e-04 | Linear threshold =  1.00e-05
 <-- Site =   14-  15 .. Mmps =    2 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 6.92e+05 Tmult = 0.00 T = 0.00
 <-- Site =   13-  14 .. Mmps =    4 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 1.05e+07 Tmult = 0.00 T = 0.00
 <-- Site =   12-  13 .. Mmps =    8 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 1.34e+08 Tmult = 0.00 T = 0.00
 <-- Site =   11-  12 .. Mmps =   16 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 9.82e+08 Tmult = 0.00 T = 0.01
 <-- Site =   10-  11 .. Mmps =   32 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 5.58e+09 Tmult = 0.00 T = 0.01
 <-- Site =    9-  10 .. Mmps =   64 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 3.07e+10 Tmult = 0.00 T = 0.01
 <-- Site =    8-   9 .. Mmps =  128 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 6.93e+10 Tmult = 0.00 T = 0.01
 <-- Site =    7-   8 .. Mmps =  250 Nmult =    1 F =      0.2122567215 Error = 2.26e-14 FLOPS = 5.56e+10 Tmult = 0.00 T = 0.02
 <-- Site =    6-   7 .. Mmps =  242 Nmult =    1 F =      0.2122567215 Error = 1.07e-13 FLOPS = 3.62e+10 Tmult = 0.00 T = 0.02
 <-- Site =    5-   6 .. Mmps =  137 Nmult =    1 F =      0.2122567215 Error = 5.85e-14 FLOPS = 2.71e+10 Tmult = 0.00 T = 0.02
 <-- Site =    4-   5 .. Mmps =   77 Nmult =    1 F =      0.2122567215 Error = 2.17e-14 FLOPS = 4.73e+09 Tmult = 0.00 T = 0.01
 <-- Site =    3-   4 .. Mmps =   44 Nmult =    1 F =      0.2122567215 Error = 9.95e-15 FLOPS = 8.55e+08 Tmult = 0.00 T = 0.01
 <-- Site =    2-   3 .. Mmps =   23 Nmult =    1 F =      0.2122567215 Error = 2.97e-18 FLOPS = 1.35e+08 Tmult = 0.00 T = 0.01
 <-- Site =    1-   2 .. Mmps =   11 Nmult =    1 F =      0.2122567215 Error = 2.82e-18 FLOPS = 1.36e+07 Tmult = 0.00 T = 0.00
 <-- Site =    0-   1 .. Mmps =    5 Nmult =    1 F =      0.2122567215 Error = 1.29e-17 FLOPS = 8.48e+05 Tmult = 0.00 T = 0.01
Time elapsed =      0.691 | F =    0.2122567215 | DF = -5.00e-16 | DW = 1.07e-13
Time sweep =        0.151 | 128 MFLOP/SWP
 | Dmem = 2.13 MB (6%) | Imem = 10.7 KB (93%) | Hmem = 0 B | Pmem = 252 KB
 | Tread = 0.013 | Twrite = 0.018 | Tfpread = 0.010 | Tfpwrite = 0.015 | data = 9.41 MB | cpsd = 7.01 MB | Tasync = 0.000
 | Trot = 0.002 | Tctr = 0.002 | Tint = 0.001 | Tmid = 0.000 | Tdctr = 0.000 | Tdiag = 0.000 | Tinfo = 0.001
 | Teff = 0.026 | Tprt = 0.008 | Tmult = 0.007 | Tblk = 0.092 | Tmve = 0.058 | Tdm = 0.001 | Tsplt = 0.006 | Tsvd = 0.000

Sweep =    0 | Direction =  forward | BRA bond dimension = 3000 | Noise =  1.00e-04 | Linear threshold =  1.00e-05
 --> Site =    0 .. Mmps =    2 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 1.37e+05 Tmult = 0.00 T = 0.00
 --> Site =    1 .. Mmps =    4 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 1.81e+06 Tmult = 0.00 T = 0.00
 --> Site =    2 .. Mmps =    8 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 1.77e+07 Tmult = 0.00 T = 0.00
 --> Site =    3 .. Mmps =   16 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 1.27e+08 Tmult = 0.00 T = 0.00
 --> Site =    4 .. Mmps =   32 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 6.69e+08 Tmult = 0.00 T = 0.01
 --> Site =    5 .. Mmps =   64 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 3.56e+09 Tmult = 0.00 T = 0.01
 --> Site =    6 .. Mmps =  126 Nmult =    1 F =      0.2122567215 Error = 1.48e-15 FLOPS = 1.97e+10 Tmult = 0.00 T = 0.02
 --> Site =    7 .. Mmps =  228 Nmult =    1 F =      0.2122567215 Error = 4.56e-14 FLOPS = 4.70e+10 Tmult = 0.00 T = 0.02
 --> Site =    8 .. Mmps =  242 Nmult =    1 F =      0.2122567215 Error = 8.69e-14 FLOPS = 5.16e+10 Tmult = 0.00 T = 0.03
 --> Site =    9 .. Mmps =  138 Nmult =    1 F =      0.2122567215 Error = 5.93e-14 FLOPS = 2.52e+10 Tmult = 0.00 T = 0.03
 --> Site =   10 .. Mmps =   81 Nmult =    1 F =      0.2122567215 Error = 2.11e-14 FLOPS = 5.91e+09 Tmult = 0.00 T = 0.02
 --> Site =   11 .. Mmps =   43 Nmult =    1 F =      0.2122567215 Error = 4.89e-15 FLOPS = 1.03e+09 Tmult = 0.00 T = 0.01
 --> Site =   12 .. Mmps =   23 Nmult =    1 F =      0.2122567215 Error = 2.32e-18 FLOPS = 1.17e+08 Tmult = 0.00 T = 0.01
 --> Site =   13 .. Mmps =   11 Nmult =    1 F =      0.2122567215 Error = 4.90e-19 FLOPS = 1.06e+07 Tmult = 0.00 T = 0.00
 --> Site =   14 .. Mmps =    5 Nmult =    1 F =      0.2122567215 Error = 2.06e-18 FLOPS = 5.80e+05 Tmult = 0.00 T = 0.00
 --> Site =   15 .. Mmps =    2 Nmult =    1 F =      0.2122567215 Error = 8.85e-18 FLOPS = 7.18e+04 Tmult = 0.00 T = 0.00
Time elapsed =      0.186 | F =    0.2122567215 | DW = 8.69e-14
Time sweep =        0.186 | 90.6 MFLOP/SWP
 | Dmem = 1.99 MB (0%) | Imem = 6.09 KB (89%) | Hmem = 0 B | Pmem = 248 KB
 | Tread = 0.057 | Twrite = 0.016 | Tfpread = 0.012 | Tfpwrite = 0.013 | data = 9.58 MB | cpsd = 7.27 MB | Tasync = 0.000
 | Trot = 0.002 | Tctr = 0.002 | Tint = 0.001 | Tmid = 0.000 | Tdctr = 0.000 | Tdiag = 0.000 | Tinfo = 0.001
 | Teff = 0.062 | Tprt = 0.008 | Tmult = 0.007 | Tblk = 0.126 | Tmve = 0.057 | Tdm = 0.001 | Tsplt = 0.006 | Tsvd = 0.000

Sweep =    1 | Direction = backward | BRA bond dimension = 3000 | Noise =  1.00e-04 | Linear threshold =  1.00e-05
 <-- Site =   15 .. Mmps =    2 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 6.78e+04 Tmult = 0.00 T = 0.00
 <-- Site =   14 .. Mmps =    4 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 5.95e+05 Tmult = 0.00 T = 0.00
 <-- Site =   13 .. Mmps =    8 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 1.17e+07 Tmult = 0.00 T = 0.00
 <-- Site =   12 .. Mmps =   16 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 1.25e+08 Tmult = 0.00 T = 0.01
 <-- Site =   11 .. Mmps =   32 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 1.10e+09 Tmult = 0.00 T = 0.01
 <-- Site =   10 .. Mmps =   64 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 6.37e+09 Tmult = 0.00 T = 0.01
 <-- Site =    9 .. Mmps =  128 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 3.10e+10 Tmult = 0.00 T = 0.01
 <-- Site =    8 .. Mmps =  252 Nmult =    1 F =      0.2122567215 Error = 8.57e-15 FLOPS = 5.79e+10 Tmult = 0.00 T = 0.03
 <-- Site =    7 .. Mmps =  251 Nmult =    1 F =      0.2122567215 Error = 1.01e-13 FLOPS = 5.39e+10 Tmult = 0.00 T = 0.01
 <-- Site =    6 .. Mmps =  143 Nmult =    1 F =      0.2122567215 Error = 7.17e-14 FLOPS = 2.20e+10 Tmult = 0.00 T = 0.02
 <-- Site =    5 .. Mmps =   79 Nmult =    1 F =      0.2122567215 Error = 3.40e-14 FLOPS = 3.49e+09 Tmult = 0.00 T = 0.02
 <-- Site =    4 .. Mmps =   45 Nmult =    1 F =      0.2122567215 Error = 3.08e-16 FLOPS = 6.35e+08 Tmult = 0.00 T = 0.01
 <-- Site =    3 .. Mmps =   23 Nmult =    1 F =      0.2122567215 Error = 2.13e-18 FLOPS = 1.12e+08 Tmult = 0.00 T = 0.01
 <-- Site =    2 .. Mmps =   11 Nmult =    1 F =      0.2122567215 Error = 1.46e-19 FLOPS = 1.60e+07 Tmult = 0.00 T = 0.01
 <-- Site =    1 .. Mmps =    5 Nmult =    1 F =      0.2122567215 Error = 9.26e-18 FLOPS = 2.03e+06 Tmult = 0.00 T = 0.00
 <-- Site =    0 .. Mmps =    2 Nmult =    1 F =      0.2122567215 Error = 0.00e+00 FLOPS = 1.40e+05 Tmult = 0.00 T = 0.00
Time elapsed =      0.333 | F =    0.2122567215 | DF = 8.33e-16 | DW = 1.01e-13
Time sweep =        0.147 | 91.4 MFLOP/SWP
 | Dmem = 2.00 MB (0%) | Imem = 6.09 KB (89%) | Hmem = 0 B | Pmem = 250 KB
 | Tread = 0.034 | Twrite = 0.018 | Tfpread = 0.010 | Tfpwrite = 0.014 | data = 9.72 MB | cpsd = 7.24 MB | Tasync = 0.000
 | Trot = 0.002 | Tctr = 0.002 | Tint = 0.001 | Tmid = 0.000 | Tdctr = 0.000 | Tdiag = 0.000 | Tinfo = 0.001
 | Teff = 0.038 | Tprt = 0.008 | Tmult = 0.007 | Tblk = 0.094 | Tmve = 0.053 | Tdm = 0.001 | Tsplt = 0.006 | Tsvd = 0.000

ATTENTION: Linear is not converged to desired tolerance of 1.000e-07
Environment initialization | Nsites =    16 | Center =     0
 INIT-R <-- Site =   14 ..  Bmem =   184 B Rmem =   184 B T = 0.00
 INIT-R <-- Site =   13 ..  Bmem =   880 B Rmem =   880 B T = 0.00
 INIT-R <-- Site =   12 ..  Bmem = 4.16 KB Rmem = 4.16 KB T = 0.00
 INIT-R <-- Site =   11 ..  Bmem = 20.3 KB Rmem = 20.3 KB T = 0.00
 INIT-R <-- Site =   10 ..  Bmem = 98.5 KB Rmem = 98.5 KB T = 0.00
 INIT-R <-- Site =    9 ..  Bmem =  474 KB Rmem =  474 KB T = 0.00
 INIT-R <-- Site =    8 ..  Bmem = 2.21 MB Rmem = 5.25 MB T = 0.01
 INIT-R <-- Site =    7 ..  Bmem = 10.4 MB Rmem = 10.1 MB T = 0.02
 INIT-R <-- Site =    6 ..  Bmem = 31.0 MB Rmem = 8.48 MB T = 0.02
 INIT-R <-- Site =    5 ..  Bmem = 25.3 MB Rmem = 2.35 MB T = 0.00
 INIT-R <-- Site =    4 ..  Bmem = 6.77 MB Rmem =  609 KB T = 0.00
 INIT-R <-- Site =    3 ..  Bmem = 1.64 MB Rmem =  161 KB T = 0.00
 INIT-R <-- Site =    2 ..  Bmem =  424 KB Rmem = 33.9 KB T = 0.00
 INIT-R <-- Site =    1 ..  Bmem = 80.3 KB Rmem = 6.62 KB T = 0.00
 INIT-R <-- Site =    0 ..  Bmem = 11.9 KB Rmem = 1.16 KB T = 0.00
Time init sweep =        0.065 | MaxBmem = 31.0 MB | MaxRmem = 8.48 MB
 | Tread = 0.000 | Twrite = 0.042 | Tfpread = 0.000 | Tfpwrite = 0.040 | data = 27.8 MB | cpsd = 20.9 MB | Tasync = 0.000

Environment initialization | Nsites =    16 | Center =     0
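The output stops immediately after an 'Environment initialization' header, with none of the INIT-R lines that follow the earlier headers. As a rough sketch of how such a truncated log could be spotted automatically, the hypothetical helper below (not part of block2 or PySCF) finds the last header line in the text of dmrg.out and counts the non-empty lines after it. A count of zero suggests the program died right after printing the header.

```python
def last_section(log_text, header="Environment initialization"):
    """Return (line_number, n_lines_after) for the last header line, or None.

    line_number is 1-based; n_lines_after counts only non-empty lines.
    """
    lines = log_text.splitlines()
    # Walk backwards so the first match is the last header in the file.
    for idx in range(len(lines) - 1, -1, -1):
        if lines[idx].startswith(header):
            trailing = [ln for ln in lines[idx + 1:] if ln.strip()]
            return idx + 1, len(trailing)
    return None


# Example with a path from this thread (not executed here):
# with open("/scratch/all6130/NEVPT2_test/DMRG/nevpt2_0/dmrg.out") as f:
#     print(last_section(f.read()))
```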

hczhai commented 6 months ago

Thanks for providing the detailed input and output. The problem is now fixed in https://github.com/block-hczhai/block2-preview/commit/2777af4aa8799d06ec2624230e2ed3f84621cd3a.

You can update block2 using

python3 -m pip uninstall block2
python3 -m pip install block2==0.5.3rc10 --extra-index-url=https://block-hczhai.github.io/block2-preview/pypi/
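
After reinstalling, it may be worth confirming which version the Python environment actually sees. This is a small sketch using only the standard library (`importlib.metadata`, available in Python 3.8 and later). The function name `installed_version` is a hypothetical helper, and the expected version string `0.5.3rc10` is simply the one from the install command above.

```python
from importlib import metadata


def installed_version(package="block2"):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Example check (prints None if block2 is not installed in this environment):
# print(installed_version("block2"))
```

If the printed version is older than the one requested, the uninstall and install steps probably ran in a different environment from the one that runs block2main.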

alarese commented 6 months ago

That seems to have fixed the issue; my scripts are now running. Thank you for your help.