jcmgray / quimb

A python library for quantum information and many-body calculations including tensor networks.
http://quimb.readthedocs.io

Strange interaction between quimb and xyzpy #56

Open adamcallison opened 4 years ago

adamcallison commented 4 years ago

I've come across some strange behaviour when using quimb and xyzpy together. I'm not sure which library would be the best place for this issue, but I've chosen this one. I've not been able to replicate the exact bug in a minimal example, but in trying to do so I've discovered another one (which may or may not be the same thing manifesting differently).

The minimal example is as follows (note that the line importing xyzpy is commented out):

#import xyzpy as xyz
import quimb as qu
import numpy as np

Z = qu.pauli('z',2,sparse=True, dtype=float)
X = qu.pauli('x',2,sparse=True, dtype=float)
Identity = qu.pauli('i', 2, sparse=True, dtype=float)
def transverse(n):
    gHw = qu.ikron(n*Identity, dims=(2,)*n, inds=(0,), sparse=True)
    for i in range(n):
        gHw -= qu.ikron(X, dims=(2,)*n, inds=(i,), sparse=True)
    return gHw

mat11 = qu.qu(-1.0j*(transverse(11)), stype='csr')
mat12 = qu.qu(-1.0j*(transverse(12)), stype='csr')
vec11 = np.ones(2**11)/np.sqrt(2**11)
vec12 = np.ones(2**12)/np.sqrt(2**12)

print('Everything set up....')
mat11 @ vec11
print('dim 2**11 succeeded')
mat12 @ vec12
print('dim 2**12 succeeded')

If I put this in a file test.py and run it with python test.py, I get the following, as expected:

Everything set up....
dim 2**11 succeeded
dim 2**12 succeeded

However, if I uncomment the import xyzpy as xyz line (even without actually using it anywhere), I get a warning and a segfault:

/home/adam/anaconda3/envs/core/lib/python3.7/site-packages/xyzpy/utils.py:321: NumbaDeprecationWarning: The 'numba.jitclass' decorator has moved to 'numba.experimental.jitclass' to better reflect the experimental nature of the functionality. Please update your imports to accommodate this change and see http://numba.pydata.org/numba-doc/latest/reference/deprecation.html#change-of-jitclass-location for the time frame.
  ('M2', double),
Everything set up....
dim 2**11 succeeded
Segmentation fault (core dumped)

More mysteriously, if I change the stype in

mat11 = qu.qu(-1.0j*(transverse(11)), stype='csr')
mat12 = qu.qu(-1.0j*(transverse(12)), stype='csr')

to

mat11 = qu.qu(-1.0j*(transverse(11)), stype='csc')
mat12 = qu.qu(-1.0j*(transverse(12)), stype='csc')

and run, I get the warning but no segfault:

/home/adam/anaconda3/envs/core/lib/python3.7/site-packages/xyzpy/utils.py:321: NumbaDeprecationWarning: The 'numba.jitclass' decorator has moved to 'numba.experimental.jitclass' to better reflect the experimental nature of the functionality. Please update your imports to accommodate this change and see http://numba.pydata.org/numba-doc/latest/reference/deprecation.html#change-of-jitclass-location for the time frame.
  ('M2', double),
Everything set up....
dim 2**11 succeeded
dim 2**12 succeeded

After some digging (inspired by the output of the original bug), this seems to be related to quimb dispatching sparse dot products to parallel methods, via numba where available, for matrices with more than 50000 nonzero elements (which is true for the 2**12 case but not the 2**11 case). I can suppress this behaviour by adding the following lines to the top of test.py:

import os
os.environ['QUIMB_NUM_THREAD_WORKERS'] = '1'

This stops the crash (but not the warning). I don't know where to begin with fixing this.
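The threshold explanation can be sanity-checked by counting stored entries by hand. The sketch below assumes no explicit zeros are stored: the n*Identity term then contributes 2**n diagonal entries, and each X term is a bit-flip permutation contributing 2**n off-diagonal entries in positions that no other term touches. The 50000 figure is the one quoted above and has not been checked against the quimb source; `transverse_nnz` and `PARALLEL_NNZ_THRESHOLD` are names made up for this illustration.

```python
# Threshold as quoted in this issue; not verified against the quimb source.
PARALLEL_NNZ_THRESHOLD = 50_000


def transverse_nnz(n):
    """Stored nonzeros of n*I - sum_i X_i on n qubits, assuming no explicit zeros."""
    diagonal = 2 ** n          # the n * Identity term
    off_diagonal = n * 2 ** n  # one X term per site, each a bit-flip permutation
    return diagonal + off_diagonal


for n in (11, 12):
    nnz = transverse_nnz(n)
    print(n, nnz, nnz > PARALLEL_NNZ_THRESHOLD)
```

Under these assumptions the 2**11 matrix has 24576 stored entries and the 2**12 matrix has 53248, which is consistent with only the larger case crossing the threshold.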

I am using Python version 3.7.6, quimb version '1.2.0+142.g40e5a2a' (current state of the develop branch), xyzpy version '0.3.1+28.gd5afe9b' (current state of the develop branch), and numba version 0.49.0.

jcmgray commented 4 years ago

So I'm not quite sure what to do here (other than update the jitclass import as per the warning) - the code runs without crashing on my Windows laptop and also on my Ubuntu desktop, both with numba=0.49.

Numba does seem to have these kinds of bugs, particularly in relation to parallel code. Another thing to try is turning off the numba caching, which also seems to interact badly with the parallel stuff.

Try:

export QUIMB_NUMBA_CACHE=off
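For a script-only setup, a rough Python equivalent is sketched below. It uses only the two variable names mentioned in this thread and assumes quimb reads them at import time, so they are set before the import; that timing assumption has not been verified here.

```python
import os

# Both variable names come from this thread. They are set before quimb is
# first imported, on the assumption that quimb reads them at import time.
os.environ["QUIMB_NUM_THREAD_WORKERS"] = "1"
os.environ["QUIMB_NUMBA_CACHE"] = "off"

# import quimb as qu  # import only after the environment is configured
```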

I've come across these numba segfaults before, and they didn't happen deterministically, so it was particularly hard to identify a minimal reproducer for them.
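Since the crashes may not be deterministic, one way to gauge how often a candidate reproducer fails is to run it repeatedly in fresh processes. The helper below is a hypothetical sketch, not part of quimb or xyzpy; it relies on the POSIX convention that `subprocess` reports a process killed by a signal (such as a segfault) with a negative return code.

```python
import subprocess
import sys


def count_crashes(cmd, runs=20):
    """Run cmd repeatedly and count runs killed by a signal (POSIX only)."""
    crashes = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        # On POSIX, a negative return code means death by signal, e.g. SIGSEGV.
        if result.returncode < 0:
            crashes += 1
    return crashes
```

For example, `count_crashes([sys.executable, "test.py"])` would report how many of 20 runs of the reproducer above were killed by a signal.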