jp43 / LSDMap

Package to perform Locally-Scaled Diffusion Map

scipy.sparse.linalg.eigen.arpack.arpack.ArpackError: ARPACK error -3: NCV must be greater than NEV and less than or equal to N. #5

Open dotsdl opened 8 years ago

dotsdl commented 8 years ago

I'm running LSDmap-directed MD using 8 configurations, which I realize is quite small compared to what the typical example uses (1000). However, lsdmap appears to have trouble with this. The content of lsdmap.log shows how far it got:

INFO:root:02:39:16: intializing LSDMap with 4 processors...
INFO:root:02:39:17: input coordinates loaded
WARNING:root:02:39:17: .w file does not exist, set weights to 1.0.
INFO:root:02:39:17: LSDMap initialized
INFO:root:02:39:17: distance matrix computed

with the traceback given on STDERR:


Inactive Modules:
  1) python/2.7.9

The following have been reloaded with a version change:
  1) intel/15.0.2 => intel/14.0.1.106  2) mvapich2/2.1 => mvapich2/2.0b

Activating Modules:
  1) python/2.7.6

(identical traceback from each of the 4 MPI ranks)

Traceback (most recent call last):
  File "/home1/02142/dldotson/.local/bin/lsdmap", line 5, in <module>
    lsdm.LSDMap().run()
  File "/home1/02142/dldotson/.local/lib/python2.7/site-packages/lsdmap/lsdm.py", line 394, in run
    params.iterate()
  File "/home1/02142/dldotson/.local/lib/python2.7/site-packages/lsdmap/mpi/p_arpack.py", line 62, in iterate
    raise ArpackError(self.info, infodict=self.iterate_infodict)
scipy.sparse.linalg.eigen.arpack.arpack.ArpackError: ARPACK error -3: NCV must be greater than NEV and less than or equal to N.
[c559-202.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process (rank: 3, pid: 21704) exited with status 1
[c559-202.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process (rank: 0, pid: 21701) exited with status 1
[c559-202.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process (rank: 1, pid: 21702) exited with status 1
[c559-202.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process (rank: 2, pid: 21703) exited with status 1

I'm not familiar enough with what exactly is going wrong here, but perhaps it's a convergence issue? If it's related to the number of configurations used, as I suspect, what is a rule of thumb for the minimum number to use? My simulation systems are about 140,000 atoms in total, so I can't afford anywhere close to 1000 separate simulations, but I could go a few times higher than 8.
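
From the ARPACK documentation, error -3 means the constraint NEV < NCV <= N was violated: N is the order of the matrix (presumably the number of configurations here), NEV is the number of eigenpairs requested, and NCV the size of the Lanczos basis, so with only 8 configurations there may simply be too few points for the number of eigenvectors being asked for. A minimal sketch of the same constraint using scipy's high-level wrapper (this is not LSDMap's parallel p_arpack path; the matrix and the k/ncv values are made up for illustration):

import numpy as np
from scipy.sparse.linalg import eigsh

n = 8                                # one row/column per configuration
rng = np.random.RandomState(0)
d = rng.rand(n, n)
A = 0.5 * (d + d.T)                  # symmetric matrix standing in for the kernel

# ARPACK requires NEV < NCV <= N.  Forcing a Lanczos basis (ncv) no bigger
# than the number of requested eigenpairs (k) violates that; the call is
# rejected, either by scipy's own check or by ARPACK itself as error -3.
try:
    eigsh(A, k=4, ncv=3)
except Exception as err:
    print("rejected:", err)

# With few configurations, only a correspondingly small number of
# eigenpairs can be requested:
vals, vecs = eigsh(A, k=2)
print(vals)

So if LSDMap (or the per-process slice of the problem) ends up asking for more eigenvectors than there are configurations to support them, this looks like the kind of failure I'd expect.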

dotsdl commented 8 years ago

An additional note: each of the 8 runs only ran for 500 ps, so there isn't much difference between the resulting configurations. I'm running a longer test now in which each run goes for 5 ns before the 8 configurations are pushed through lsdmap; we'll see if I still get the same errors.

TensorDuck commented 8 years ago

Two possibilities come to mind when using a very small number of configurations:

  1. Num threads for LSDMap > Num configurations. But it looks like you're using 4 threads, so that's not likely the issue (unless I'm reading your output wrong).
  2. Your local scale is wrong. Using a constant local scale that's too large will give you a matrix of all zeros. A safer automatic option is kneighbor_mean, which still uses a constant scale but sets a reasonable value automatically (rough sketch of the idea below).

Let me know if those things work or not.
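
Roughly, the idea behind an automatic constant scale like kneighbor_mean is to derive the scale from the distance matrix itself, e.g. something like the mean distance to each configuration's k nearest neighbours, so it ends up on the order of the actual pairwise distances instead of a guessed constant. A rough sketch of that idea (not LSDMap's implementation; the function names and the default k below are made up):

import numpy as np

def kneighbor_mean_scale(dist_matrix, k=7):
    # Constant scale from the mean distance to the k nearest neighbours.
    # Illustrative only; with very few configurations, cap k at n - 1.
    n = dist_matrix.shape[0]
    k = min(k, n - 1)
    # Sort each row of the distance matrix; column 0 is the self-distance
    # (0.0), so take the next k columns.
    nearest = np.sort(dist_matrix, axis=1)[:, 1:k + 1]
    return nearest.mean()

def gaussian_kernel(dist_matrix, eps):
    # Generic Gaussian kernel built with a single constant scale eps.
    return np.exp(-dist_matrix**2 / (2.0 * eps**2))

# Usage sketch:
# dists = ...                        # the n x n distance matrix
# eps = kneighbor_mean_scale(dists)  # scale comparable to neighbour distances
# kernel = gaussian_kernel(dists, eps)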