markovmodel / PyEMMA

🚂 Python API for Emma's Markov Model Algorithms 🚂
http://pyemma.org
GNU Lesser General Public License v3.0

Koopman ignores dim setting #1365

Open euhruska opened 6 years ago

euhruska commented 6 years ago

The Koopman estimator ignores the requested number of dimensions and later raises a fatal error. Here the koopman method of tica generates 3 dimensions, although I set dim=5 for the tica. Computing tica_obj works, but get_output fails due to a dimension mismatch.

Traceback (most recent call last):
  File "run-tica-msm3.py", line 712, in <module>
    Runticamsm().run()
  File "run-tica-msm3.py", line 154, in run
    y = tica_obj.get_output(stride=tica_stride)
  File "/mnt/b/projects/sciteam/bamm/hruska/vpy8/lib/python3.5/site-packages/pyemma/coordinates/data/_base/transformer.py", line 227, in get_output
    return super(StreamingTransformer, self).get_output(dimensions, stride, skip, chunk)
  File "/mnt/b/projects/sciteam/bamm/hruska/vpy8/lib/python3.5/site-packages/pyemma/coordinates/data/_base/datasource.py", line 407, in get_output
    trajs[itraj][i, :] = chunk[:, dimensions]
ValueError: could not broadcast input array from shape (100,3) into shape (100,5)

This issue is a copy of https://github.com/radical-collaboration/extasy-grlsd/issues/84 @marscher

The command to get tica_obj: tica_obj = pyemma.coordinates.tica(get_out_arr, lag=25, dim=5, kinetic_map=True, stride=1, weights='koopman')

python 3.5.5, pyemma 2.5.4

marscher commented 6 years ago

Can you dump this object and upload it somewhere?

marscher commented 6 years ago

tica_obj.save('tica_debug_koopman.pyemma')

euhruska commented 6 years ago

Here https://drive.google.com/file/d/1hYZoE72H2C-2ahiAzOLcOzHXfRtER8TQ/view?usp=sharing

marscher commented 6 years ago

On 9/24/18 11:40 PM, Eugen Hruska wrote:

Here https://drive.google.com/file/d/1hYZoE72H2C-2ahiAzOLcOzHXfRtER8TQ/view?usp=sharing

Thanks. The eigenvectors indeed have only 4 dimensions. This means that solving the generalized eigenvalue problem yielded only 4 dimensions, given the default epsilon of 1e-6.
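To illustrate how an eigenvalue cutoff can leave fewer dimensions than requested, here is a minimal NumPy-only sketch. It does not call PyEMMA; the data, the variable names, and the assumption that the cutoff is applied directly to the eigenvalues of the instantaneous covariance matrix are all illustrative, and only the value 1e-6 comes from the thread.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 5 input features that span only a 3-dimensional subspace.
latent = rng.standard_normal((2000, 3))
mixing = rng.standard_normal((3, 5))
X = latent @ mixing
X -= X.mean(axis=0)

# Instantaneous covariance matrix C0: 5 x 5, but only of rank 3.
C0 = X.T @ X / (len(X) - 1)

# Assumed truncation rule: keep only eigenvalues above the cutoff.
epsilon = 1e-6  # cutoff value quoted in the thread
eigenvalues = np.linalg.eigvalsh(C0)
n_kept = int(np.sum(eigenvalues > epsilon))

requested_dim = 5
print("requested:", requested_dim, "available:", n_kept)
```

In this toy case the two near-zero eigenvalues fall below the cutoff, so fewer dimensions remain than were requested, which is the same kind of mismatch described above.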

There is a bug in the dimension method of the tica base class: it returns the input parameter, whereas it is intended to return the real output dimension after diagonalization.

I will provide a fix soon.

euhruska commented 6 years ago

any update?

marscher commented 6 years ago

On 10/12/18 1:11 AM, Eugen Hruska wrote:

any update?

The bug occurs because the real rank of the matrix is lower than the requested dimension; this should of course not happen. The proposed fix involves a lot of refactoring (e.g. to make it future-proof) and is not ready yet. A simple workaround would be to "correct" the dimension to be lower than or equal to the real rank after the estimation, prior to transforming.
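A rough sketch of that workaround, using plain NumPy arrays as stand-ins for the estimator's state. The helper clamp_dim and the placeholder arrays are hypothetical and not part of PyEMMA; the sketch assumes the real rank can be read from the number of columns of the eigenvector matrix.

```python
import numpy as np

def clamp_dim(requested_dim, eigenvectors):
    """Hypothetical helper: limit the requested dimension to the real rank."""
    return min(requested_dim, eigenvectors.shape[1])

# Stand-ins for the situation in the traceback: 5 dimensions requested,
# but each transformed chunk has only 3 columns.
requested_dim = 5
eigenvectors = np.zeros((10, 3))  # placeholder for the estimated eigenvectors
chunk = np.ones((100, 3))         # placeholder for one transformed chunk

# Allocating the output with the requested dimension reproduces the failure.
out_bad = np.empty((100, requested_dim))
try:
    out_bad[:, :] = chunk
except ValueError as err:
    print("fails:", err)

# Allocating with the clamped dimension works.
dim = clamp_dim(requested_dim, eigenvectors)
out_ok = np.empty((100, dim))
out_ok[:, :] = chunk[:, :dim]
```

The point of the sketch is only the ordering: the dimension is corrected after the estimation and before any output array is allocated or filled.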