Closed orbeckst closed 5 years ago
@VOD555 @kain88-de @richardjgowers suggestions welcome; see PR #88...
@orbeckst vstack cannot deal with arrays whose sizes are [1251, 1250, 1250, 1250]. It works for arrays with shapes [(1251, 1), (1250, 1), (1250, 1), (1250, 1)] or [(1251, n), (1250, n), (1250, n), (1250, n)].
For example, change the return value to [ag.universe.trajectory.time, ag.radius_of_gyration()].
Although hstack works for this case, it doesn't work for arrays with sizes [(1251, n), (1250, n), (1250, n), (1250, n)].
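A minimal sketch of the stacking behaviour described here, using zero-filled placeholder arrays with the block lengths quoted above (the variable names are made up for illustration and are not part of the package):

```python
import numpy as np

# Hypothetical per-block results with the block lengths from this comment.
lengths = [1251, 1250, 1250, 1250]

# Case 1: one scalar per frame, so each block is a 1D array of a different length.
blocks_1d = [np.zeros(n) for n in lengths]
try:
    np.vstack(blocks_1d)          # rows of unequal length cannot be stacked
    vstack_1d_ok = True
except ValueError:
    vstack_1d_ok = False
stacked_h = np.hstack(blocks_1d)  # joins the 1D arrays end to end

# Case 2: two values per frame, so each block has shape (ni, 2).
blocks_2d = [np.zeros((n, 2)) for n in lengths]
stacked_v = np.vstack(blocks_2d)  # stacks along the frame axis
try:
    np.hstack(blocks_2d)          # the frame counts differ, so this fails
    hstack_2d_ok = True
except ValueError:
    hstack_2d_ok = False

print(vstack_1d_ok, stacked_h.shape, stacked_v.shape, hstack_2d_ok)
```

Neither function handles both cases on its own, which is why some preprocessing (or a different function) is needed.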
I suggest we still use vstack and add some preprocessing to solve this problem.
For example, reshape _results before _conclude():
if len(np.shape(self._results[0])) == 1:
    # blocks can differ in length, so let numpy infer the number of rows
    results = [a.reshape((-1, 1)) for a in self._results]
If we do the reshape(ni, 1) (where ni is the number of frames in each block and t is the total number of frames, i.e., the sum of all the ni) then the output array will be shape (t, 1), which is really weird when you thought you'd just get a series. In this case we should then reshape back to (t, ).
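The round trip could look roughly like the sketch below. The block sizes and values are invented for illustration; only the numpy calls are real.

```python
import numpy as np

# Hypothetical blocks of scalar-per-frame results with ni = 4, 3, 3 frames.
blocks = [np.arange(ni, dtype=float) for ni in (4, 3, 3)]

# Reshape each block to a column of shape (ni, 1) so that vstack accepts them.
columns = [a.reshape(-1, 1) for a in blocks]
stacked = np.vstack(columns)   # shape (t, 1) with t = 4 + 3 + 3

# Flatten back to a plain series of shape (t,).
series = stacked.reshape(-1)

print(stacked.shape, series.shape)
```

This works, but the extra reshape in both directions is the "convoluted" part.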
Perhaps there's a less convoluted way to achieve the following for n frames:
If the return value of _single_frame() is a single scalar, the output result should be shape (t,).
If it has m entries, then the output result should be shape (t, m).
I am not sure what should happen if _single_frame() returns, e.g., a numpy array of shape (3, 4). Should the result be an array of objects with shape (n,) or a numpy array of shape (t, 3, 4)?
EDIT: made the notation clearer by introducing ni and t (as opposed to using n for everything)
I think the result should be a numpy array of shape (t, 3, 4). And this is what vstack does now.
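To illustrate the (t, 3, 4) case, here is a small sketch with made-up block sizes; the arrays are placeholders rather than real analysis output:

```python
import numpy as np

# Hypothetical blocks where every frame yields a (3, 4) array,
# so a block with ni frames has shape (ni, 3, 4).
block_frames = (6, 5, 5)
blocks = [np.ones((ni, 3, 4)) for ni in block_frames]

# vstack joins along the first (frame) axis and leaves the rest untouched.
result = np.vstack(blocks)

print(result.shape)
```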
EDIT: n -> t so that logic between comments makes sense
Ok, doing "what vstack() does" is a fairly consistent approach. I'd still want a scalar return to generate a simple 1D series, though.
Maybe np.concatenate() does what we want by default?
In [1]: import numpy as np
In [2]: a1 = np.arange(10)
In [3]: a2 = np.arange(10)
In [4]: a3 = np.arange(11)
In [5]: np.concatenate([a1, a2, a3])
Out[5]:
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6,
7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
In [6]: b1 = np.arange(10).reshape(5,2)
In [7]: b2 = np.arange(10).reshape(5,2)
In [8]: b3 = np.arange(12).reshape(6,2)
In [9]: np.concatenate([b1, b2, b3])
Out[9]:
array([[ 0, 1],
[ 2, 3],
[ 4, 5],
[ 6, 7],
[ 8, 9],
[ 0, 1],
[ 2, 3],
[ 4, 5],
[ 6, 7],
[ 8, 9],
[ 0, 1],
[ 2, 3],
[ 4, 5],
[ 6, 7],
[ 8, 9],
[10, 11]])
Also works for vectors:
In [10]: b1 = np.arange(30).reshape(5,2,3)
In [11]: b3 = np.arange(36).reshape(6,2,3)
In [12]: np.concatenate([b1, b1, b3])
Out[12]:
array([[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 6, 7, 8],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23]],
[[24, 25, 26],
[27, 28, 29]],
[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 6, 7, 8],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23]],
[[24, 25, 26],
[27, 28, 29]],
[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 6, 7, 8],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23]],
[[24, 25, 26],
[27, 28, 29]],
[[30, 31, 32],
[33, 34, 35]]])
In [13]: np.concatenate([b1, b1, b3]).shape
Out[13]: (16, 2, 3)
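Putting the three cases together, a sketch of how the blocks might be joined with np.concatenate. The helper name combine_blocks is hypothetical and not an existing function in the package; the inputs are placeholder data.

```python
import numpy as np


def combine_blocks(blocks):
    """Join per-block results along the frame axis (hypothetical helper)."""
    return np.concatenate([np.asarray(b) for b in blocks], axis=0)


# Scalar per frame: blocks of shape (ni,) give a plain series of shape (t,).
scalars = combine_blocks([[1.0] * 4, [2.0] * 3])

# m entries per frame: blocks of shape (ni, m) give shape (t, m).
vectors = combine_blocks([np.zeros((4, 5)), np.zeros((3, 5))])

# A (3, 4) array per frame: blocks of shape (ni, 3, 4) give shape (t, 3, 4).
matrices = combine_blocks([np.zeros((4, 3, 4)), np.zeros((3, 3, 4))])

print(scalars.shape, vectors.shape, matrices.shape)
```

The scalar case comes out as a 1D series without any reshaping, which is the behaviour asked for earlier in the thread.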
Expected behaviour
I can just follow the typical example to make analysis from a single frame function such as
Actual behaviour
The above fails with
Code to reproduce the behaviour
See above.
The problem is that the arrays in _results do not have the same lengths and np.vstack() does not like this. Instead one would need np.hstack(). However, as soon as the return value of the function is a tuple, vstack() works as expected.
Current version of MDAnalysis: