Open carterbox opened 7 years ago
During initial work on the polytope
package, the API was intended to follow that of MPT (the Multi-Parametric Toolbox for Matlab). In that Matlab toolbox, the bounding_box
method of the Polytope
class stores the bounding box as a matrix of size N \times 2.
That said, I agree with changing the return type as you propose. I think that the main disadvantage is an extra transpose (or otherwise, change to the shape of) to the corner vertices. However, that cost is small compared to benefits of using transformations in your case.
The question then is whether making the proposed change is going to disrupt anyone's use of polytope
. Personally, I don't make any calls to bounding_box
from outside polytope
, but bounding_box
is a public member of the API. Changing what it returns would be a major version increase.
The method bounding_box
of the Region
(or Polytope
) class is being used in tulip
, in the following modules of the subpackage tulip.abstract
:
discretization
(plotting)plot
(plotting)prop2partition
(used for refinement by a grid)In the past, in Matlab code (e.g., numerical_utils
and plot_utils
), I used to store each set of points as a matrix where each column is a point (or vector). This allowed for vectorized manipulation. The points in such a matrix can be rotated by multiplying from the left with a rotation matrix. Also, vertical arrays are (I think) more common in the literature for representing vectors and point coordinates. The current 2-tuple is of vertical arrays.
The proposed change is to return a 2 x N
(numpy
?) array, where each row is a point. What about an N x 2
numpy
array?
OK, an Nx2 ndarray
seems to be a better choice because of precedence in the literature for vectors to be represented as column vectors, better compatibility with existing code in tulip
, and trivial work for me to change what I've already written.
I could switch over all the internal calls to bounding_box
for polytope v2.0.0
?
Regarding version management, I increment the minor version number for incompatible API changes. The version up to a037b555758ed9ee736fa7cb324d300b8d622fb4 has been below 1.0.0, so this practice happens to be compatible with semantic versioning. I do not follow semantic versioning strictly, I agree with several of the principles it describes.
If applied after version 0.1.4 is released, this change would result in an increment to version 0.2.0. I think that 1.0.0 should not be used before the API is overhauled, testing coverage increased to more than 80%, documentation written, and some algorithms reimplemented.
I agree with the comment above about requirements before v1.0.0. So, likely the next version (pending this issue) will be 0.2.0.
regarding https://github.com/tulip-control/polytope/issues/34#issuecomment-271961616, @carterbox have you already modified bounding_box
to return N \times 2 arrays, or are you proposing that someone else makes the change (to master
or dev
branch) and then you rebase your work on it?
Yeah, brain fart this morning, I forgot that 0.x.y is pre-release and APIs are considered unstable.
I have not already modified bounding_box
or all of the other calls to it. There isn't much difference between the two implementations in terms of length. It just means transforming the limits separately instead of at the same time.
I'm offering to make all of the changes, but I am uncertain whether it would make any significant impact on performance.
To be sure that I understand where we are on this issue:
Here, code quality would mean easier to check and understand matrix computations vs. tuples of vectors.
I agree, with the note that the next version as of 4d541aecbc856e28ea56175d1175661807619a0b will be polytope == 0.1.4
, so that the bug fixes become available to tulip == 1.3.0
. As implied by a remark above, the API change introduced in this issue will require releasing tulip == 1.3.1
.
In the discussion above, we decided that for multiple vectors in one matrix, column vectors are the preference. However, for a lonesome vector, e.g. chebXc
, is it preferable to have two dimensions when only one is needed shape (N,)
vs shape (N,1)
? Because numpy
doesn't treat empty dimensions as singleton dimensions, this these two choices do not behave the same.
The problem is that, there is an inconsistent definition of vector type things. Travis tests failing for #39 and the changes at 770a408 are related to this topic.
For the question of the opening post and with some relevance to the question about lonesome vectors, I am also in support of using matrices of shape (2, N)
.
For lonesome vectors, I do not think that we can give a general rule because vectors can be expressed both in the type numpy.ndarray
and the type numpy.matrix
. It might be worth reviewing the polytope
code and ensuring that it uniformly uses numpy.matrix
and column vectors (so, having shape (N,1)
).
In a bounding box matrix of shape (2, N)
, the coordinates of each corner form a row. In the literature, we usually transform points represented as matrices of shape (N, 1)
by multiplying from the left with a suitable N x N
matrix. The shape (2, N)
would require multiplying from the right with the transpose of the same matrix. Also, if column matrices are already used for vectors elsewhere in polytope
, then the two point representations won't match. The existing interface of bounding_box
represents points as vertical matrices.
I just did some reading about the differences between choosing numpy.ndarray
vs numpy.matrix
, and from what I read, it seems better that we consistently return numpy.ndarray
in shape (N,?)
or (N, )
.
The reasons cited seem to be:
numpy.ndarray
than numpy.matrix
for small array sizes.numpy.ndarray
.@
instead of numpy.dot
for matrix multiplication with numpy.ndarray
.Additionally, everywhere in polytope
we are already using numpy.ndarray
.
re https://github.com/tulip-control/polytope/issues/34#issuecomment-275895416, I do not think that precedence in the literature is a sufficient argument because in a practical implementation, we must also be concerned with issues of numerical computation, such as speed and precision. However, the fact that previously bounding_box
used column matrices is indeed motivation to not change.
re https://github.com/tulip-control/polytope/issues/34#issuecomment-275896190, to be clear, I think that the type to which you refer is numpy.ndarray
, not numpy.array
. I did not find information about speed of operations in the reference that you (@carterbox) provided. In any case, I think that we can validate the argument by profiling polytope
itself.
Another interesting consideration about performance is the configuration in which the arrays are stored. The default of numpy.ndarray
is C order, also known as row-major, i.e., where the right-most index changes most quickly when traversing elements. Thus, we might guess that it is better to use numpy.ndarray
of shape (N,)
for vectors.
To be clear, I did not intend to entirely dismiss the argument about following the notation in the literature. Indeed, following it makes understanding easier.
Why is bounding box stored as a 2-tuple of vertical arrays?
For the purpose of applying transformations to the bounding box (#32), I think it would be better to store the bounding box as a single 2xN array where the min corner and max corner are their own rows.