Closed nmayorov closed 6 years ago
Thanks for the suggestions Nikolay. I will modify my code accordingly.
Just made a few changes according to your comments. A few notes about the modifications:
`x = x + alpha*p` is actually better than `x += alpha*p` because of the similarity with `x_next = x + alpha*p`. Also, on line 365 I kept `z = z - A.T.dot(v)` instead of `z -= A.T.dot(v)` because I think it is better. I accepted your suggestion on this except for a few places (e.g. line 551) where I think the notation reads better as it is.
That is an interesting view. I don't really agree with it, because there is no mathematical equivalent to `x = x + alpha*p`: it is an algorithmic thing, which is precisely `x += alpha*p`, and to me it looks cleaner. Additionally, you usually annotate code lines with a comment. And the final argument --- I guess no one is used to seeing code like this, and if someone sees it, he can get a bad impression of the code as a whole (which would be totally wrong).
But if you like it that way --- I don't insist on changing. I understand your arguments and I can see that it is more readable that way.
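For what it's worth, in numpy the two spellings are not purely stylistic either. A small sketch (not from the PR, just an illustration) of the semantic difference between rebinding and in-place update:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
p = np.array([0.5, 0.5, 0.5])
alpha = 2.0

x_ref = x            # second reference to the same array
x = x + alpha * p    # rebinding: allocates a new array, x_ref is untouched
# x_ref is still [1. 2. 3.]

y = np.array([1.0, 2.0, 3.0])
y_ref = y
y += alpha * p       # in-place update: y_ref sees the change
# y_ref is now [2. 3. 4.]
```

So when other references to the array exist, the two forms are observably different, not just a matter of taste.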
I noticed that you put your latest code in the same file. Maybe it's already time to split it?
By the way, I answered in the wiki.
I agree with you Nikolay. It is starting to get too big for just one file. I will think about a good way to split the code.
This is quite subjective, but I find the code a bit too sparse vertically, which hurts readability in my opinion. What I mean is that it's hard to grasp the logical indented blocks, because there are too many blank lines; it sort of falls apart.
A similar concept is discussed here. I think in your case such sparseness has already started to hurt readability.
Makes sense to me. I will find some time later to remove the extra blank lines.
Just read byrd_omojokun_sqp.py, some notes:

1. Maybe call it `equality_constrained_sqp` (too verbose?) and reference the algorithm's authors in the docstring. I guess it is not necessary to change, but it looks a bit strange nevertheless.
2. Maybe rename the callables to `fun`, `fun_grad`, `constr`, `constr_jac` and then get rid of the `k` suffixes everywhere. My main motivation is that the `k` suffixes are quite useless (and even somewhat confusing), except for differentiating data variable names from callable names, so maybe it's better to get rid of them. Additionally, callable names are usually longer, so it makes sense to differentiate them from simple variables by longer names here as well.
3. Use `iteration` instead of `k`.
4. `ared` and `pred` are bad names (despite being used in the book); I think `actual_reduction` and `predicted_reduction` are much better.
5. `1/10` is better written as `0.1`, etc. `trust_radiusk = 1/10*trust_radiusk -> trust_radiusk *= 0.1` would be good to see. I make the argument that while it doesn't look similar to `trust_radius_new = 0.1 * trust_radius`, it is essentially good, because it clearly differentiates it from this other case. Reading `trust_radiusk = 1/10*trust_radiusk` one may wonder --- is this a bug or what?
6. `SUFFICIENT_REDUCTION` is defined, but other threshold values are not. In smaller functions like `default_stop_criteria` named constants are indeed not necessary, I guess.
7. What does the `tr_` prefix mean in `tr_lb` and `tr_ub`?
8. `reduction_ratio = ared/pred -> reduction_ratio = ared / pred`, etc.

I do agree with most of your notes. I will do the required modifications.
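A hypothetical sketch combining several of the naming and operator suggestions above (the function name and the threshold value are assumptions for illustration, not taken from the actual code):

```python
SUFFICIENT_REDUCTION = 0.25  # assumed threshold value, for illustration only

def update_trust_radius(trust_radius, actual_reduction, predicted_reduction):
    # reduction_ratio = ared / pred, with the suggested longer names
    reduction_ratio = actual_reduction / predicted_reduction
    if reduction_ratio < SUFFICIENT_REDUCTION:
        trust_radius *= 0.1  # instead of trust_radius = 1/10*trust_radius
    return trust_radius
```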
About item 7, `tr_` stands for trust-region. I would like to indicate `tr_lb` and `tr_ub` as the lower and upper bounds for the trust region (and not for the problem). In other words, the general problem being solved is:

    minimize f(x)
    subject to: c(x) = 0

with no bound constraints (at least for now). But the subproblem being solved is:

    minimize f + g.T x + 1/2 x.T H x
    subject to: A x + b = 0
                || inv(S) x || < trust_radius
                tr_lb <= inv(S) x <= tr_ub

I must clarify the role of `S` in the docstring as well (I will do it in the next commit).
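To illustrate the distinction, here is a toy sketch with made-up numbers, assuming a diagonal scaling matrix `S`: the bounds constrain the *scaled* step `inv(S) x`, independently of any bounds on the problem variables.

```python
import numpy as np

s = np.array([2.0, 0.5])        # assumed diagonal of the scaling matrix S
x = np.array([1.0, 0.2])        # candidate step
tr_lb = np.array([-1.0, -1.0])  # trust-region bounds on inv(S) x,
tr_ub = np.array([1.0, 1.0])    # not bounds on the problem variables

scaled_x = x / s                # inv(S) x for a diagonal S
inside = np.all((tr_lb <= scaled_x) & (scaled_x <= tr_ub))
```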
Just implemented your suggestions Nikolay. I think it was a great improvement on the code overall quality. Thanks :)
It looks like you're using the function `scipy.sparse.linalg.splu` in projections.py. I was just digging into `scipy.sparse.linalg` today and it looks like `scipy.sparse.linalg.factorized` would be better. It uses `umfpack` if available and falls back to `scipy.sparse.linalg.splu` otherwise. Since you have SuiteSparse, would you get the `umfpack` scikit if you don't already have it and try `factorized`? The consensus online seems to be that it is typically faster.
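A minimal comparison sketch of the two APIs (a toy system, not from the PR):

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import splu, factorized

A = csc_matrix(np.array([[4.0, 1.0], [1.0, 3.0]]))
b = np.array([1.0, 2.0])

# splu always uses SuperLU and returns an object with a .solve method
x_splu = splu(A).solve(b)

# factorized returns a solver callable; it uses UMFPACK when
# scikit-umfpack is installed and falls back to splu otherwise
solve = factorized(A)
x_fact = solve(b)
```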
It makes sense. I will change it...
Thanks for noticing this!
Read through ipsolver:

1. The naming scheme in `opt_problems` is really nice in my opinion: `fun`, `grad`, `hess`, `lagr_hess`; `constr_eq`, `jac_eq`, `hess_eq`; `constr_ineq`, `jac_ineq`, `hess_ineq`.
2. Maybe use `mu`. After all, there are plenty of such math-like variables (like `s` for slack variables).
3. `x.shape` looks nicer than `np.shape(x)`?
4. The `[]` in your "vector notation" in docstrings.
5. There is an `if` returning `True` and `False`. Better to rewrite it as a single return?
6. `barrier_parameter` and `tolerance` probably should be updated with `*=`.

Overall I like how easy and concise it looks.
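On the single-return point, the pattern presumably looks like the following (a hypothetical predicate, not the actual `default_stop_criteria`):

```python
import numpy as np

def converged_verbose(g, tol):
    # an if returning True and False
    if np.linalg.norm(g) < tol:
        return True
    else:
        return False

def converged_concise(g, tol):
    # equivalent single-return rewrite
    return np.linalg.norm(g) < tol
```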
One thing: it seems to me that you always lean towards sparse matrices / LinearOperators. Doesn't that contradict the fact that we wanted to allow solving dense problems as well? If you feel that this is unnatural (or too much trouble) for this solver, then perhaps we should "officially" drop this option?
I would rather keep the name `barrier_parameter`, mostly because the notation regarding it is not uniform among the references. I am trying to use mathematical notation only when the symbol is well established (for instance, the slack variables are called `s` in all the references).

About sparse/dense matrices: the Hessian can be a sparse matrix, a dense matrix or a LinearOperator, and my implementation deals with each of them in the best possible way. The major decision we have to face is how to deal with the Jacobian matrix. So far the implementation always converts it to a sparse matrix. I don't know yet how to proceed about this. In the presence of nonlinear inequality constraints (or upper and lower bounds on the variables) the Jacobian matrix will have a lot of zeros, and the sparse representation is actually appropriate, even if a dense structure is passed by the user. I don't think it is a good alternative to use a dense structure every time dense matrices are provided (since sparse structures can arise nevertheless). The only case when it is obviously better to use a dense structure is when only equality constraints are present. For other cases this is not obvious at all. Maybe the best alternative is to provide a dense/sparse Jacobian option.
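One hypothetical way such an option could look (the name and semantics here are assumptions for illustration, not a committed design):

```python
import numpy as np
from scipy import sparse

def coerce_jacobian(J, sparse_jacobian=None):
    # sparse_jacobian=None: keep whatever structure the user provided;
    # True: force a sparse representation; False: force a dense one.
    if sparse_jacobian is None:
        return J
    if sparse_jacobian:
        return sparse.csr_matrix(J)
    return J.toarray() if sparse.issparse(J) else np.asarray(J)
```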
I was reading your code from time to time and some things drew my attention. I decided to create a separate issue for that.
Describing arrays as "unidimensional": I suggest an alternative approach. Mention array shapes in the type annotation (as is usually done), after introducing the notation (like n, m, etc.) in the docstring body (and often it's OK to even omit that and introduce the names right in the shape description). As for the description itself --- think of something more specific and precise.
I understand that it looks somewhat like overkill, but I think it is a more intelligent way to handle things in this situation.
The same goes for returned arrays: writing their shape will make it easier to understand their relation to the problem formulation.
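The concrete example originally attached to this note was not preserved; a hypothetical numpydoc-style sketch of what the suggested shape annotations might look like (the function itself is made up):

```python
def residual(x, A, b):
    """Compute the constraint residual A x + b.

    Parameters
    ----------
    x : ndarray, shape (n,)
        Vector of n variables.
    A : ndarray, shape (m, n)
        Jacobian of the m equality constraints.
    b : ndarray, shape (m,)
        Constant term of the constraints.

    Returns
    -------
    r : ndarray, shape (m,)
        Residual A x + b.
    """
    return A.dot(x) + b
```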
This is rather pedantic: the PEP convention prescribes one-line summary docstrings, going into more detail afterwards if necessary. In my defense, I was asked to make such fixes by charris back then; it was OK for me and I even somehow managed to achieve it. Even worse, in theory the summary should start on the same line as the triple quotes.
From the same category: this story about "comment with a period" vs "comment without a period". If you have any specific reasons not to use periods --- don't bother.
Sometimes you didn't use `+=` (etc.) operators when possible, which looks a bit odd to my eye. Not sure about efficiency, by the way; it could make a very small difference because of temporary memory allocation.
Brackets around `not` operands in `if` statements are usually unnecessary.
Another preferential thing: I don't like using `len` applied to numpy arrays; I would personally use `shape` with the required index. For me it more clearly separates numpy structures from pure Python structures, and it is more explicit.
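A small illustration of the `len`-vs-`shape` point (toy array, not from the code):

```python
import numpy as np

a = np.zeros((3, 4))

n_rows_len = len(a)   # 3: size of the first axis only, implicitly
n_rows = a.shape[0]   # 3: same value, but the axis is explicit
n_cols = a.shape[1]   # 4: len() could not express this directly
```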