rowanc1 opened this issue 6 years ago
Here are a couple of thoughts:

- `nP` --> `n_parameters` (`size` is confusing in this context; the regularization or data misfit takes in something that is `nP` and returns something that is `nC` - there can be mappings under the hood)
- `W` is a matrix. I think `weights` would be confusing, as there are `cell_weights` (a vector) in the regularization. I would be fine breaking pep8 here... or we could do something along the lines of `w_matrix` or `norm_matrix`
- `jacobian` or `hessian` suggest a PDE to me (`hessian` less so, but `jacobian` immediately makes me think of a deriv of a PDE). Personally, I vote for `deriv` and `deriv2`, also because the namespace for `deriv_sigma`, `deriv_mu`, etc. seems a bit simpler (but I realize that I am biased, because this is what we are using everywhere in SimPEG :) )

Don't all of the objective functions get us down to a float?! The `m.T W.T W m` return value is always a float. The main thing that we care about is the size of the incoming vector. In this sense it is the total size (unraveled), not the `shape`, that we care about.
Agree on the `weights` --> could have both `weights` and `weights_matrix` --> and error if you try to set both.

Duck typing --> care less about how it looks (`assert isinstance`), and more about how it works/quacks (multiply works?).

I like `deriv`/`deriv2` --> a lot of the time we aren't actually doing the full hessian anyways, and are dropping the non-PD terms.
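For concreteness, here is a minimal sketch of the `deriv`/`deriv2` convention on a least-squares style term (a hypothetical class using dense numpy for brevity, not the SimPEG implementation; the `deriv2` comment mirrors the dropped non-PD terms mentioned above):

```python
import numpy as np


class WeightedSmallness:
    """Hypothetical objective phi(m) = m.T W.T W m."""

    def __init__(self, W):
        self.W = W  # weighting matrix, shape (nC, nP)

    def __call__(self, m):
        r = self.W @ m
        return float(r @ r)  # the objective always comes down to a float

    def deriv(self, m):
        # first derivative (gradient): 2 W.T W m, same size as m
        return 2 * self.W.T @ (self.W @ m)

    def deriv2(self, m, v=None):
        # second derivative: exactly 2 W.T W here; for nonlinear terms this
        # would be the approximation that drops the non-PD pieces
        if v is None:
            return 2 * self.W.T @ self.W
        return 2 * self.W.T @ (self.W @ v)
```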
Yeah, sorry. There is another internal size step that does matter, and that is the size of the mapped `m` (in the more general case, `mapping(m).T * W.T * W * mapping(m)`). Still, the `shape` is more like `(, nP)`. I do find `size` ambiguous when thinking from a matrix operation perspective (is it the size of the incoming thing or of the output?). But maybe that is just me.
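A tiny numeric illustration of that more general case (hypothetical stand-ins; a dense random matrix plays the role of the mapping):

```python
import numpy as np

nP, nC = 5, 3
mapping = np.random.randn(nC, nP)  # stand-in mapping with shape (nC, nP)
W = np.diag(np.random.rand(nC))    # weights matrix living in the nC space
m = np.random.randn(nP)

mapped = mapping @ m               # the internal size that matters is nC
phi = float(mapped @ (W.T @ (W @ mapped)))  # still reduces to a plain float
```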
What do you envision `weights` doing? (e.g. are you suggesting we rename `cell_weights` --> `weights`?)
re: duck typing: Ah, ok sure, that works
The `shape` of an objective function is `(1, 1)` (which isn't helpful); however, it takes a vector of `m.size == nP` where `m.shape == (nP, 1) || (nP, )` (likely called with `mkvc(m)` to make it `(nP, )`). So in this case the `size` is the most unambiguous check on the vector (`m.size == obj.size` is what I am proposing). The internal mappings should have a `shape` for which `shape[1] == nP` - I think this requires an update to the mappings specifically, which we can talk about, bringing them more in line with scipy linear operators.

I think you raise a good point that the mappings should be in the base objective function (they aren't currently), and that the objective should also have a shape that gives it a sense of how the internals work - which would be the shape of the mapping, or by default `(nP, nP)`. Maybe size is the wrong term to use, but I do like the analogies to numpy and `m.size == obj.size`.
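A short sketch of what that `m.size == obj.size` check could look like (hypothetical names; `mkvc` behaviour approximated with `np.ravel`):

```python
import numpy as np

nP, nC = 6, 4


class ObjectiveFunction:
    """Hypothetical base class sketching the proposed size check."""

    def __init__(self, mapping):
        self.mapping = mapping        # mapping.shape[1] == nP
        self.size = mapping.shape[1]  # so that m.size == obj.size holds

    def __call__(self, m):
        m = np.ravel(m)               # accepts (nP, 1) or (nP,), like mkvc
        assert m.size == self.size, "m has the wrong number of parameters"
        r = self.mapping @ m
        return float(r @ r)


obj = ObjectiveFunction(np.random.randn(nC, nP))
obj(np.random.randn(nP, 1))           # a column vector passes the check too
```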
I see the `weights` being the diagonal of `weights_matrix`, with `weights_matrix` taking priority if it is set, or erroring if both are set. `cells` isn't quite correct generally, I don't think, as it could be `nodes` or `data` or ... `parameters`. We could also coerce the input of the weights matrix (maybe just called `matrix`?) to take either a vector (the diagonal) or a matrix - making it unambiguous that those two variables are connected and can only be set one way.
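A sketch of that coercion with a single setter (names hypothetical, not a proposed SimPEG implementation):

```python
import numpy as np
import scipy.sparse as sp


class Regularization:
    """Hypothetical container; the coercion in the setter is the point."""

    @property
    def weights_matrix(self):
        return self._weights_matrix

    @weights_matrix.setter
    def weights_matrix(self, value):
        if sp.issparse(value) or np.ndim(value) == 2:
            self._weights_matrix = sp.csr_matrix(value)  # matrix as given
        else:
            diag = np.asarray(value).ravel()
            self._weights_matrix = sp.diags(diag)        # vector -> diagonal


reg = Regularization()
reg.weights_matrix = np.ones(4)  # a vector is coerced to a diagonal matrix
reg.weights_matrix = np.eye(4)   # a full matrix is stored (as sparse) directly
```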
I guess it depends how you define shape... I tend to think of `shape` as the shape of the operation (`nP --> 1`, as we do for mappings) rather than the `shape` of the output. Then, thinking about what size the derivative should be comes quite naturally. I still find `obj.size` confusing. The objective function is really a function that takes from a space of size `n_parameters` and brings it to a space of size `1`... What is the size of the function?? To me, checking `m.size == obj.n_parameters` would align more closely with the treatment of the objective function as a mathematical function that takes a vector of size `n_parameters` and gives you back something that is size `1`.
:+1: on moving mappings (or a mapping type object) to the base, I agree that it belongs there and that we should bring mappings more in line with scipy linear operators.
Okay, I like the suggestion of `weights` being the diagonal and `weights_matrix` being the matrix. @fourndo, you probably have the most experience in the regularization - do you think this would be appropriate?
Yeah, got shape wrong above: `(1, nP)`.
In summary of things we agree on (I think!):
```python
obj.shape == (1, nP)
obj.mapping.shape == (nC, nP)
obj.weights.size == nC                # setting would override `weights_matrix = sdiag(weights)`
obj.weights_matrix.shape == (nW, nC)  # usually square? not if it is a gradient etc.
```
So in the function space:

```
nP -(mapping)-> nC -(matrix)-> nW -(matrix.T)-> nC -(mapping.T)-> nP -(dot)-> 1
```

or in linear algebra (with brackets where needed):

`m.T * mapping.T * W.T * W * mapping * m`
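That chain, checked numerically with random stand-in matrices (a sketch; `nW != nC` to make the non-square `weights_matrix` case explicit):

```python
import numpy as np

nP, nC, nW = 6, 4, 3
mapping = np.random.randn(nC, nP)  # obj.mapping.shape == (nC, nP)
W = np.random.randn(nW, nC)        # obj.weights_matrix.shape == (nW, nC)
m = np.random.randn(nP)

# nP -(mapping)-> nC -(W)-> nW -(W.T)-> nC -(mapping.T)-> nP -(dot)-> 1
v = mapping @ m
assert v.shape == (nC,)
v = W @ v
assert v.shape == (nW,)
v = W.T @ v
assert v.shape == (nC,)
v = mapping.T @ v
assert v.shape == (nP,)
phi = float(m @ v)  # the final dot product: a single scalar
```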
You had `mapping(m)` above?
My hangup on size is from numpy's definitions. In numpy, `size` is defined as:

> Equivalent to `np.prod(a.shape)`, i.e., the product of the array’s dimensions.

https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.size.html#numpy.ndarray.size

I guess I also don't like the number of characters in `n_parameters` or in `nP`, and ideally would like it lowercase! `size` seemed to fit the bill. We could also skip it and just use `obj.shape[1]` in checks. I don't think this has too much use externally anyways?
Is it confusing if you can set either `weights` or `weights_matrix`? Is the serialization of this annoying as well? Can we just have a single variable with coercion, i.e. just `weights_matrix`?
:+1: on the agreed upon things :). The `mapping(m)` was a bit of a simplification in my head; I meant `mapping * m`, and the transpose then needs to be `(mapping * m).T` (`mapping.T` won't work, but conceptually, it's fine).

With respect to `size`, it can work. It is a bit odd that `objfct.size != objfct(m).size` - I don't think that should cause a problem, but I don't want to confuse users. In this case, we have that `objfct.size == objfct.deriv(m).size`.
In the `Simple` regularization within SimPEG, the cell weights will be on the diagonal of the smallness component (https://github.com/simpeg/simpeg/blob/master/SimPEG/Regularization.py#L858), and are averaged and multiplied on the outside of the derivative operator for the `SimpleSmoothDeriv` (https://github.com/simpeg/simpeg/blob/master/SimPEG/Regularization.py#L932). If you update the cell weights (as in @thast's work, I believe??), we then need to recreate these matrices based on the changed cell weights. Similarly, in @fourndo's work with the sparse regularization, the `W` is recreated at every iteration as some of the internal parameters (e.g. `gamma`, `R`) change at each iteration, but the `weights` remain fixed (https://github.com/simpeg/simpeg/blob/master/SimPEG/Regularization.py#L1357).
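The recreate-on-update behaviour described here could be sketched as a cached matrix that is invalidated whenever the weights change (a hypothetical class, not the SimPEG code; the `sqrt` assumes the weights sit on the diagonal of `W.T W`):

```python
import numpy as np
import scipy.sparse as sp


class Smallness:
    """Hypothetical smallness term that rebuilds W when weights change."""

    def __init__(self, cell_weights):
        self._cell_weights = np.asarray(cell_weights)
        self._W = None  # cached matrix, built lazily

    @property
    def cell_weights(self):
        return self._cell_weights

    @cell_weights.setter
    def cell_weights(self, value):
        self._cell_weights = np.asarray(value)
        self._W = None  # stale: recreate from the new weights on next access

    @property
    def W(self):
        if self._W is None:
            self._W = sp.diags(np.sqrt(self._cell_weights))
        return self._W
```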
With respect to the weights, I think the implementation will remain simpler if we have both the `weights` and the `weights_matrix`, and serialize the `weights`. Serializing the `weights_matrix` in the Sparse case is not actually all that meaningful; what you would want is the `weights` and the tuning parameters.
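A rough sketch of that split, with hypothetical attribute names (`weights`, and `gamma` standing in for the tuning parameters): persist the vector and the tuning parameters, and rebuild the matrix on load:

```python
import numpy as np
import scipy.sparse as sp


def serialize(reg):
    # persist only the weights vector and the tuning parameters
    return {"weights": np.asarray(reg.weights).tolist(), "gamma": reg.gamma}


def deserialize(data, reg):
    reg.weights = np.asarray(data["weights"])
    reg.gamma = data["gamma"]
    reg.weights_matrix = sp.diags(reg.weights)  # recreated, never persisted
    return reg
```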
Just a minor note: `objfct(m).size` should error, because it is a float, not an array.
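For example (assuming `objfct` returns a built-in `float`; a NumPy scalar such as `np.float64` would instead report `size == 1`):

```python
phi = 3.14  # stand-in for objfct(m)
try:
    phi.size
except AttributeError as err:
    print(err)  # 'float' object has no attribute 'size'
```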
We can make a decision at tomorrow's meeting around naming.
Weights: is there any time that there are off-diagonal entries that would need to be serialized? I will have to look into the regularizations in SimPEG a bit more to get my head around the use cases. Thanks for the links!
- `nP` --> `size`?
- `deriv`/`deriv2` --> `gradient`? `hessian`? Not sure on this.