I agree that ⟨u,v⟩, ⟨u|v⟩, or (u,v) are more common notations for inner products on arbitrary Hilbert spaces—they are what I typically use myself—but those notations are a nonstarter for Julia.
It would actually be possible to make ⟨u,v⟩ work.
@StefanKarpinski: It would actually be possible to make ⟨u,v⟩ work.
Absolutely, and supporting this precise notation was suggested in #8934, but it never went anywhere. (Note also that angle brackets have other common uses, e.g. ⟨u⟩ often denotes an average of some kind.) It is non-breaking and could still be added at some point, but it doesn't seem reasonable to expect in the near term. It's also quite slow to type \langle<tab> x, y \rangle<tab>, so it's not very convenient from a programming standpoint for an elementary operation.
and we can't overload <> for it, right?
No
Can't say I've read every comment on this humongous thread, but I just want to highlight a few points, some of which have been made before:

- `dot` being conjugated by default is the best thing ever. No decision point here, I just wanted to say how glad I am that I no longer have to do these `conj(dot())`s!
- `norm`: if you're coding up an optimization algorithm and want to stop whenever `norm(delta x) < eps`, you're going to write `norm`. But then you want to optimize wrt an image or something, you run your code, and it suddenly launches into an unkillable (because BLAS) SVD of a big array (see the sketch below). This is not academic, it has caused trouble in Optim.jl, and doubtless in other packages as well. Nobody is going to know that `vecnorm` exists unless they have a specific reason to look for it.
- Merging `dot` and `vecdot`, and `norm` and `vecnorm`, is good, even if it removes a bit of flexibility in array-of-arrays cases. For norms, I'd add that often when working with things on which there are multiple norms defined (e.g. matrices), what the user wants is to call `norm` to get a norm, without particularly caring which one. Induced norms are most often of theoretical rather than practical interest due to their computational intractability. They are also specific to the 2D-array-as-operator interpretation rather than the 2D-array-as-storage one (an image is a 2D array, but it's not an operator in any useful sense). It's good to have the possibility of computing them, but they don't deserve to be the default `norm`. Reasonable, simple, and well-documented defaults that have discoverable alternatives are better than attempted cleverness (if the user wants to do a clever thing, let them do it explicitly).

Therefore, +1 on @stevengj's:
yes, I think if we make this change then we only need dot and norm (which are now the recursive Euclidean operations on arrays or arrays-of-arrays of any dimensionality, though for norm we also define norm(x,p) to be the p-norm) and opnorm, and no longer have vecdot or vecnorm.
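A minimal sketch of the distinction at stake (assuming the post-change names, i.e. Julia ≥ 0.7 with `LinearAlgebra` loaded, where `norm` on a matrix is the Frobenius norm and `opnorm` is the induced norm):

```julia
using LinearAlgebra

A = rand(1000, 1000)

# Frobenius norm: a cheap elementwise reduction, fine for a stopping criterion.
norm(A)        # same value as sqrt(sum(abs2, A))

# Induced (operator) 2-norm: needs the largest singular value, i.e. an SVD-type
# computation that is far more expensive and uninterruptible while inside LAPACK.
opnorm(A)      # same value as first(svdvals(A))

# Typical optimization stopping test where the cheap norm is what you want:
Δx = A .- A'   # stand-in for an update step
converged = norm(Δx) < 1e-8
```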
A more "julian" alternative to norm/opnorm might be to have an Operator type, which could wrap a 2D array, on which norm
does opnorm. This can be done at the level of packages (several of which already exist)
I'd much rather type `opnorm(matrix)` than `norm(Operator(matrix))`…
I'm going to chime in from the peanut gallery here and say that I like where this is going—`vecnorm` and `vecdot` have always bothered me. Needing to explicitly ask for the operator norm—which has always seemed fairly specialized to me—seems much saner than having to ask for a norm that's much faster and easier to compute (e.g. the Frobenius norm). Writing `opnorm` seems like a fine interface for asking for the relatively specialized operator norm.
I also feel that having a subtle distinction between `dot` and `inner` is likely to lead to confusion and rampant misuse. Lecturing users about which function they're supposed to use when both functions do what they want and one of them is easier tends not to work out very well. My impression is that it's relatively rare in generic code that `sum(x*y for (x,y) in zip(u,v))` is actually what you want when a true inner product ⟨u,v⟩ actually exists. When that's really what's wanted, it's fairly easy, clear and efficient (because Julia is what it is) to just write something like that to compute it.
Whether to call the `u⋅v` function `dot` or `inner` seems like the least consequential part of all of this. I'm fairly sure that neither choice would be looked back upon by historians as a disaster—although the notion that historians would care at all certainly is flattering. On the one hand, if we agree to keep the "true inner product" meaning of `u⋅v` then yes, `inner` is the more correct mathematical term. On the other hand, when there is a syntax with a corresponding function name it tends to confuse users less when the name matches the syntax. Since the syntax here uses a dot, that rule of thumb supports spelling this operation as `dot`. Perhaps this might be a reasonable case to define `const dot = inner` and export both? Then people can use or extend whichever name they prefer since they're the same thing. If someone wants to use either name for something else they can, and the other name will remain available with its default meaning. Of course that would make three exported names for the same function—`dot`, `inner` and `⋅`—which seems a bit excessive.
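A hedged sketch of the aliasing idea, with a placeholder definition; none of this is actual Base code:

```julia
# Placeholder "true inner product" used only to have something to alias.
inner(x, y) = sum(conj(xi) * yi for (xi, yi) in zip(x, y))

const dot = inner   # both names refer to the same function object
const ⋅ = inner     # the infix operator stays tied to the same meaning

dot === inner       # true
⋅ === inner         # true
```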
Is it an option to remove the `⋅` symbol or replace it with `<u,v>`?
Comments:

`<u,v> * M * x` vs. `u ⋅ v * M * x`

The `<u,v>` syntax implies association: first we operate on `u` and `v`, and then the rest of the expression follows.

If a user made the effort to type `<u,v>`, it is very unlikely that he had in mind a simple `sum(x[i]*y[i])`. The symbol `⋅` is easy to skip with the eyes, and has many other connotations. In particular, in linear algebra, for a vector space V over a field F, the product of a scalar α ∈ F with a vector v ∈ V is denoted α ⋅ v in various textbooks.

Removing or replacing the `⋅` would also eliminate the issue of multiple names being exported. One would only have to export `inner` and `<,>` for general inner products, with the default implementation for arrays matching the iterable summation semantics.

If one needs to define a scalar-by-vector product as described above for a vector space V over a field F, he/she would be able to define the `⋅` notation for it. The vector space would then be fully defined with nice short syntax, and could be extended to a Hilbert space by further defining `<u,v>` (a sketch follows below).
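A hedged sketch of that last idea, using a made-up type (`FreeVector`) whose methods are purely illustrative; since bracket syntax is not parseable today, the inner product is written as a plain function call:

```julia
# If ⋅ were freed from meaning `dot`, a user-defined vector space could overload
# it as the scalar-by-vector product, and define `inner` separately.
struct FreeVector
    data::Vector{Float64}
end

⋅(α::Real, v::FreeVector) = FreeVector(α .* v.data)           # scalar product α ⋅ v
inner(u::FreeVector, v::FreeVector) = sum(u.data .* v.data)    # inner product ⟨u,v⟩

u = FreeVector([1.0, 2.0]); v = FreeVector([3.0, 4.0])
inner(2.0 ⋅ u, v)    # 2 * (1*3 + 2*4) == 22.0
```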
We definitely cannot use the syntax `<u,v>`; the syntax we could use is `⟨u,v⟩`—note the Unicode brackets, not the less-than and greater-than signs, `<` and `>`. We also have `u'v` as a syntax for something that is either a dot product or an inner product? (I'm not sure which...)
Yes, sorry, the Unicode version of it. It would be very clear to read. It would also solve this issue with multiple names, and free `⋅` for other purposes.
I don't think we want to use `⋅` for any other purpose—that seems like it would be confusing.
Just imagining how wonderful it would be to be able to write code that looks like `⟨α ⋅ u, v⟩ + ⟨β ⋅ w, z⟩` for abstract vectors (or types) u, v, w, z ∈ V and scalars α, β ∈ F.
`u'v` is an inner product (and a dot product, if you follow the conjugated convention) only for 1d arrays, not for e.g. matrices. (This is another reason why it is pointless to limit infix dot to 1d arrays, since we already have a terse notation for that case.)
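A quick illustration of this point (standard behavior for ordinary arrays):

```julia
u = [1.0 + 2.0im, 3.0im]
v = [2.0 + 0.0im, 1.0 - 1.0im]
u'v              # a scalar: the conjugated inner product, but only because u and v are vectors

A = rand(2, 2); B = rand(2, 2)
A'B              # a 2×2 matrix, not a scalar, so u'v is not a general inner-product syntax
```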
Stefan, “correct mathematical term” is a category error—mathematical correctness is not a concept that applies to terminology/notation. (Substitute “conventional” for “correct.” But then the concern becomes less urgent.)
More use cases: https://stackoverflow.com/questions/50408177/julia-calculate-an-inner-product-using-boolean-algebra
And a formal derivation of boolean inner products using the ⟨,⟩ notation: https://arxiv.org/abs/0902.1290
EDIT: fixed link to paper
What do you think of the angle brackets syntax proposal? Would it solve the issues raised here?
So what's your proposal exactly? Is it roughly this:

- `dot` to `inner`
- `u⋅v` to `⟨u,v⟩`

So then there would be no `dot` function and no `⋅` operator?
Would that change be something reasonable?
Sorry for the delay in replying, I am at a conference with limited access to the internet.
And just for clarity and completeness, what is the counterproposal here? Do nothing?
To clarify the proposal even further, there is a semantic change involved: generalized inner products.
Heads up: we've now debated this to the point where there's a real risk it won't make it into 0.7-alpha. That doesn't mean it can't be changed after the alpha, but there will be a lot more reluctance to change things after that.
Yes, I wish I had the skills to have submitted a PR a long time ago. It is beyond my abilities to make it happen, even though I find it to be an extremely important feature.
Even discounting the operator syntax question, there's still some complexity with name shadowing and multi-stage deprecations for each set of semantic concepts (current `dot` and `vecdot`, and current `norm` and `vecnorm`).
For the `dot` side it seems like the whole space of options (again discounting operators) is:

I. Silently break `dot` on vectors of arrays by changing the behavior in 0.7 to an inner product without a standard depwarn (although you can warn that the behavior is being changed). Deprecate `vecdot` to `dot` in 0.7 as well.
II. In 0.7, deprecate `vecdot` on all inputs to `inner`.
III. In 0.7, deprecate `dot` on vectors of arrays to its definition, and `dot` on other inputs and `vecdot` on all inputs to `inner`.
IV. In 0.7, deprecate both `dot` and `vecdot` on vectors of arrays either to unexported functions or to their definitions, and `vecdot` on all other inputs to `dot`. In 1.0, add `dot` on vectors of arrays with inner product semantics.
For the norm side there's some consensus around a single path (in 0.7, deprecate `norm` on matrices to `opnorm` and possibly deprecate `vecnorm` to `innernorm`; in 1.0, add `norm` on matrices with current `vecnorm` semantics), but that also results in an extra name in 1.0 (either `vecnorm` or `innernorm`); again, a way to avoid that could be to deprecate `vecnorm` in 0.7 to its definition or to an unexported function like `Base.vecnorm` rather than to an exported name.
...I think. Hope I didn't make things mushier than they already were.
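For concreteness, a hedged sketch of what the deprecation stubs for one of these paths might look like inside the stdlib; this is illustrative only, and the actual PRs may choose different signatures:

```julia
# 0.7: forward the old "vector" names to the unified ones, with a depwarn.
Base.@deprecate vecdot(x, y) dot(x, y)
Base.@deprecate vecnorm(x) norm(x)
Base.@deprecate vecnorm(x, p) norm(x, p)

# 0.7: move the matrix methods of norm to the new operator-norm name, so that
# norm can be re-added for matrices in 1.0 with Frobenius (vecnorm) semantics.
Base.@deprecate norm(A::AbstractMatrix) opnorm(A)
Base.@deprecate norm(A::AbstractMatrix, p) opnorm(A, p)
```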
Can anyone familiar with the codebase submit a PR for the change?
Can we split off the norm stuff that everyone seems to agree on and get that done at least? The `dot` versus `inner` bit is quite a bit more controversial, but let's not let that stymie the part that's not.
@StefanKarpinski, note that they are somewhat coupled: for types where you have both a dot (inner) product and a norm, they should be consistent.
Ok, I don't really care which way this goes. Whoever does the work gets to decide.
I had a PR (#25093) to make `vecdot` behave as a true inner product (and `vecnorm` as the corresponding norm), by making them recursive. This might be useful as a starting point for how the future `dot` and `norm` should look. Unfortunately, my lack of git skills made me screw up that PR so I closed it, planning to return to it after the new iteration syntax was completed. However, having just become a father for the second time a few days ago means that there are currently no "free time" slots in my calendar.
having just become a father for the second time a few days ago
Congratulations Jutho! 🎉
Yes, congrats!
It seems like there might be some consensus forming around the idea of having both `dot` and `inner`, where:

- `inner` is a true recursive inner product
- `dot` is `dot(x,y) = sum(x[i]'*y[i] for i = 1:length(x))`, conjugated or not, and the two would therefore overlap for `Vector{<:Number}` or `Vector{<:Real}` (see the sketch below)
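A hedged sketch of how those two candidate semantics would differ; the names `inner_recursive` and `dot_pairwise` are made up for illustration and are not proposed API:

```julia
inner_recursive(x::Number, y::Number) = conj(x) * y
inner_recursive(x, y) = sum(inner_recursive(xi, yi) for (xi, yi) in zip(x, y))

dot_pairwise(x, y) = sum(x[i]' * y[i] for i in eachindex(x))

u = [1.0, 2.0]; v = [3.0, 4.0]
inner_recursive(u, v) == dot_pairwise(u, v) == 11.0   # they agree on Vector{<:Number}

A = [rand(2, 2) for _ in 1:3]; B = [rand(2, 2) for _ in 1:3]
inner_recursive(A, B)   # a scalar: sums the recursive inner products of the blocks
dot_pairwise(A, B)      # a 2×2 matrix: sums A[i]' * B[i], so not a true inner product
```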
Regarding:
Many people would write dot when they meant inner, and it would work fine for them in most cases, but then their code would do something unexpected if it is passed an array of matrices.
I don't believe that would be an issue. Since this is a fairly uncommon operation, I would expect people to at the very least try it to see what it does and/or go look at the documentation.
I think that the above would be a great change, and not very disruptive since the semantics of `dot` are not changed in most cases.
It seems like there might be some consensus forming around the idea of having both `dot` and `inner`
To the contrary, the discussion from https://github.com/JuliaLang/julia/issues/25565#issuecomment-390069503 on appears to favor having one or the other but not both, as e.g. laid out in https://github.com/JuliaLang/julia/issues/25565#issuecomment-390388230 and well supported with reactions.
Maybe `inner` (and also `dot`) should be recursive inner/dot/scalar products and the old behaviour could be implemented in functions such as `dotc(x,y) = sum(x[i]' * y[i] for i in eachindex(x))` and `dotu(x,y) = sum(transpose(x[i]) * y[i] for i in eachindex(x))`? The names `dotu` and `dotc` would match the corresponding BLAS names.
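A small usage sketch of that suggestion; `dotc` and `dotu` are not existing exported functions, the definitions are just the ones from the comment above:

```julia
dotc(x, y) = sum(x[i]' * y[i] for i in eachindex(x))            # conjugating, like BLAS zdotc
dotu(x, y) = sum(transpose(x[i]) * y[i] for i in eachindex(x))  # non-conjugating, like BLAS zdotu

x = [1 + 2im, 3im]
y = [2 - 1im, 1 + 1im]
dotc(x, y)   # (1-2im)*(2-1im) + (-3im)*(1+1im) == 3 - 8im
dotu(x, y)   # (1+2im)*(2-1im) + ( 3im)*(1+1im) == 1 + 6im
```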
(I agree that ⟨u,v⟩, ⟨u|v⟩, or (u,v) are more common notations for inner products on arbitrary Hilbert spaces—they are what I typically use myself—but those notations are a nonstarter for Julia. There was some discussion of parsing Unicode brackets as function/macro calls, e.g. #8934 and #8892, but it never went anywhere and this seems unlikely to change soon.)
@stevengj, when you added this paragraph to a previous comment by yourself, you meant that the syntax `⟨u,v⟩` is hard to implement in the language?
Any chance this feature will make it to Julia v1.0? I have so many ideas and packages that depend on the notion of general inner products. Please let me know if I should lower my expectations. Sorry for the constant reminder.
Have you not seen #27401?
Thank you @jebej and thank you @ranocha for taking the lead :heart:
when you added this paragraph to a previous comment by yourself, you meant that the syntax ⟨u,v⟩ is hard to implement in the language?
Not technically hard to add to the parser, but it has proven difficult to come up with a consensus on how (and whether) to represent custom brackets in the language. See the discussion at #8934, which went nowhere 4 years ago and has not been revived since. (Add that to the fact that in different fields people use the same brackets for many different things, e.g. ⟨u⟩ is used for ensemble averages in statistical physics.) Another issue, raised in #8892, is the visual similarity of many different Unicode brackets.
Thank you @stevengj, I appreciate the clarifications. I am already very excited that we are gonna have general inner products standardized across packages. :100: Maybe the angle bracket notation could shine in another release cycle in the future. Not as important, but quite convenient to be able to write code that is literally like the math in our publications.
If `⟨args...⟩` is a valid syntax for calling the `anglebrackets` operator or something (what to call the function this syntax calls is actually kind of tricky since we don't have any precedent), then people could opt into whatever meaning they want for the syntax.
@StefanKarpinski, the argument in #8934 was that it should be a macro. I don't think we ever came to a consensus.
(If we decide in Base that `anglebrackets(a,b)` means `inner(a,b)`, that will discourage people from "opting into whatever meaning they want" because the decision will have already been made. It's not a terrible choice, of course, but it may be unnecessary to assign this a meaning in Base as long as they are parsed.)
I don't recall the details of that discussion, but making it a macro seems like obviously a bad idea to me.
With #27401 I think we can consider inner products to have been taken seriously.
Traditionally, an issue is closed only when the relevant PR is merged...
Sure, we can leave it open I guess. Just wanted to get it off the triage label.
Should this be closed since #27401 is merged now?
If a user needs to define custom inner products for general Hilbert spaces, what is the current state in Base for this type of generalization? https://github.com/JuliaLang/julia/issues/16573 is a related, but less general issue. My concern is with new types that aren't arrays.
I'd like to propose renaming `dot` to `inner`, or perhaps directing users to define `inner(x,y)` as a general inner product between objects `x` and `y`, including the array case. In case the change is reasonable, could it be part of Julia v1.0?
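A hedged sketch of what this looks like in practice after #27401, where `dot` (rather than `inner`) is the generic inner-product function that custom types extend in Julia ≥ 1.0; the `GridFunction` type below is a made-up illustration:

```julia
using LinearAlgebra

# Function values sampled on a fixed uniform grid on [0, 1]; purely illustrative.
struct GridFunction
    vals::Vector{Float64}
end

# A discrete L²-style inner product and the induced norm for the custom type.
LinearAlgebra.dot(f::GridFunction, g::GridFunction) = sum(f.vals .* g.vals) / length(f.vals)
LinearAlgebra.norm(f::GridFunction) = sqrt(dot(f, f))

f = GridFunction(sin.(range(0, 1; length = 100)))
g = GridFunction(cos.(range(0, 1; length = 100)))
dot(f, g), norm(f)
```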