JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.69k stars 5.48k forks source link

State of inner products in Base #25565

Closed juliohm closed 6 years ago

juliohm commented 6 years ago

If a user needs to define custom inner products for general Hilbert spaces, what is the current state in Base for this type of generalization? https://github.com/JuliaLang/julia/issues/16573 is a related, but less general issue. My concern is with new types that aren't arrays.

I'd like to propose renaming dot to inner, or perhaps directing users to define inner(x,y) as a general inner product between objects x and y, including the array case:

inner(x::AbstractVector, y::AbstractVector) = dot(x, y)

In case the change is reasonable, could it be part of Julia v1.0?

StefanKarpinski commented 6 years ago

I agree that ⟨u,v⟩, ⟨u|v⟩, or (u,v) are more common notations for inner products on arbitrary Hilbert spaces—they are what I typically use myself—but those notations are a nonstarter for Julia.

It would actually be possible to make ⟨u,v⟩ work.

stevengj commented 6 years ago

@StefanKarpinski: It would actually be possible to make ⟨u,v⟩ work.

Absolutely, and supporting this precise notation was suggested in #8934, but it never went anywhere. (Note also that angle brackets have other common uses, e.g. ⟨u⟩ often denotes an average of some kind.) It is non-breaking and could still be added at some point, but it doesn't seem reasonable to expect in the near term. It's also quite slow to type \langle<tab> x, y \rangle<tab>, so it's not very convenient from a programming standpoint for an elementary operation.

oscardssmith commented 6 years ago

and we can't overload <> for it, right?

ararslan commented 6 years ago

No

antoine-levitt commented 6 years ago

Can't say I've read every comment on this humongous thread, but I just want to highlight a few points, some of which have been made before:

Therefore, +1 on @stevengj 's

yes, I think if we make this change then we only need dot and norm (which are now the recursive Euclidean operations on arrays or arrays-of-arrays of any dimensionality, though for norm we also define norm(x,p) to be the p-norm) and opnorm, and no longer have vecdot or vecnorm.

A more "julian" alternative to norm/opnorm might be to have an Operator type, which could wrap a 2D array, on which norm does opnorm. This can be done at the level of packages (several of which already exist)

stevengj commented 6 years ago

I'd much rather type opnorm(matrix) than norm(Operator(matrix))

StefanKarpinski commented 6 years ago

I'm going to chime in from the peanut gallery here and say that I like where this is going—vecnorm and vecdot have always bothered me. Needing to explicitly ask for the operator norm—which has always seemed fairly specialized to me—seems much saner than having to ask for a norm that's much faster and easier to compute (e.g. the Frobenius norm). Writing opnorm seems like a fine interface for asking for the relatively specialized operator norm.

I also feel that having a subtle distinction between dot and inner is likely to lead to confusion and rampant misuse. Lecturing users about which function they're supposed to use when both functions do what they want and one of them is easier tends not to work out very well. My impression is that it's relatively rare in generic code that sum(x*y for (x,y) in zip(u,v)) is actually what you want when a true inner product ⟨u,v⟩ actually exists. When that's really what's wanted, it's fairly easy, clear and efficient (because Julia is what it is) to just write something like that to compute it.

Whether to call the u⋅v function dot or inner seems like the least consequential part of all of this. I'm fairly sure that neither choice would be looked back upon by historians as a disaster—although the notion that historians would care at all certainly is flattering. On the one hand, if we agree to keep the "true inner product" meaning of u⋅v then yes, inner is the more correct mathematical term. On the other hand, when there is a syntax with a corresponding function name it tends to confuse users less when the name matches the syntax. Since the syntax here uses a dot, that rule of thumb supports spellings this operation as dot. Perhaps this might be a reasonable case to define const dot = inner and export both? Then people can use or extend whichever name they prefer since they're the same thing. If someone wants to use either name for something else they can, and the other name will remain available with its default meaning. Of course that would make three exported names for the same function—dot, inner and —which seems a bit excessive.

juliohm commented 6 years ago

Is it an option to remove the symbol or replace it with <u,v>?

Comments:

<u,v> * M * x

vs.

u ⋅ v * M * x
StefanKarpinski commented 6 years ago

We definitely cannot use the syntax <u,v>; the syntax we could use is ⟨u,v⟩—note the Unicode brackets, not the less than and greater than signs, < and >. We also have u'v as a syntax for something that is either a dot product or an inner product? (I'm not sure which...)

juliohm commented 6 years ago

Yes, sorry, the unicode version of it. It would be very clear to read. It would also solve this issue with multiple names, and free for other purposes.

StefanKarpinski commented 6 years ago

I don't think we want to use for any other purpose—that seems like it would be confusing.

juliohm commented 6 years ago

Just imagining how wonderful would it be to be able to write code that looks like:

⟨α ⋅ u, v⟩ + ⟨β ⋅ w, z⟩

for abstract vectors (or types) u,v,w,z ∈ V and scalars α, β ∈ F.

stevengj commented 6 years ago

u'v is an inner product (and a dot product, if you follow the conjugated convention) only for 1d arrays, not for e.g. matrices. (This is another reason why it is pointless to limit infix dot to 1d arrays, since we already have a terse notation for that case.)

stevengj commented 6 years ago

Stefan, “correct mathematical term” is a category error—mathematical correctness is not a concept that applies to terminology/notation. (Substitute “conventional” for “correct.” But then the concern becomes less urgent,)

juliohm commented 6 years ago

More use cases: https://stackoverflow.com/questions/50408177/julia-calculate-an-inner-product-using-boolean-algebra

And a formal derivation of boolean inner products using the ⟨,⟩ notation: https://arxiv.org/abs/0902.1290

EDIT: fixed link to paper

juliohm commented 6 years ago

What do you think of the angle brackets syntax proposal? Would it solve the issues raised here?

StefanKarpinski commented 6 years ago

So what's your proposal exactly? Is it roughly this:

  1. Deprecate dot to inner
  2. Deprecate u⋅v to ⟨u,v⟩

So then there would be no dot function and no operator?

juliohm commented 6 years ago

Would that change be something reasonable?

Sorry for the delay to reply, I am at a conference with limited access to internet.

StefanKarpinski commented 6 years ago

And just for clarity and completeness, what is the counterproposal here? Do nothing?

juliohm commented 6 years ago

To clarify the proposal even further, there is semantic change involved: generalized inner products.

StefanKarpinski commented 6 years ago

Heads up: we've now debated this to the point where there's a real risk it won't make it into 0.7-alpha. That doesn't mean it can't be changed after the alpha, but there will be a lot more reluctance to change things after that.

juliohm commented 6 years ago

Yes, I wish I had the skills to have submitted a PR a long time ago. It is beyond my abilities to make it happen, even though I find it to be extremely important feature.

jekbradbury commented 6 years ago

Even discounting the operator syntax question, there's still some complexity with name shadowing and multi-stage deprecations for each set of semantic concepts (current dot and vecdot and current norm and vecnorm).

For the dot side it seems like the whole space of options (again discounting operators) is:

I. Silently break dot on vectors of arrays by changing the behavior in 0.7 to an inner product without a standard depwarn (although you can warn that the behavior is being changed). Deprecate vecdot to dot in 0.7 as well. II. In 0.7, deprecate vecdot on all inputs to inner. III. In 0.7, deprecate dot on vectors of arrays to its definition, and dot on other inputs and vecdot on all inputs to inner. IV. In 0.7, deprecate both dot and vecdot on vectors of arrays either to unexported functions or to their definitions, and vecdot on all other inputs to dot. In 1.0, add dot on vectors of arrays with inner product semantics.

For the norm side there's some consensus around a single path (in 0.7, deprecate norm on matrices to opnorm and possibly deprecate vecnorm to innernorm; in 1.0, add norm on matrices with current vecnorm semantics), but that also results in an extra name in 1.0 (either vecnorm or innernorm); again a way to avoid that could be to deprecate vecnorm in 0.7 to its definition or to an unexported function like Base.vecnorm rather than to an exported name.

...I think. Hope I didn't make things mushier than they already were.

juliohm commented 6 years ago

Can anyone familiar with the codebase submit a PR for the change?

StefanKarpinski commented 6 years ago

Can we split off the norm stuff that everyone seems to agree on and get that done at least? The dot versus inner bit is quite a bit more controversial, but let's not let that stymie the part that's not.

stevengj commented 6 years ago

@StefanKarpinski, note that they are somewhat coupled: for types where you have both a dot (inner) product and a norm, they should be consistent.

StefanKarpinski commented 6 years ago

Ok, I don't really care which way this goes. Whoever does the work gets to decide.

Jutho commented 6 years ago

I had a PR ( #25093 ) to make vecdot behave as true inner product (and vecnorm as the corresponding norm), by making them recursive. This might be useful as a starting point of how the future dot and norm should look like. Unfortunately, my lack of git skills made me screw up that PR so I closed it, planning to return to it after the new iteration syntax was completed.

However, having just become father for the second time a few days ago means that there are currently no "free time" slots in my calendar.

Sacha0 commented 6 years ago

having just become father for the second time a few days ago

Congratulations Jutho! 🎉

andyferris commented 6 years ago

Yes, congrats!

jebej commented 6 years ago

It seems like there might be some consensus forming around the idea of having both dot and inner, where:

  1. inner is a true recursive inner product
  2. dot = dot(x,y) = sum(x[i]'*y[i] for i = 1:length(x)) conjugated or not, and would therefore overlap with dot for Vector{<:Number} or Vector{<:Real}

Regarding:

Many people would write dot when they meant inner, and it would work fine for them in most cases, but then their code would do something unexpected if it is passed an array of matrices.

I don't believe that would be an issue. Since this is a fairly uncommon operation, I would expect people to at the very least try it to see what it does and/or go look at the documentation.

I think that the above would be a great change, and not very disruptive since the semantics of dot are not changed in most cases.

Sacha0 commented 6 years ago

It seems like there might be some consensus forming around the idea of having both dot and inner

To the contrary, the discussion from https://github.com/JuliaLang/julia/issues/25565#issuecomment-390069503 on appears to favor having one or the other but not both, as e.g. laid out in https://github.com/JuliaLang/julia/issues/25565#issuecomment-390388230 and well supported with reactions.

ranocha commented 6 years ago

Maybe inner (and also dot) should be recursive inner/dot/scalar products and the old behaviour could be implemented in functions such as dotc(x,y) = sum(x[i]' * y[i] for i in eachindex(x)) and dotu(x,y) = sum(transpose(x[i]) * y[i] for i in eachindex(x))? The names dotu and dotc would match corresponding BLAS names.

juliohm commented 6 years ago

(I agree that ⟨u,v⟩, ⟨u|v⟩, or (u,v) are more common notations for inner products on arbitrary Hilbert spaces—they are what I typically use myself—but those notations are a nonstarter for Julia. There was some discussion of parsing Unicode brackets as function/macro calls, e.g. #8934 and #8892, but it never went anywhere and this seems unlikely to change soon.)

@stevengj, when you added this paragraph to a previous comment by yourself, you meant that the syntax ⟨u,v⟩ is hard to implement in the language?

juliohm commented 6 years ago

Any chance this feature will make it to Julia v1.0? I have so many ideas and packages that depend on the notion of general inner products. Please let me know if I should lower my expectations. Sorry for the constant reminder.

jebej commented 6 years ago

Have you not seen #27401?

juliohm commented 6 years ago

Thank you @jebej and thank you @ranocha for taking the lead :heart:

stevengj commented 6 years ago

when you added this paragraph to a previous comment by yourself, you meant that the syntax ⟨u,v⟩ is hard to implement in the language?

Not technically hard to add to the parser, but it has proven difficult to come up with a consensus on how (and whether) to represent custom brackets in the language. See the discussion at #8934, which went nowhere 4 years ago and has not been revived since. (Add that to the fact that in different fields people use the same brackets for many different things, e.g. ⟨u⟩ is used for ensemble averages in statistical physics.) Another issue, raised in #8892, is the visual similarity of many different Unicode brackets.

juliohm commented 6 years ago

Thank you @stevengj, I appreciate the clarifications. I am already very excited that we are gonna have general inner products standardized across packages. :100: Maybe the angle bracket notation could shine in another release cycle in the future. Not as important, but quite convenient to be able to write code that is literally like the math in our publications.

StefanKarpinski commented 6 years ago

If ⟨args...⟩ is a valid syntax for calling the anglebrackets operator or something (what to call the function this syntax calls is actually kind of tricky since we don't have any precedent), then people could opt into whatever meaning they want for the syntax.

stevengj commented 6 years ago

@StefanKarpinski, the argument in #8934 was that it should be a macro. I don't think we ever came to a consensus.

(If we decide in Base that anglebrackets(a,b) means inner(a,b), that will discourage people from "opting into whatever meaning they want" because the decision will have already been made. It's not a terrible choice, of course, but it may be unnecessary to assign this a meaning in Base as long as they are parsed.)

StefanKarpinski commented 6 years ago

I don't recall the details of that discussion but making a macro seems like obviously a bad idea to me.

StefanKarpinski commented 6 years ago

With #27401 I think we can consider inner products to have been taking seriously.

stevengj commented 6 years ago

Traditionally, an issue is closed only when the relevant PR is merged...

StefanKarpinski commented 6 years ago

Sure, we can leave it open I guess. Just wanted to get it off the triage label.

ranocha commented 6 years ago

Should this be closed since #27401 is merged now?