
State of inner products in Base #25565

Closed juliohm closed 6 years ago

juliohm commented 6 years ago

If a user needs to define custom inner products for general Hilbert spaces, what is the current state in Base for this type of generalization? https://github.com/JuliaLang/julia/issues/16573 is a related, but less general issue. My concern is with new types that aren't arrays.

I'd like to propose renaming dot to inner, or perhaps directing users to define inner(x,y) as a general inner product between objects x and y, including the array case:

inner(x::AbstractVector, y::AbstractVector) = dot(x, y)
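For illustration, a non-array type could then extend inner directly. A minimal sketch (WeightedVec and its weight field are purely hypothetical, just to show the shape of the interface):

struct WeightedVec{T}
    v::Vector{T}
    w::Vector{Float64}   # positive weights defining the inner product
end

inner(x::WeightedVec, y::WeightedVec) = sum(x.w .* conj.(x.v) .* y.v)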

In case the change is reasonable, could it be part of Julia v1.0?

andyferris commented 6 years ago

Sure, sorry, I was perhaps a bit too imprecise with my language. There are obviously many valid norms for a vector space, including various operator norms.

I guess what I'm getting at is that maybe I'd prefer the choice of which norm to be more explicit than implicit? And that if you use the same function (without e.g. additional keyword arguments) you get the "same" norm, in which case the Euclidean seems like a somewhat defensible choice for AbstractArray.

andyferris commented 6 years ago

This is also a useful distinction between norm and innernorm. If you define norm, I would say that it implies only that you have a Banach space (or at least a normed vector space). If you define innernorm, it implies that you have a Hilbert space (or at least an inner product space) and that this norm is consistent with inner.

This does seem reasonable, but I'd still wonder why an object that has an innernorm would need a different norm. I would alternatively propose that the interface for a Banach space requires norm, while the interface for inner-product spaces would provide both norm and inner. These functions can then be used in generic code that expects objects of Banach or inner-product spaces as appropriate (EDIT: with the thought that code that works on Banach spaces will automagically also work on inner-product spaces).
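As a sketch of how generic code could lean on those two contracts (nothing here exists in Base yet; the names follow the proposal above):

normalize_unit(x) = x / norm(x)                        # needs only a normed (Banach) space
project_onto(x, u) = (inner(u, x) / inner(u, u)) * u   # needs an inner-product (Hilbert) space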

stevengj commented 6 years ago

I think you're proposing that norm(x) always refer to some kind of element-wise Euclidean norm (i.e. a Frobenius norm for matrices), i.e. basically what vecnorm is now modulo the recursive case. In this case we might as well redefine dot(x,y) to be the corresponding inner product (inner works too, but dot has the advantage of an infix variant x ⋅ y).

I'm fine with this in principle, but this would be a breaking change and it might be a little late before 0.7 to get that in…

o314 commented 6 years ago

Is L2 a good default in high dimensions too? This article talks about distance, but maybe it applies to norms as well: https://stats.stackexchange.com/questions/99171/why-is-euclidean-distance-not-a-good-metric-in-high-dimensions

juliohm commented 6 years ago

In this case we might as well redefine dot(x,y) to be the corresponding inner product (inner works too, but dot has the advantage of an infix variant x ⋅ y)

Can we get rid of dot entirely? The infix notation should be unrelated to the existence of a function called dot. Just define the infix with the inner method for Julia arrays. Is that possible?
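What is being asked for is essentially the following (a sketch; inner does not exist yet):

const ⋅ = inner   # the infix notation is just a binding to the general inner product
# x ⋅ y then parses as inner(x, y) for anything that defines inner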

juliohm commented 6 years ago

That is really what it is, the dot product: a convenient notation x ⋅ y for inner products between x and y vectors in R^n with Euclidean geometry.

andyferris commented 6 years ago

@stevengj I think that's a good summary, yes.

@o314 Is L2 a good default in high dimensionality? Possibly not, but I'd really hate it if e.g. the norm chosen by norm(v::AbstractVector) depended on length(v) :) I'd equally not like it to second guess whether my matrix or higher-dimensional array is "too big for L2" - I'd suggest that perhaps this should be explicitly marked by the user?

@juliohm That's definitely possible, though as mentioned, these are breaking changes we're suggesting. (Again, modulo what to do in the recursive case and earlier discussions on the possible differences between inner and dot).

Jutho commented 6 years ago

@stevengj, my interpretation of what @andyferris was implying is that, because of duck typing, it is hard to decide whether a user wants to interpret an object as a vector (and use a corresponding vector p-norm) or as an operator (and compute an induced p-norm). So I think there is no choice but to specify explicitly what behaviour is wanted. The current approach is a bit odd in the sense that norm tries to guess implicitly whether to choose vector norm or induced norm based on the input, and vecnorm is a way of explicitly specifying that you want the vector norm (which is also why I don't find vecnorm such a bad name). A more radical change would be to make norm always default to the vector norm, and specify explicitly when you want the induced norm, using a (keyword) argument or a different function altogether.
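For concreteness, the current split that this paragraph describes (Julia 0.6/0.7 behavior):

A = [1.0 0.0; 0.0 2.0]
norm(A)      # 2.0: induced (operator) 2-norm, i.e. the largest singular value
vecnorm(A)   # 2.236...: Frobenius norm, sqrt(sum(abs2, A))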

On the other hand, I also don't mind the name innernorm, which is explicit in that this is an inner-product-based norm (i.e. always p=2 in the Euclidean case). I find it hard to judge whether, for custom objects, (vec)norm should support an optional argument p as part of the interface, since in some of my use cases only p=2 is easy to compute.

That is really what it is, the dot product: a convenient notation x ⋅ y for inner products between x and y vectors in R^n with Euclidean geometry.

I agree with this, in the sense that I don't recall ever having seen the notation x ⋅ y in the context of general (e.g. complex) vector spaces. I think only the mathematical notation (x,y) or the Dirac notation < x | y > is used in such cases. In electromagnetism one often uses E ⋅ B for vectors in 3-dimensional Euclidean space, and even if one uses complex notation (i.e. phasors) this does not imply complex conjugation. If needed, complex conjugation is denoted explicitly in such cases. So I wouldn't mind if dot just became sum(x_i * y_i) without complex or Hermitian conjugation, and inner became the correct inner product for general inner product spaces. Unfortunately, this can probably not be done in a single release cycle.

o314 commented 6 years ago

Is L2 a good default in high dimensionality? Possibly not, but I'd really hate it if e.g. the norm chosen by norm(v::AbstractVector) depended on length(v) :) I'd equally not like it to second guess whether my matrix or higher-dimensional array is "too big for L2" - I'd suggest that perhaps this should be explicitly marked by the user?

I work in the BIM world, where we handle 2D and 3D, but also 4D, 5D, 6D, maybe 7D. We never go further. At any point we know which dimensions we are working in and which algorithm is involved. That is largely enough.

I cannot speak for people who work in ML, information retrieval, etc. There, maybe norminf is a better default. What matters from my point of view is predictability and stability. I would not be shocked at all if people in ML need a different default for their use cases, as long as there is no confusion, e.g. the choice is made explicitly and statically at compile time. It is even better if it remains stable and consistent while algorithms are applied.

Inspired by Base.similar; not fully implemented or tested:

norm2(x)   = sqrt(inner(x, x))           # assumes an inner function as discussed above
norminf(x) = maximum(abs, x)
const NMAX = 10
for N in 1:NMAX                          # L2 default for arrays of up to NMAX dimensions
    @eval norm(a::Array{T,$N}) where {T} = norm2(a)
end
norm(a::Array) = norminf(a)              # higher-dimensional arrays fall back to L-infinity
o314 commented 6 years ago

Can we get rid of dot entirely? The infix notation should be unrelated to the existence of a function called dot. Just define the infix with the inner method for Julia arrays. Is that possible?

norm(x::AbstractVector, p::Real=2) = vecnorm(x, p) # https://github.com/JuliaLang/julia/blob/master/stdlib/LinearAlgebra/src/generic.jl#L498
vecdot(x::Number, y::Number) = conj(x) * y # https://github.com/JuliaLang/julia/blob/master/stdlib/LinearAlgebra/src/generic.jl#L657
dot(x::Number, y::Number) = vecdot(x, y) # https://github.com/JuliaLang/julia/blob/master/stdlib/LinearAlgebra/src/generic.jl#L659
function dot(x::AbstractVector, y::AbstractVector) # https://github.com/JuliaLang/julia/blob/master/stdlib/LinearAlgebra/src/generic.jl#L677

# Call optimized BLAS methods for vectors of numbers
dot(x::AbstractVector{<:Number}, y::AbstractVector{<:Number}) = vecdot(x, y) # https://github.com/JuliaLang/julia/blob/master/stdlib/LinearAlgebra/src/generic.jl#L698

dot / vecdot imply using the conjugate and deciding when to dispatch to BLAS. This has to be handled somewhere, but it should be manageable in a single namespace.

stevengj commented 6 years ago

Is L2 a good default in high dimensionality? Possibly not

L2 is also the most common norm for infinite-dimensional spaces (e.g. functions). I think it is a reasonable default to expect for any vector space.

Obviously you want to have other norms available, too. If we redefine norm(x) to be elementwise L2 wherever possible, then norm(x, p) would be elementwise Lₚ, and we'd need some other function (e.g. opnorm) for the corresponding induced/operator norms.
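Under that proposal, the split would look roughly like this (opnorm is a hypothetical name at this point):

x, A = randn(100), randn(10, 10)
norm(x)       # elementwise L2 (Euclidean) norm
norm(A)       # elementwise L2, i.e. the Frobenius norm
norm(A, 1)    # elementwise L1: sum(abs, A)
opnorm(A)     # induced (operator) 2-norm: largest singular value
opnorm(A, 1)  # induced 1-norm: maximum absolute column sum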

I agree with this, in the sense that I don't recall ever having seen the notation x ⋅ y in the context of general (e.g. complex) vector spaces.

I gave several citations in another thread, IIRC (e.g. BLAS uses dot for complex dot product, and you can find pedagogical sources even using the term for inner products of functions). The very term "inner product" is usually introduced as "a generalization of a dot product". I don't think anyone will be too surprised by the notation of dot for a Euclidean inner product, and it is convenient to have an infix operator.

We could keep dot as-is and introduce inner, of course, but I think that would create a confusing dichotomy — in the most common cases, the functions would be equivalent, but in odd cases (e.g. arrays of matrices) they would differ.

But again, it might be a little late for breaking changes, so we might have to resort to innernorm and inner. In any case, someone would need to create a PR ASAP.

Sacha0 commented 6 years ago

If a reasonable measure of consensus forms, I may be able to devote some bandwidth to exploring implementation on a relevant (short) timescale, potential breaking changes included. I appreciate the drive to clarify these operations' semantics and give them explicit names. Best!

stevengj commented 6 years ago

I see two main options:

If I were designing things from scratch I would prefer 2, but it might be too disruptive to silently change the meaning of norm.

One intermediate option would be to define inner and innernorm (deprecating vecdot and vecnorm), and deprecate norm(matrix) to opnorm. Then, in 1.0, re-introduce norm(matrix) = innernorm(matrix). That way, people can eventually just use inner and norm, and we leave dot as the current odd beast for vectors-of-arrays (coinciding with inner for vectors of numbers).
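A rough sketch of how the intermediate deprecations could be spelled (inner, innernorm, and opnorm are all hypothetical names at this stage):

@deprecate vecdot(x, y)            inner(x, y)
@deprecate vecnorm(x)              innernorm(x)
@deprecate norm(A::AbstractMatrix) opnorm(A)
# vecnorm(x, p) for p != 2 still needs a non-deprecated home; see the discussion below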

stevengj commented 6 years ago

One oddity about innernorm is that you want a way to specify the L1 or Linf "elementwise" norms, but neither of these corresponds to an inner product, so innernorm(x, p) is a bit of a misnomer.

Jutho commented 6 years ago

I like your intermediate option.

As stated above, I like the name innernorm(x) because it implies p=2 and there shouldn't be a second argument. I have objects for which I only know how to compute the inner-product norm. But with the current (vec)norm, it is unclear to me whether the p argument is part of the assumed Base interface, so I don't know whether to omit the second argument, or to support it but check explicitly for p != 2 and throw an error.
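That contract could even be written down as a generic fallback (a sketch, assuming an inner function as discussed above):

innernorm(x) = sqrt(real(inner(x, x)))   # p = 2 is implied; no second argument is accepted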

Jutho commented 6 years ago

But I see the problem with not having any non-deprecated way of doing vecnorm(matrix, p!=2) during the intermediate stage of your proposal.

andyferris commented 6 years ago

I also like the intermediate option - we definitely want to go through a proper cycle of deprecation for the norms rather than make an immediate breaking change. (As a user, the breaking changes scare me, but I see fixing deprecations in my code for v1.0 as an investment in clean, clear code for the future.)

Would we actually need innernorm or could we just use vecnorm for now (and deprecate vecnorm in favor of norm later)?

I actually don't see any potential uproar in simply replacing dot with inner... I too think it's clear enough that inner product is meant to be a generalization of dot products.

juliohm commented 6 years ago

Changes could be implemented in two separate PRs:

  1. Replace dot with inner and give it the generalized meaning. Optionally, make the infix \cdot notation point to inner between Julia arrays.
  2. More discussion and deprecation cycles around the norm variants and terminology.

My understanding is that PR 1 could be merged before Julia v1.0. It is not breaking.

stevengj commented 6 years ago

Replacing dot with inner would still be breaking because dot is currently not a true inner product for arrays of arrays — so you would be changing the meaning, not just renaming. I'm for changing the meaning to be a true inner product, but if you change the meaning (defining it as the true inner product) I don't see the problem in continuing to spell it as dot.

So, we could do the following in 0.7:

  1. Deprecate norm(matrix) to opnorm(matrix) and norm(vector of vectors) to vecnorm.
  2. Deprecate dot([vector of arrays], [vector of arrays]) to a call to sum.
  3. Say that vecdot(x,y) and vecnorm(x, p=2) are Euclidean inner products/norms (for p=2), and make them recursive (which is slightly breaking, but in practice probably not a big deal).

Then, in 1.0:

  1. Deprecate vecnorm to norm and vecdot to dot. (Not sure if this is allowed by the 1.0 release rules, @StefanKarpinski?)
stevengj commented 6 years ago

(Note that the numpy.inner function, amazingly, is not always an inner product. But NumPy's terminology on inner and dot has been weird for a while.)

stevengj commented 6 years ago

The reasons I prefer to continue spelling it as dot:

  • It is nice to have an infix variant.

  • For non-mathematicians operating on ordinary finite-dimensional vector spaces, dot is a more familiar name for the Euclidean inner product. (Mathematicians will easily adjust to using the name dot for the inner-product function on arbitrary Hilbert spaces—"dot product" has no other possible meaning for such spaces.)

  • Having both inner and dot would be confusing, since they would coincide in some cases but maybe not others (if we keep the current dot meaning).

  • Outside of linear algebra, inner has a lot of other potential meanings in computer science, and hence it is somewhat annoying to export this name from Base.

juliohm commented 6 years ago

Can you elaborate on your opposition to the name inner? I still don't get why you prefer to go against terminology that everyone on this thread seems to agree on.


juliohm commented 6 years ago

None of the reasons are compelling to me:

  • It is nice to have an infix variant.

Yes, and the infix notation can still exist regardless of the rename to inner as explained above.

  • For non-mathematicians operating on ordinary finite-dimensional vector spaces, dot is a more familiar name for the Euclidean inner product. (Mathematicians will easily adjust to using the name dot for the inner-product function on arbitrary Hilbert spaces—"dot product" has no other possible meaning for such spaces.)

This argument is not good: let's teach ordinary people the wrong terminology because they are lazy and can't learn a new appropriate word, and force mathematicians to use the wrong terminology against their will.

  • Having both inner and dot would be confusing, since they would coincide in some cases but maybe not others (if we keep the current dot meaning).

We don't need both, get rid of the less general name, which we agree is dot at this point.

  • Outside of linear algebra, inner has a lot of other potential meanings in computer science, and hence it is somewhat annoying to export this name from Base.

Outside of linear algebra I can find many uses for dot. Even more for the dot infix notation meaning completely different things.

StefanKarpinski commented 6 years ago

I'm reposting @juliohm's last post with fixed formatting.


None of the reasons are compelling to me:

  • It is nice to have an infix variant.

Yes, and the infix notation can still exist regardless of the rename to inner as explained above.

  • For non-mathematicians operating on ordinary finite-dimensional vector spaces, dot is a more familiar name for the Euclidean inner product. (Mathematicians will easily adjust to using the name dot for the inner-product function on arbitrary Hilbert spaces—"dot product" has no other possible meaning for such spaces.)

This argument is not good: let's teach ordinary people the wrong terminology because they are lazy and can't learn a new appropriate word, and force mathematicians to use the wrong terminology against their will.

  • Having both inner and dot would be confusing, since they would coincide in some cases but maybe not others (if we keep the current dot meaning).

We don't need both, get rid of the less general name, which we agree is dot at this point.

  • Outside of linear algebra, inner has a lot of other potential meanings in computer science, and hence it is somewhat annoying to export this name from Base.

Outside of linear algebra I can find many uses for dot. Even more for the dot infix notation meaning completely different things.

stevengj commented 6 years ago

Yes, and the infix notation can still exist regardless of the rename to inner as explained above.

You can certainly define const ⋅ = inner, but then your terminology is inconsistent. I thought you didn't like using the "dot product" as a general inner product?

force mathematicians to use the wrong terminology against their will

Mathematicians know that terminology is neither right nor wrong, it is only conventional or unconventional (and maybe consistent or inconsistent). (And most people don't go into mathematics because they have a passion for prescriptive spelling.) In my experience, if you tell mathematicians that in quantum mechanics a vector is called a "state", the adjoint is called "dagger", and a dual vector is called a "bra", they are sublimely unconcerned. Similarly, I don't think any experienced mathematician will blink more than once if you tell them that in Julia an inner product is spelled dot(x,y) or x ⋅ y, especially since the terms are already understood to be synonyms in many contexts. (I doubt you will find any mathematician who does not know instantly that you are referring to an inner product if you say "take the dot product of two functions in this function space".)

On the other hand, for people who aren't trained mathematicians and haven't been exposed to abstract inner-product spaces (i.e. the majority of users), my experience is that unfamiliar terminology is more of an obstacle. "How do I take a dot product of two vectors in Julia?" will become a FAQ.

There really is no mathematical difficulty here to be solved aside from choosing the semantics. The spelling question is purely one of convenience and usage.

Outside of linear algebra I can find many uses for dot. Even more for the dot infix notation meaning completely different things.

Except that Julia and many other programming languages have had dot for years and it hasn't been a problem. inner would be new breakage.

Ultimately, the spelling of this (or any other) function is a minor matter compared to the semantics and the deprecation path, but I think the balance tips in favor of dot.

juliohm commented 6 years ago

You can certainly define const ⋅ = inner, but then your terminology is inconsistent. I thought you didn't like using the "dot product" as a general inner product?

I think you still don't get it. There is no inconsistency in calling dot an inner product. It is an inner product, a very specific and useless one for many of us. Nothing more than sum(x.*y).

If the term dot ends up in Julia having the semantics of inner, this will be a historical disaster that I can guarantee to you many will feel annoyed. I can foresee professors in a classroom explaining things like: "You know, we are now gonna define the inner product for our space, but in Julia someone (@stevengj) decided to call it dot."

I will make sure I will screenshot this thread for future reference if that ends up happening.

You are the only one, @stevengj, insisting on the dot terminology; no one else has expressed opposition to it. It would be nice if you could reconsider this fact before making a decision.

stevengj commented 6 years ago

It is an inner product, a very specific and useless one for many of us. Nothing more than sum(x.*y).

If you think "dot product" can only refer to the Euclidean inner product in ℝⁿ, then you shouldn't define const ⋅ = inner, you should define only ⋅(x::AbstractVector{<:Real}, y::AbstractVector{<:Real}) = inner(x,y).

You can't have it both ways: either inner has ⋅ as an infix synonym (in which case the infix operator is both "wrong" in your parlance and the naming is inconsistent) or it doesn't have an infix synonym (except in one special case).

I can foresee professors in a classroom explaining things like: "You know, we are now gonna define the inner product for our space, but in Julia someone (@stevengj) decided to call it dot."

Ha ha, I'm willing to take the heat from this imaginary outraged professor. Seriously, you need to look around more if you think the term "dot product" is only ever used in ℝⁿ, or that mathematicians are outraged if the term is used in other Hilbert spaces.

this will be a historical disaster

Seriously?

ararslan commented 6 years ago

This discussion seems to be eroding beyond what one might consider a welcoming, civil and constructive environment. Opinions and backgrounds differ, but please refrain from making personal attacks or placing blame on anyone and assume all parties are debating for their point in good faith.

ararslan commented 6 years ago

I can foresee professors in a classroom explaining things like: "You know, we are now gonna define the inner product for our space, but in Julia someone (@stevengj) decided to call it dot."

It may also be worthwhile here to note that Steven is a professor. :wink:

jebej commented 6 years ago

I am also on the fence about removing dot in favor of inner. The dot term is quite widely used, and not having the function in Julia when it is in Python and MATLAB would be surprising. However, I do also like the term inner, given that it is more appropriate for non-ℝⁿ vector spaces, and especially matrices.

Incidentally, while I was testing what methods were doing in Julia, I noticed that dot only works on real vectors/matrices. Is that intentional?

Having both inner and dot would be confusing, since they would coincide in some cases but maybe not others (if we keep the current dot meaning).

@stevengj Would it be completely ridiculous to replace vecdot with inner, and also keep dot? Right now, that exact problem you are describing exists already, just with vecdot instead of inner.

andyferris commented 6 years ago

OK... looking forward, what are the live suggestions? Are they to:

Is that right? I think these would at least get us to a relatively consistent linear algebra story with "clean" interfaces, where generic code can use dot and norm as a reliable pair for working with inner-product spaces independent of type.

stevengj commented 6 years ago

@andyferris, yes, I think if we make this change then we only need dot and norm (which are now the recursive Euclidean operations on arrays or arrays-of-arrays of any dimensionality, though for norm we also define norm(x,p) to be the p-norm) and opnorm, and no longer have vecdot or vecnorm.

Note that the change to dot is a breaking change because dot is currently not a true inner product for vectors of matrices (#22392), something that was debated for a long time in #22220 (at which point eliminating vecdot was not considered IIRC). However, that was introduced in 0.7, so it doesn't break any actual released code. In fact, dot in 0.6 is already the Euclidean dot product on arbitrary-dimensionality arrays, somewhat by accident (#22374). The suggested change here would restore and extend that 0.6 behavior and change norm to be consistent with it.

One question is whether norm(x,p) would call norm(x[i]) or norm(x[i],p) recursively. Both are potentially useful behaviors. I lean towards the former because it is more general — x[i] may be some arbitrary normed vector space that only defines norm but not the p-norm. Calling norm recursively is also what vecnorm does now, so it is consistent with deprecating vecnorm to norm.
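The two recursion choices, spelled out as a sketch (hypothetical helper names, finite p only):

normrec_a(x, p::Real=2) = sum(norm(xi)^p    for xi in x)^(1/p)   # recurse with the default norm(x[i])
normrec_b(x, p::Real=2) = sum(norm(xi, p)^p for xi in x)^(1/p)   # recurse with norm(x[i], p)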

stevengj commented 6 years ago

@jebej, dot on both master and 0.6 works for me on complex arrays: dot([3im],[4im]) correctly returns 12+0im, for example.

stevengj commented 6 years ago

Another good point about changing norm(matrix) to be the Frobenius norm is that it is a lot cheaper. It is common to just use norm(A-B) to get a sense of how big the difference between two matrices is, without caring too much about the specific choice of norm, and many users won't realize that the current default norm(matrix) requires computing the SVD.
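For example (with opnorm standing in for the current norm(matrix) behavior):

A, B = randn(1000, 1000), randn(1000, 1000)
norm(A - B)     # proposed meaning: Frobenius, sqrt(sum(abs2, A - B)), about O(n^2) work
opnorm(A - B)   # induced 2-norm: needs the largest singular value, roughly O(n^3) work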

Sacha0 commented 6 years ago

Wonderful to see consensus forming around several major points! :) (Unless someone beats me to it (please do if you have bandwidth!) or an alpha tag hits prior, I will give implementing the present consensus points a shot after shipping #26997.) Best!

juliohm commented 6 years ago

Another link for future reference: https://math.stackexchange.com/a/476742

This illustrates the poor naming that is being adopted here consciously, and the poor decision imposed by a single mind. Dot and inner products have different mathematical properties. You are forcing a whole community against what is well known in the mathematics literature.

juliohm commented 6 years ago

And for future readers, what should have been done instead had we had a collective decision:

# make dot what it is, a NOTATION
⋅(x::AbstractVector, y::AbstractVector) = sum(x[i] * y[i] for i in eachindex(x, y))

# replace the name dot with the more general inner
function inner end   # each inner-product space defines its own method
stevengj commented 6 years ago

I guess we will just be the first people in the universe to employ the term "dot product" for an inner product on anything but ℝⁿ. It's a good thing I was able to impose my will on this thread (mainly by blackmailing the other developers) to force this innovation into the world! No longer will the dot product be relegated to mere "notation": instead, it will be a symbol that means an inner product (as all should know, assigning meanings to symbols is the opposite of "notation").

juliohm commented 6 years ago

Very good decision making :clap: it was definitely a consensus. Read the comments above, and you will see how everyone agreed. :+1:

juliohm commented 6 years ago

Or maybe I should quote some comments so that it is very clear how it was a consensus:

Right - vecdot could be renamed inner

by @andyferris

Option 2 (probably better): use more mathematically correct names

inner dimension But what to do with norm?

by @Jutho

I agree, as an alternative to vecdot we could introduce a new method inner

by @Jutho

I also find the vecdot name odd, in fact, I didn't even know it existed and had made my own function for it... called inner.

by @jebej

And many more...

stevengj commented 6 years ago

People can debate vociferously with one another, and raise many points of disagreement, but still arrive at a consensus (albeit not always unanimity) by being persuaded and by balancing the pros/cons. (I agree that there are both pros and cons of each option here.) I'm sorry that the result which seems (tentatively!) to be gelling here is not the outcome that you preferred, but I'm not sure how you think I "imposed" my will.

(Not that any final decision has been made, of course — there isn't even a PR yet, much less anything merged.)

juliohm commented 6 years ago

I only wish we could make a decision that is based on the audience of the language. If someone picks Julia as a tool, I am sure the person has at least heard of the term inner product. It is quite a popular concept and far from exotic. Exotic things include "persistent homology" and "quantum theory"; these are less widely spread, and I would be against including that type of terminology.

After all I just want to have a language that is the best language for scientific computing, math, etc.

stevengj commented 6 years ago

@juliohm, all of the arguments have been based on the needs of who we think the audience is, and all of us are trying to make Julia as good a language as possible. Reasonable people can come to different conclusions about terminology, since mathematics does not determine spelling.

Jutho commented 6 years ago

Firstly, as mentioned above, I can certainly agree with @stevengj 's current proposal and sticking to dot as the general name for inner product. Also, I dislike the way this discussion is going and would certainly like to be quoted correctly. @juliohm, the second quote you attribute to me is not mine.

That being said, I would like to mention the following as food for thought in the consideration of pros and cons. The following are mostly cons, but I agree with the pros mentioned by @stevengj. There could easily be separate use cases for having dot just mean sum(x[i]*y[i] for i ...). In the cases where the infix dot notation is most used in mathematics, this is indeed typically the meaning. As an inner product, the infix dot notation is typically (though certainly not exclusively) reserved for real vector spaces. Other use cases include enabling things like σ ⋅ n with σ a vector of Pauli matrices and n a vector of scalars. This was one of the motivations behind the way dot is currently implemented, as was pointed out to me in some other thread. The fact that BLAS decided to only use dot for real vectors and make a distinction between dotu and dotc for complex vectors is another issue to consider. People with BLAS background might get confused whether, having complex vectors, they want to compute dot(conj(u),v) or dot(u,v) when they want the true inner product (i.e. dotc). Furthermore, they might look for a way to do dotu without first making a conjugate copy of the vector at hand.

juliohm commented 6 years ago

@Jutho the quote is yours, your full comment is copied below:

I agree, as an alternative to vecdot we could introduce a new method inner, but I don't know of a good name to "replace" vecnorm. In fact, I don't find vecnorm that bad, vector norm is a well established and explicit term for the operation we want.

In any case, the quoting is intended to show what is the desire of many here (at least as a first natural thought) when we think about this subject. If you changed your desire over time, that is another story. I myself would never pop up the term "dot" out of my head during any modeling with Hilbert spaces. It feels unnatural and inconsistent with what I learned.

stevengj commented 6 years ago

@Jutho: Furthermore, they might look for a way to do dotu without first making a conjugate copy of the vector at hand.

The possibility of exporting a dotu function has come up from time to time (see e.g. #8300). I agree that this is sometimes a useful function: an unconjugated Euclidean "inner product" (not really an inner product anymore) that is a symmetric bilinear (not sesquilinear) form dotu(x,y) == dotu(y,x) (not conjugated) even for complex vector spaces. But the utility of that operation is not limited to ℂⁿ — for example, this kind of product often shows up in infinite-dimensional vector spaces (functions) for Maxwell's equations as a consequence of reciprocity (essentially: the Maxwell operator in typical lossy materials is analogous to a "complex-symmetric matrix" — symmetric under the unconjugated "inner product"). So, if we define dot(x,y) to be the general Euclidean inner product (with the first argument conjugated), it would be quite natural to define a dotu(x,y) function for the unconjugated Euclidean product on any vector space where it makes sense. I don't see the possibility of a dotu function as an argument against dot, however. In the majority of cases, when you are working with complex vector spaces you want the conjugated product, so this is the right default behavior.
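A sketch of what such a dotu could look like (the name is hypothetical here; only the conjugating dot is currently exported):

dotu(x::AbstractVector, y::AbstractVector) = sum(x[i] * y[i] for i in eachindex(x, y))

dotu([1im], [1im])   # -1 + 0im: a symmetric bilinear form, no conjugation
dot([1im], [1im])    #  1 + 0im: the sesquilinear (conjugated) inner product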

But I agree that one possibility would be to define dot(x,y) = sum(x[i]'*y[i] for i = 1:length(x)), which is how it's currently defined in master (not 0.6), and define inner(x,y) as the true inner product. This has the advantage of supplying both functions, both of which may be useful in certain cases. However, we then have two functions that almost always coincide except for arrays of matrices, and I suspect it would be a bit confusing to decide when to use one or the other. Many people would write dot when they meant inner, and it would work fine for them in most cases, but then their code would do something unexpected if it is passed an array of matrices. My suspicion is that in 99% of cases people want the true inner product, and the "sum of product" version can be left to a package, if indeed it is needed at all (as opposed to just calling sum).

Jutho commented 6 years ago

@juliohm , I misread your post as I thought the names were above (instead of below) the respective quotes, hence I thought you attributed the quote of @jebej to me. My apologies for that.

@stevengj, I certainly was not thinking of having dot(x,y) = sum(x[i]'*y[i] for i = 1:length(x)) as a reasonable default. In the case like σ ⋅ n, the complex/hermitian conjugation of the first or second argument is unnecessary. So what I was saying is that, in many (but indeed not all) cases where the infix dot notation is used in scientific formulas, its meaning coincides with dotu, i.e sum(x[i]*y[i] for i = 1:length(x)) without conjugation, either as inner product on real vector spaces or as some more general construction.

So if I were to make an alternative proposal (though I am not necessarily advocating it), it would be to have two functions:

I am not defending this as a good choice to adopt in the Julia Language, but I do think this is how it is used in much of the literature. When infix dot is used, it is either as an inner product in the context of real vectors, or in some more general construction where it just means contraction. When a general inner product on arbitrary vector spaces is intended, most scientific literature (but you certainly have shown counter examples) switches to <u,v> or <u|v> (where in the first notation there is still discussion which of the two arguments is conjugated).

I could live with this proposal, but I could equally well live with having only dot as the general inner product. In the end, it's a matter of having good documentation, and I too cannot believe that anyone would stumble over this "design" choice.

stevengj commented 6 years ago

@Jutho, I agree that it is not uncommon to define dot to just mean contraction. Certainly, one can find examples both ways. For example, in programming languages and popular libraries:

On the one hand, the conjugated inner product is usually introduced in textbooks as the "natural" extension of the "dot product" notion to complex vectors — the unconjugated version is in some sense an "unnatural" extension, in that it is usually not what you want. (Consider the fact that, of the languages that provide a conjugated dot function in their standard libraries — Matlab, Fortran, Julia, Maple — only Maple provides an unconjugated variant, hinting at a lack of demand.) On the other hand, an unconjugated dotu function is convenient (as a supplement) in certain special cases (some of which I mentioned above).

If we have both dot and inner, I suspect that many people will end up using dot by accident when they really want inner for their code to be generic. (I'd bet that Numpy's inner is unconjugated due to just such an accident — they implemented it with real arrays in mind, and didn't think about the complex case until it was too late to change so they added the awkwardly named vdot.) Whereas if we have dot and (possibly) dotu, it will be clearer that dot is the default choice and dotu is the special-case variant.

(I agree that ⟨u,v⟩, ⟨u|v⟩, or (u,v) are more common notations for inner products on arbitrary Hilbert spaces—they are what I typically use myself—but those notations are a nonstarter for Julia. There was some discussion of parsing Unicode brackets as function/macro calls, e.g. #8934 and #8892, but it never went anywhere and this seems unlikely to change soon.)

Jutho commented 6 years ago

I fully agree with your assessment @stevengj .

andyferris commented 6 years ago

Me too.

I suspect it’s time for one of us to play with either implementation in a PR and see how it comes out.

@Jutho I always saw the dot product with Pauli matrices as shorthand for a contraction over higher order tensors... one of the vector spaces is real, 3D.