Closed Sacha0 closed 6 years ago
The usual acronyms are LU, QR, SVD. Unfortunate perhaps but well established, so it would be better to have consistency across these factorizations rather than consistency of the name itself. Reading `svdecomp` for instance makes me wonder what an `ecomp` is.
It never occurred to me, but now that you mention it, it does sound weird. However, I don't ever expand the SVD in my mind when I think about it, and perhaps it is ok to let it be as it is.
A broader thought in the same vein: At the moment we mix the terms decomposition and factorization (e.g. `Computes the eigenvalue decomposition of A, returning an Eigen factorization object F [...]`). Decomposition is the more common and slightly more general term. Perhaps we should use decomposition consistently? Best!
I believe there was a discussion thread on which name to use, and we picked factorizations early on. We should dig up and link the original discussion at the very least.
A bit of git spelunking revealed the following history: Miles Lubin introduced the name `lufact` confined to an UMFPackLU wrapper via https://github.com/JuliaLang/julia/commit/53c2d3054af43d2075dcc70f7d2f811e519f986f. Later, Doug Bates introduced `*d` names, e.g. `lud` and `qrd`, for decompositions broadly via #1281 and #1290. In #1281, Viral pointed out that the `*d` names were opaque, and suggested extending the UMFPackLU `lufact` to `*fact`/"factorization" generally instead. Shortly thereafter, Tim Holy suggested `*dcmp`/"decomposition" as an alternative, and Doug Bates expressed remorse for introduction of "factorization" terminology and a preference for `*decomp`/"decomposition", but said he'd go with either decision. Viral responded saying he would be happy with either name and left the call to Doug, though he liked `*fact`'s brevity. Mike shared a little support for "decomposition" then, and Doug likewise. That's where the conversation appears to leave off. Viral later committed https://github.com/JuliaLang/julia/commit/69e407b00c62a2c81327d0625b2a7fa6cb83aeeb, renaming the `*d` functions to `*fact`, and here we find ourselves :).
Out of curiosity, I checked the number of google hits for "X decomposition" and "X factorization", and while I had the impression that decomposition was the more widespread term, the degree to which that appears true surprised me; results in millions below:
Update regarding the table below: These hit counts were for unquoted search queries, whereas quoted search queries are probably a better metric. With quoted queries, which term hit counts favor depends on the decomposition, and the results are much less compelling overall. Ref. https://github.com/JuliaLang/julia/issues/26995#issuecomment-390808271.
X | decomposition (millions of hits) | factorization (millions of hits) |
---|---|---|
lu | 15.3 | 1.07 |
qr | 12.3 | 0.34 |
singular value | 2.58 | 0.68 |
eigen | 0.41 | 0.08 |
cholesky | 0.4 | 0.2 |
schur | 0.28 | 0.43 |
Tangentially, the history suggests that the only reason for the `*fact`/`*d` names was to retain MATLAB compatibility in `lu`, `qr`, and friends. But with the MATLAB-like functions `lu`, `qr`, et al now being deprecated in favor of `*fact`, deprecating the `*fact` names to `lu`, `qr`, et al becomes possible in 1.x (discussed briefly in https://github.com/JuliaLang/julia/pull/25187). Best!
Thank you for that detailed analysis!
Yes, I'm very much in favor of making breaking changes to LinearAlgebra 2.0 in some Julia 1.x release where the names are just `lu`, `schur`, `chol`, etc. but the objects returned are factorization objects. We can retain the ability to write code like `L, U = lu(X)` by defining iteration of the factorization objects to yield the expected components. Let's spend the intervening time thinking about what the best design for this kind of API would be without any historical baggage.
> Yes, I'm very much in favor of making breaking changes to LinearAlgebra 2.0 in some Julia 1.x release where the names are just `lu`, `schur`, `chol`, etc. but the objects returned are factorization objects. We can retain the ability to write code like `L, U = lu(X)` by defining iteration of the factorization objects to yield the expected components.
Agreed! And likewise Andreas it seems. #26997 should at least set us up for those changes during 1.x, and potentially non-breaking then. Best!
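For readers skimming the thread, here is a minimal sketch of the destructuring idea discussed above. It assumes the Julia 1.x `LinearAlgebra` standard library, where `lu` returns an `LU` factorization object; the matrix `A` is an arbitrary illustrative example, not something from the discussion.

```julia
using LinearAlgebra

# Arbitrary example matrix (assumption, for illustration only).
A = [4.0 3.0; 6.0 3.0]

# `lu` returns a factorization object rather than a tuple of matrices.
F = lu(A)

# Iterating the object yields its components in order (L, U, p),
# so MATLAB-style destructuring keeps working.
L, U, p = F

# The same components are available as properties of the object.
@assert L == F.L && U == F.U && p == F.p

# With row pivoting, the identity that holds is A[p, :] ≈ L * U.
@assert A[p, :] ≈ L * U
```

Because destructuring goes through iteration, `L, U = lu(A)` also works and simply takes the first two components.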
I am not sure how these Google hits were computed, but I don't get anything above 150,000-ish on anything, and even so, no more than 13-14 pages of results.
> I am not sure how these Google hits were computed, but I don't get anything above 150,000-ish on anything, and even so, no more than 13-14 pages of results.
The difference is quoting versus not quoting the search query :).
I think one ought to quote it, which is what I thought you did since you did say `"X decomposition"` and `"X factorization"`. Even if I do it without quotes for lu, I get the same numbers roughly, about 1.2-1.3M, and not 15.3M vs. 1M. I don't think Google hits are a reliable way to decide this.
A Slack conversation convinced me that the hit counts for quoted search queries are a better metric, and for such queries which term the hit count favors depends on the particular decomposition; in other words, ignore the table above, as it's probably not the best guide. The remaining question is a minor one of correctness, in that e.g. decomposition is in some cases perhaps more correct for `eig` than factorization, but whether that's worth bothering about 🤷‍♂️. Best!
I would point out that while the eigenvectors and eigenvalues are not a factorization as a pair (you can’t multiply them and get the original matrix back), the factorization object does act as a true factorization in that you can use it in place of the original matrix as a “pre-factorized” stand-in. Moreover, you can get one of these objects through a function called, yes, `factorize`, not `decompose`.
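A small sketch of the "pre-factorized stand-in" point, assuming the Julia 1.x `LinearAlgebra` standard library; the matrix and right-hand sides are made-up illustrative values.

```julia
using LinearAlgebra

# Arbitrary example system (assumption, for illustration only).
A = [4.0 3.0; 6.0 3.0]
b = [10.0, 12.0]

# `factorize` picks a suitable factorization based on the structure of `A`
# and returns a `Factorization` object.
F = factorize(A)

# The object can be used in place of `A` when solving a linear system.
x = F \ b
@assert A * x ≈ b

# The factorization work is done once and reused for further right-hand sides.
c = [1.0, 0.0]
@assert A * (F \ c) ≈ c
```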
A little further Slack triage settled on the status quo, i.e. retaining `factorize`/`Factorization` (and I imagine consequently continuing to use somewhat mixed decomposition/factorization terminology). Best!
First, my opinion. I have spent thousands of hours working on math, engineering, and ontology or terminology, call it what you want.
Factorization is grounded in arithmetic. When the matrix field is numerical, factorization frequently pops up here and there.
Composition is more compatible with the evolution toward symbolic programming.
It's a natural movement found when trying to solve equations (math work) or assembling things (engineering work). As with dynamic programming, we break a problem into multiple pieces, and with the property of the zero element / absorbing element of a group we solve the whole by constraining either one part or the other, thanks to the law of excluded middle.
Secondly, some facts.
Wolfram uses "matrix decomposition" everywhere.
Google Scholar does too when you go to symbolic computing:
search | hits |
---|---|
"category decomposition" | 278 |
"category factorization" | 27 |
"graph decomposition" | 7880 |
"graph factorization" | 700 |
"ideal decomposition" | 1170 |
"ideal factorization" | 795 |
"lattice decomposition" | 731 |
"lattice factorization" | 261 |
"monad composition" | 108 |
"monad decomposition" | 7 |
"monad factorization" | 1 |
Bonus: guess how to check the Pantelides thing.
IMHO Julia in the large is better served by "decomposition" than "factorization". In linear algebra, however, "factorization" may remain more common.
Can this be closed now that the factorization functions no longer have `fact` in their names or would people like to discuss this topic further?
I guess what's left is to decide if we should rename `Factorization` to `Decomposition`.
I see. Even if decomposition were slightly better than factorization (which I don't think it is), it's not worth the name change.
After extensive discussion we concluded that the two words are used roughly as frequently when talking about matrices but that "factorization" is much more matrix-specific and thus conveys more information. It's also the one we're already using and it's no longer very user-facing, so we do nada.
Singular value decomposition factorization (`svdfact`) is a slightly unfortunate name; minor a thing as it is, the redundancy chafes. Perhaps `svdecomp`, `svfact`, or something similar would be better? Best!