JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.03k stars 5.43k forks source link

Deprecate `ones`? #24444

Closed timholy closed 6 years ago

timholy commented 6 years ago

Now that we distinguish one and oneunit, ones(T, sz) seems like a misnomer. Deprecate in favor of fill(oneunit(T), sz)? If so, should we ditch zeros too?

fredrikekre commented 6 years ago

xref https://github.com/JuliaLang/julia/issues/11557#issuecomment-339776065 and below, and also @Sacha0's PR #24389

Sacha0 commented 6 years ago

I have work in progress to this effect that I hope to post in the next day or two :). Best!

GunnarFarneback commented 6 years ago

To me fill(oneunit(T), sz) looks like a non-trivial loss in readability compared to ones(T, sz).

Sacha0 commented 6 years ago

Note that you rarely need write something as verbose as fill(oneunit(T), sz), as usually a literal or something similarly compact suffices in place of oneunit(T). Shorter incantations may also become possible with future changes to array constructors. Additionally, once you've transitioned to fill, you grow fond of it I assure you :). Best!

JeffBezanson commented 6 years ago

We could just pick whether ones uses one or oneunit. ones and zeros should be considered convenience functions, which is ok as long as they have clear meanings in terms of more general functions.

Sacha0 commented 6 years ago

Plan per triage: Revisit next week, specifically considering moving ones and zeros to the MATLAB compatibility layer, in favor of fill in base. Best!

andyferris commented 6 years ago

ones and oneunits would distinguish the two options.

Not sure how I feel about moving this and zeros to MATLAB compatibility. Just because Mathworks invented it doesn’t mean it’s not a good convenience API that we can be proud of. Just briefly - what were the reasons behind this thought? (Sorry, I have to say that triage seems much more efficient, but significantly less transparent when the reasoning isn’t given).

timholy commented 6 years ago

It's not a change I'm thrilled about either, but I raised it because it's a logical inconsistency. I think it's fair to say that ones(T, sz) implies fill(one(T), sz) but really what it does under the hood is fill(oneunit(T), sz).

One option would be to rename it to oneunits. An alternative would be to do a horrific swap: one -> algebraic_one and oneunit->one. This is breaking and not possible to do via deprecation, which is why I say "horrific."

andyferris commented 6 years ago

Yes, I would say adding oneunits and changing ones to give fill(one(T), ...) would be the obvious fix for that, no?

timholy commented 6 years ago

I'd be fine with that. Out of curiosity, what are the uses of fill(one(T), sz)? Do we even need ones at all?

andyferris commented 6 years ago

Haha :) I was going to ask - what do you use fill(oneunits(T), sz) for? (E.g. What would an array filled with 1 metre, or 1 kilogram, or 1 second, be used for?)

I think fill(one(T), sz) is used for much the same reason as zeros(T, sz), which is to initialize an array for a custom reduction-type-of operation. An array z = zeros(T, sz) is ready to have it's elements added to with z[i] += x whereas as o = fill(one(T), sz) is ready to have it's elements multiplied o[i] *= x. For example, I'm thinking of situations where the array elements might represent (relative) probability. In these two cases, I'm looking for the identity for the + or * operator respectively (and not additive generators).

fredrikekre commented 6 years ago

See also #23544

ChrisRackauckas commented 6 years ago

Haha :) I was going to ask - what do you use fill(oneunits(T), sz) for? (E.g. What would an array filled with 1 metre, or 1 kilogram, or 1 second, be used for?)

The dependent variables for an ODE. It would be weird if it just randomly chopped of units when you asked for an array of type T?

Sacha0 commented 6 years ago

The fundamental reason to prefer fill is that a tight set of powerful, orthogonal, composable tools serves better than a larger collection of ad hoc, limited, and overlapping tools such as ones/zeros/trues/falses. The discussion above highlights this point: While fill accommodates all use cases above unambiguously and (in most real use) concisely, the ones/zeros/trues/falses approach requires a new function for each use case.

A couple relevant examples from rewrites in base:

complex.(ones(size(Ac, 1)), ones(size(Ac, 1)))

becomes

fill(1.0 + 1.0im, size(Ac, 1))

and

2ones(Int, 2, 2, 2)

becomes

fill(2, (2, 2, 2)) # or fill(2, 2, 2, 2) if you prefer

Please note that these fill incantations are simpler, more readable, more compact, and more efficient than their non-fill counterparts, and base/ and test/ are rife with examples like these. As with all change, transitioning to fill requires some initial mental adjustment. But post-adjustment you find you have more power and elegance at your fingertips :). Best!

carlobaldassi commented 6 years ago

@Sacha0: trues/falses are not directly replaceable by fill, but need to use fill! with a BitArray initializer. They are also not included in the ambiguity between one and oneunit. Therefore I don't think that they fit in this discussion at all.

As for ones, I'm generally opposed to deprecating it anyway, I don't see the benefit. The argument that some expression can be written more effectively with fill is not very compelling in my opinion, as all these examples use ones as intermediate steps to do something else; but what about when you actually want an array of ones? Then having to use fill is longer, less obvious and just more annoying. I like @JeffBezanson's proposal best. If parts of base or test can be rewritten more effectively using fill instead of ones there's nothing preventing that right now.

TotalVerb commented 6 years ago

@carlobaldassi While that is true, a quick GitHub search reveals that almost all uses of ones should really be using fill to avoid intermediate allocations...

stevengj commented 6 years ago

Using fill for these cases makes sense to me. Note that we will want a new fill(x, A) = fill!(x, copymutable(A)) method to replace the corresponding methods of ones etc.

carlobaldassi commented 6 years ago

@TotalVerb From a quick skim at the first 10 pages of that search, I wouldn't say that "almost all" is a fair assessment. A "significant fraction", at best. There are lots and lots of legitimate use cases for ones there, perhaps even the majority (and even if they were just say 20% I think my argument still stands).

(I also actually have reservations about the readability argument, I think it's questionable to claim that e.g. fill(2, 2, 2, 2) is more readable than 2 * ones(Int, 2, 2, 2), or that, say, fill(k * 1.0, 3) would be more readable than k * ones(3); I'm not at all convinced that it's merely a matter of habit. This is a minor point though.)

Sacha0 commented 6 years ago

trues/falses are not directly replaceable by fill, but need to use fill! with a BitArray initializer. They are also not included in the ambiguity between one and oneunit. Therefore I don't think that they fit in this discussion at all.

Indeed, trues and falses are not directly replaceable by fill :). Rather, trues and falses are additional instances of the general problem described above / related to #11557 (and that the recent direction of array constructors will hopefully tackle). Other examples include the existence of e.g. bones,bzeros, brand, brandn, and beye in BandedMatrices.jl and equivalents with a d prefix in DistributedArrays.jl.

As for ones, I'm generally opposed to deprecating it anyway, I don't see the benefit. The argument that some expression can be written more effectively with fill is not very compelling in my opinion, as all these examples use ones as intermediate steps to do something else

Having just rewritten a few hundred uses of ones in base, I can affirm @TotalVerb's statement

a quick GitHub search reveals that almost all uses of ones should really be using fill to avoid intermediate allocations...

(Edit: Though I would say roughly half rather than almost all, and appropriate rewrites may be something other than fill.) Moreover, that rewrite experience has taught me...

but what about when you actually want an array of ones? Then having to use fill is longer, less obvious and just more annoying.

... that on the other hand, fill is frequently shorter and simpler in that case: The requested ones are frequently not Float64 (instead e.g. ones(Int, n...) and ones(Complex{Float64}, n...)), in which case fill is shorter and simpler by admitting a literal (e.g. fill(1, n...) and fill(1.0 + 0im, n...)). In measurable terms, the branch in which I have been rewriting ones calls in base is ~5% shorter by character count from ones -> fill rewrites. Best!

Sacha0 commented 6 years ago

To get an objective sense of how ones appears in the wild, I collected all ones calls appearing in the first ten pages of a GitHub search for ones in Julia code, rewrote each such call as appropriate and classified the corresponding change (see this gist), and then reduced the classification data to the following summary:

The analysis included 156 ones calls. Of those calls,

The remaining calls were semantically reasonable as ones, though often ones sees use merely because it's short, rather than because ones specifically are necessary. Of those remaining calls,

Overall, in the wild ~60% of ones calls are better written somehow else, ~8% are reasonably semantically ones and slightly shorter as fill, and ~32% are reasonably semantically ones and slightly longer as fill.

An additional observation:

I encountered only one instance of a ones call accepting an array argument, and it wasn't clear whether the enclosing snippet was real code. So the ones methods accepting an array argument see little if any use in the wild.

gabrielgellner commented 6 years ago

Really interesting discussion ... went from being on the against side to the for side ... Also as another language precedent, R uses rep and matrix in a manner that is equivalent to the fill (just corresponding to the 1d and 2d cases) and you get used to it very quickly -- even though I came from a zeros/ones world.

andyferris commented 6 years ago

Wow, thank you @Sacha0 for putting in that effort!

The question naturally arises as "what about zeros"? I'm guessing there will be significantly more usage, and a few more categories of use (including things like "I simply don't trust uninitialised arrays" or "I don't know how to use the Array constructors").

For whatever reason (I guess it's the symmetry of one and zero), I'm somewhat attracted towards replacing either both ones and zeros with fill, or neither.

StefanKarpinski commented 6 years ago

The thing with zeros is that you would seem to be in one of these situations:

  1. You need to overwrite most of the zeros – in this case you're better off using a comprehension;
  2. You don't need to replace most of the zeros – in this case you're better off using a sparse matrix;
  3. You actually need an all-zero matrix – in this case you're better off using 0I.

There's really no use case where actually allocating a matrix of dense zeros is actually a good idea.

vtjnash commented 6 years ago

That’s perhaps true of linear algebra. It’s not uncommon to need a zero-initialized collection in my work on compilers and other data structures. Maybe they are often sparse, but it’s not worth the performance impact to represent them compactly.

StefanKarpinski commented 6 years ago

Fair enough – sometimes you don't care about density and the simplicity is worth it.

StefanKarpinski commented 6 years ago

Triage: resolved that we keep only the completely non-generic methods i.e. zeros(dims...) and ones(dims...) and maybe zeros(dims) and ones(dims) as well.

gabrielgellner commented 6 years ago

@StefanKarpinski for usage recommendations does that mean we would recommend zeros(3, 3) over fill(0.0, 3, 3) for normal code (when a dense array is wanted etc etc)? Some of the details of efficiency etc is out of my reach, I am just thinking of how I would teach best/idiomatic practices in julia moving forward.

rfourquet commented 6 years ago

This decision seems very surprising to me, it's not so common in base to specifically prevent genericity. What is the reasoning behind? is it that this function comes from matlab where it is not generic (and works only with floats)?

andyferris commented 6 years ago

Triage: resolved that we keep only the completely non-generic methods

What is the reasoning behind?

Further to this, is this issue somehow related to the general problem of constructing arrays that seems to be being considered right now, and if so, how does this help? Or, are we trying to reduce the number of exported methods from Base? (Or something else entirely?)

Sacha0 commented 6 years ago

Writeup forthcoming in https://github.com/JuliaLang/julia/issues/24595 :). Best!

Sacha0 commented 6 years ago

24595's OP detailed this issue's broader context, and now a followup in #24595 specifically addresses convenience constructors ones, zeros, and fill in depth. Reading the former is valuable in appreciating the latter. Best!

carlobaldassi commented 6 years ago

Well, saving at least the Float64 case is better than nothing. I believe that the case of integer zeros is also quite relevant, which -- I assume -- is basically the point that @vtjnash was making here.

It should also be noted that zeros does not have the "problem" of allowing the 3 * ones(n) anti-pattern. In fact, I don't really see why ones and zeros should go together, except in the broad sense of being convenience constructors. There is no real 'symmetry' between those two.

A couple of additional comments on the statistical analysis, since it seems to be the basis of the following discussions and the writeup of #24595. First, ten pages are not really enough for fine-grained conclusions of what's going on in the wild, they can give a rough idea at best. Some files came straight from matlab, for example, as it's clear from their name/style. Secondly, as I suspected, even that analysis shows that roughly half of the uses of ones in there were "legitimate". Thirdly, looking at code like this doesn't tell anything about when writing 3 * ones(...) is really an anti-pattern that creates performance issues, or it's a piece of code that has no performance implications at all (and the writer may have decided that it's just more readable written that way -- which I strongly feel is nobody else's business to decide otherwise, in that case).

Relatedly to the last point, and I think more importantly, what you get to see from a github search will never take into account what's going on in the REPL of people doing exploratory/preliminary work in Julia. Which is exactly where convenience functions come in most handy, and taking them away for no discernible reason is correspondingly most annoying. My point being, having a consistent set of orthogonal primitives that allow to write generic and efficient code is a great goal, and the effort to get there is truly commendable; it's just that not all code is supposed to be beautiful, generic, composable library code. Just my two cents.

Sacha0 commented 6 years ago

Regarding

I believe that the case of integer zeros is also quite relevant, which -- I assume -- is basically the point that @vtjnash was making here.

which refers to

It’s not uncommon to need a zero-initialized collection in my work on compilers and other data structures. Maybe they are often sparse, but it’s not worth the performance impact to represent them compactly.

Note that fill serves as well or better in that case: fill(0, shape...) versus zeros(Int, shape...).

Concerning your other points, #24595 may be worth reading :). Best!

carlobaldassi commented 6 years ago

I simply find that zeros(Int, 10, 10) is more readable/explicit than fill(0, 10, 10), and zeros(T, k) is better than fill(zero(T), k). Why can't we just have both? I don't buy the argument that zeros suffers the same ambiguity problem as ones.

Concerning your other points, #24595 may be worth reading

I had read it. (I even linked it.)

Sacha0 commented 6 years ago

I had read it. (I even linked it.)

Having read #24595 in full and given it due consideration, you are then aware that #24595: (1) concerns a much broader issue of which convenience constructors are only a part; and (2) considers far more than just the statistical analysis posted above and the points you focus on here.

I appreciate that you feel strongly about ones and zeros; your sentiment has come through loud and clear :). As such, chances are our bandwidth would be better spent pushing other fronts forward than continuing this conversation in its present form. Best!

ChrisRackauckas commented 6 years ago

Is there a generic fill incoming along with the zeros change? zeros had a very legitimate use for generic programming since it's much safer than similar, so all of DiffEq, and then I know recently Optim and NLsolve moved from allocating with similar to zeros since knowing everything is allocated to have zeros in it stops a lot of bugs. However, now it seems there's going to be no method for:

zeros(X)

anymore, other than:

similar(X); fill!(X,0)

because the current fill only builds Arrays and thus doesn't type-match like similar or zeros does. I know that some people misused zeros to allocate when they shouldn't have, but allocating with zeros is a very reasonable thing to do in many cases. I hope that a fill(0,X) shorthand gets added to fill this void.

Sacha0 commented 6 years ago

Much thanks for the thoughtful post Chris! :) As an interim shorthand replacement, does zero(X) do the trick?

Note that such use cases are precisely where the ambiguity in zeros and ones can be problematic: Does zeros(X) yield an object with eltype(X) and filled with eltype(X)'s additive identity (i.e. fill!(similar(X), 0)), or instead an object filled with a multiplicative zero for eltype(X) (possibly not of eltype(X))? (For expansion, see #24595.)

The fill(0, X) concept sees a bit of discussion in #11557, and I agree that might be a useful generalization of fill. Thanks and best!

timholy commented 6 years ago

The other problem is that arrays with unconventional indices might want to be created with something like zeros(inds...) (because the index type determines the array type). But for a 1-d case, is X "the array you want to be similar to" or "the indices for the desired array"? (After all, AbstractUnitRange <: AbstractArray.) Concretely, does zeros(3:5) mean fill!(similar(3:5), 0) or fill!(OffsetArray(Vector{Float64}(3), 3:5), 0)?

Sacha0 commented 6 years ago

Linking https://github.com/JuliaLang/julia/pull/24656, which contains additional discussion of {ones|zeros }(A::AbstractArray, ...) convenience constructors. Best!

ChrisRackauckas commented 6 years ago

I am surprised it's considered odd to use zeros to make cache variables according to #24656. I would think that, if zeros was reduced to close to zero overhead, almost any case where people are using similar should instead be zeros since that tends to fix quite a few bugs. I think we should encourage more people to do this since similar can be quite unsafe, and not having a function and instead pulling together fill! + similar makes it less obvious that is what people should be doing. Here's a comment about this popping up in Optim.jl:

https://github.com/JuliaNLSolvers/NLsolve.jl/issues/89#issuecomment-294585960

However, I do agree with @timholy that it is non-obvious how it should be interpreted. Let me point you to a really non-simple example in DiffEq.

https://github.com/JuliaDiffEq/MultiScaleArrays.jl

MultiScaleArrays.jl created abstract arrays which are recursive graph-like structures which can be used in the diffeq solvers (and I think it may be compatible with Optim.jl and NLsolve.jl now?). It's a nice convenience for biological models among other things. When we throw it into the ODE solver there's a question: what should we make the cache arrays? In some cases it matters that the user gets back the array they wanted since it will show up in their ODE function f(t,u,du) and they will want to treat it accordingly. However, in other cases it only shows up as something that is internally broadcasted against. So there's two different types of cache variables.

To handle this, take a look at the cache for one of the algorithms:

https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/master/src/caches/low_order_rk_caches.jl#L224-L234

Here, rate_prototype = similar(u,first(u)/t,indices(u) is similar but with potentially different eltype for units. But notice that there's two separate types here: similar(u) vs similar(u,indices(u)). I have interpreted those to mean "match the type and the shape" vs "match the shape and the eltype, but doesn't need to be the same type". So for an AbstractMultiScaleArray, the first will create another AbstractMultiScaleArray while the other, for speed since it isn't seen by the user, will just create an Array of the appropriate size. This are then extended to similar(u,T) and similar(u,T,indices(u)).

Maybe that's just punning off of what already exists, but I think this is an important distinction. When doing generic programming, you have two separate caches: user-facing caches that you want to match types to match their expectation, and internal caches which are just used by the algorithm and you want as much speed as possible.

Notice that these are using similar because I was too lazy to come up with a zeros version of it. I actually have a separate place that can zero out some of these arrays because if the user only sets du in their f(t,u,du) derivative calculation, they tend to implicitly thing "what I don't set means zero", which only is true when it was allocated with zeros, so I try to pre-allocate using zeros as much as possible (the same issue comes up in NLsolve.jl for this).

Hopefully that explanation wasn't too confusing to follow. In all of my cases I can just switch to similar followed by fill!, but I know some packages won't and that will be a source of bugs.

timholy commented 6 years ago

Interesting. You're right that similar(A, inds) might create a different type than similar(A), but in general I've always thought of it as likely creating the same type but with different indices. For example, if you needed a 1-dimensional cache for a columnwise operation on a 2-d object, I would use similar(A, first(inds)). (Of course it is a different type because the dimensionality is a type parameter, but it might be the same abstract container type.) You could also use it to create a 5x5 cache of a small tile, etc.

Overall this seems to be a challenging problem. It's a little late in the game, but should we introduce same? It could have the same arguments as similar, but the contract would be that it would be required to return the same abstract container.

andyferris commented 6 years ago

I could support a one-argument form of same, but even this is tricky - note that same(a) couldn't return the same array type if a doesn't support setindex! because same and similar are only useful if you are going to write to the array afterwards. We could make this an error for immutable a, but as a interface for AbstractArray this seems unnecessary (and maybe unhelpful) for making correct, generic code.

Similarly, we can't assume every AbstractArray can support different eltypes or indices. To me, having a two- or three-argument form of same would just introduce run-time errors in a bunch of places while giving people a false sense of security that their generic code will work well for any AbstractArray input, when it does not.

But for a 1-d case, is X "the array you want to be similar to" or "the indices for the desired array"?

This is another reason I am in favour of keys returning a container with identical indices and values, and then making this a requirement of similar (unless you provide one (some) integer(s) in which case Base.OneTo (CartesianRange) is assumed).

Sacha0 commented 6 years ago

This discussion is veering towards #18161, and perhaps should continue there :).

JeffBezanson commented 6 years ago

Most recent triage was leaning towards just keeping ones.

damiendr commented 6 years ago

Does it hurt to keep them? I think it helps people who come from numpy feel at home in Julia.

ViralBShah commented 6 years ago

Closing since we are keeping ones and zeros.