JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.87k stars 5.49k forks source link

design of array constructors #24595

Open Sacha0 opened 7 years ago

Sacha0 commented 7 years ago

Intro

This post is an attempt to consolidate/review/analyze the several ongoing, disjoint conversations on array construction. The common heart of these discussions is that the existing patchwork of array construction methods leaves something to be desired. A more coherent model that is pleasant to use, adequately general and flexible, largely consistent both within and across array types, and consists of orthogonal, composable parts would be fantastic.

Some particular design objectives / razors might include:

(*Common operations include constructing: (1) an array uniformly initialized from a value; (2) an array filled from an iterable, or from a similar object defining the array's contents such as I; (3) one array from another; and (4) an uninitialized array.)

Via a tour of the relevant issues and with the above in mind, let's explore the design space.

Tour of the issues

Let's start with...

The prevailing idea for fixing the preceding issue is to: (1) deprecate uninitialized-array constructors that accept (solely) a tuple or series of integer arguments as shape, removing the method signature collision; and (2) replace those uninitialized-array constructors with something else. Two broad replacement proposals exist:

  1. Introduce a new generic function, say (*modulo spelling) blah(T, shape...) for type T and tuple or series of integers shape, that returns an uninitialized Array with element type T and shape shape. This approach is an extension of the existing collection of Array convenience constructors inherited from other languages including ones, zeros, eye, rand, and randn.

(* Please note that blah is merely a short placeholder for whatever name comes out of the relevant ongoing bikeshed. The eventual name is not important here :).)

  1. Introduce Array{T}(blah, shape...) constructors where blah signals that the caller does not care what the return's contents are. These constructors would be specific instances of a more general model that extends and unifies the existing constructor model. That more general model is discussed further below.

The first proposal

The first proposal leads us to...

The de facto approach is introduction of ad hoc perturbations on these function names for each new array type: Devise an obscure prefix associated with your array type, and introduce *ones, *zeros, *eye, *rand, *randn, and hypothetically *blah functions with * your prefix. This approach fails all three razors above: Failing the first razor, to construct an instance of an array type that follows this approach, you have to discover that the array type takes this approach, figure out the associated prefix, and then hope the methods you find do what you expect. Failing the second razor, when you encounter the unfamiliar bones function in code, you might guess that function either carries out spooky divination rituals, or constructs a b full of ones (whatever b refers to). Along similar lines, does spones populate all entries in a sparse matrix with ones, or only some set of stored/nonzero entries (and if so which)? Failing the third razor, the very nature of this approach is proliferation of ad hoc convenience functions and is itself an antipattern. On the other hand, this approach's upside is that it sometimes involves a bit less typing (though often also not, see below). Nonetheless, this approach is fraught.

So what's the other approach? JuliaLang/julia#11557 started off by discussing that other approach: ones, zeros, eye, rand, and randn typically accept a result element type as either first or second argument, for example ones(Int, (3, 3)) and rand(MersenneTwister(), Int, (3, 3)). That argument could instead be an array type, for example ones(MyArray{Int}, (3, 3)) and rand(MersenneTwister(), MyArray{Int}, (3, 3)). This approach is enormously better than the last: It could mostly pass the first and second razors above. But it nonetheless fails the third razor, and exhibits other shortcomings (mostly inherited from the existing convenience constructors). Let's look at some of those shortcomings:

Each of these shortcomings is perhaps acceptable considered in isolation. But considering these shortcomings simultaneously, this approach becomes a shaky foundation on which to build a significant component of the language.

In part motivated by these and other considerations, JuliaLang/julia#11557 and concurrent discussion turned to...

The second proposal

... which is to introduce (modulo spelling of blah, please see above) Array{T}(blah, shape...) constructors, where blah indicates the caller does not care what the return's contents are. These constructors immediately generalize to arbitrary array types as in MyArray{T}(blah, shape_etc...), and would be a specific instance of a more general model that extends the existing constructor model:

The existing constructor model allows you to write, for example, Vector(x) for x any of 1:4, Base.OneTo(4), or [1, 2, 3, 4] (to construct the Vector{Int} [1, 2, 3, 4]), or similarly SparseVector(x) (to build the equivalent SparseVector). To the limited degree this presently works broadly, the model is MyArray[{...}](contentspec) where contentspec, for example some other array, iterable, or similar object, defines the resulting array's contents.

The more general extension of this model is MyArray[{...}](contentspec[, modifierspec...]). Roughly, contentspec defines the result's contents, while modifierspec... (if given) provides qualifications, e.g. shape.

What does this look like in practice?

For the most part you would use constructors as you do now, with few exceptions. Let's go through the common construction operations mentioned above:

  1. (Constructing uninitialized arrays.) To build an uninitialized MyArray{T}, where now you write e.g. MyArray{T}(shape...), instead you would write MyArray{T}(blah, shape...). (#24400 explored this possibity for Arrays, and inevitably became a bikeshed of the spelling of blah :).)

  2. (Constructing one array from another.) Constructing one array from another, as in e.g. Vector(x) or SparseVector(x) for x being [1, 2, 3, 4], would work just as before.

  3. (Constructing an array filled from an iterable, or from a similar object defining the array's contents such as I.) What is possible now, for example Vector(x) for x either 1:4 or Base.One(4), would work as before. But where e.g. Array[{T,N}](tuple) now fails or produces an uninitialized array depending on T, N, and tuple, such signatures could work as for any other iterable. And additional possibilities become natural: Constructing Arrays from HasShape generators is one nice example. Another, already on master (#24372), is Matrix[{T}](I, m, n) (alternatively Matrix[{T}](I, (m, n))), which constructs a Matrix[{T}] of shape (m, n) containing the identity, and is equivalent to eye([T, ]m[, n]) with fewer ambiguities.

Great so far. Now what about perhaps the most common operation, i.e. constructing an array uniformly initialized from a value? Under the general model above, this operation should of course roughly be MyArray[{T}](it, shape...) where it is an iterable repeating the desired value. But this incantation should: (a) be fairly short and pleasant to type, lest ad hoc constructors for particular array types and values proliferate to avoid using the general model; and ideally (b) mesh naturally with convenience constructors for Arrays.

Triage came up with two broad spelling possibilities. The first spelling possibility led to...

The second spelling possibility is MyArray(Rep(v), shape...) modulo spelling of Rep(v), where Rep(v) is some convenient alias for Iterators.Repeated(v) with v any desired value. (Another possible spelling of Rep(v) discussed in triage is Fill(v), which dovetails beautifully with the fill convenience constructor for the same purpose specific to Arrays. Independent of the iterator's name, this spelling is a clean generalization of fill from Arrays to arrays generally.) In practice this would look like MyArray(Rep(1), shape...) (instead of MyArray{Int}(ones, shape...)). This spelling possesses some distinct advantages:

Great. With this latter spelling, overall this second proposal appears to satisfy both the broad design objectives and three razors at the top, and avoids the shortcomings of the first proposal.

What else? Convenience constructors

Convenience constructor are an important part of this discussion and about which there is much to consider. But that topic I will leave for another post. Thanks for reading! :)

rfourquet commented 7 years ago

One point concerning rand to take into account is the "value specification", which is often encoded with a value (e.g. 1:10) instead of a type. FWIW, independantly of your effort and other's to solve this problem, I have been thinking on how to solve the more narrow rand's problem (and get rid of sprand!)... which is at the same time not-so-simple on its own as we (at least I) want allow also generating Sets or Dicts (which require specifying the value specification for a Pair...). I didn't plan to speak about it before PR'ing it, but thought I should mention it here.

Sacha0 commented 7 years ago

I much look forward to your thoughts on rand, @rfourquet! :)

Sacha0 commented 7 years ago

Convenience constructors: eye

A brief review of triage's discussions of eye.

Shortcomings

  1. A Matrix{T} is usually a poor representation of the identity operator/matrix: eye(T, m, n) requires O(mn) storage (where in principle constant storage should do). eye(T, m, n) + ... usually involves O(mn) operations (where in principle O(min(m, n)) operations should do). eye(T, m, n) * ... usually involves O(mnk) operations (where in principle either constant or O(nk) operations should do). Base includes better representations of the identity operator/matrix for a variety of purposes, including UniformScalings, Diagonals, and SparseMatrixCSCs; these representations should be preferred, but eye often boxes these representations out in practice.

  2. Where a Matrix{T} is a reasonable representation of the identity operator/matrix, eye provides no way to efficiently construct a scaled identity operator/matrix: One must write, e.g., 2*eye(T, m, n), which involves a temporary and O(mn) unnecessary operations. (Promotion in such expressions can also require a bit of care.)

  3. Whether eye(T, m, n)'s element type should be T's multiplicative identity one(T) or additive generator oneunit(T) depends on context: Interpreting eye(T, m, n) as a concrete representation of the identity operator/matrix for use in multiplication, T's multiplicative identity one(T) seems the right choice. But where eye(T, m, n) appears (frequently) in addition, T's additive generator might instead be the right choice. Standardizing on one or the other is of course possible, but: (1) each choice fails to satisfy some use cases; and (2) choosing one(T), eye(T, m, n)'s result element type cannot be T where T's multiplicative identity one(T) is not of type T.

  4. As in the OP, eye does not generalize well to non-Array array types.

  5. eye is a terrible pun.

Resolution

  1. Following the second proposal in the OP, introduce array constructors that accept a UniformScaling first argument and shape-specifying trailing arguments, for example (Array|Matrix)[{T}](S::UniformScaling, shape...) for Arrays. Then Matrix(2I, m, n), for example, serves as a direct replacement for 2*eye(Int, m, n). Such constructors avoid shortcomings 2-5.

  2. Deprecate eye in favor of UniformScalings and UniformScaling-accepting constructors (hopefully addressing shortcoming 1 by encouraging better representation choices / making the usually poor Matrix{T} choice less attractive).

Best!

Sacha0 commented 7 years ago

Convenience constructors: ones, zeros, and fill

ones, zeros, and fill each construct Arrays filled with some value. This post consolidates/reviews/analyzes discussion of this group of convenience constructors from github, slack, and triage.

Shortcomings of ones and zeros

  1. ones element type ambiguity: As JuliaLang/LinearAlgebra.jl#484 highlights, whether ones(T, shape...) returns an Array filled with T's multiplicative identity (one(T)) or additive generator (oneunit(T)) is ambiguous. Standardizing on one or the other is of course possible, but: (1) as JuliaLang/LinearAlgebra.jl#484 demonstrates, each choice violates some subset of users' expectations and fails to satisfy some use cases; and (2) choosing one(T), one(T, shape...)'s element type cannot be T where T's multiplicative identity one(T) is not of type T. Introducing a oneunits(T, shape...) function (that constructs Arrays of oneunit(T)s) and standardizing ones(T, shape...) to construct Arrays of one(T)s addresses point (1) but not (2), and the proliferation of (value-)names strongly indicates a missing abstraction.

  2. zeros element type ambiguity: Prior to JuliaLang/julia#16116, whether one(T) returned a multiplicative identity or additive generator for T was ambiguous. JuliaLang/julia#20268 resolved this ambiguity by introducing oneunit(T) as the additive generator for T and affirming one(T) as a multiplicative identity. zero suffers from a similar issue, though likely less important in practice: Is zero(T) the additive identity or a sort of multiplicative zero for T? To illustrate, is 3meters * zero(1meters) 0meters^2 or 0meters? Consequently, even with disambiguation of zero, zeros suffers from an ambiguity analogous to that described above for ones: Does zeros(T, shape...) return an Array filled with T's additive identity or instead some sort of multiplicative zero? As with ones, standardizing on one or the other is of course possible, but: (1) each choice will violate some subset of users' expectations and fail to satisfy some use cases; and (2) choosing a multiplicative zero for T, zeros(T, shape...)'s element type cannot be T where that multiplicative zero for T is not of type T. Bifurcating zeros into two functions, one for each choice, addresses point (1) but not (2), and the proliferation of (value-)names strongly indicates a missing abstraction.

  3. Handling values without an associated function: How do you construct an Array filled with 2s, or 1.0 + ims, or in general any value other than a zero or one? If you are used to ones/zeros, perhaps you respectively call 2ones(Int, (m, n)) and complex.(ones(m, n), ones(m, n)). (These examples are taken from base, as noted in https://github.com/JuliaLang/LinearAlgebra.jl/issues/484.) Or to avoid the temporary, perhaps you respectively call fill!([ones|zeros](Int, (m, n)), 2) and fill!([ones|zeros](Complex{Float64}, (m, n)), 1.0 + im). But such incantations are less pleasant than ones and zeros, so perhaps you give your common values names: twos(Int, (m, n)) etc. Overall, antipatterns emerge and ad hoc functions proliferate. As demonstrated in https://github.com/JuliaLang/LinearAlgebra.jl/issues/484 and discussed elsewhere, this issue bears out in practice and is widespread: more than half of ones calls in the wild are ad hoc fills.

  4. Simultaneous generalization pressure and poor generalizability: As the OP lays out, ones/zeros do not generalize well to other array types. But ones/zeros's considerable mindshare from Arrays creates pressure for generalization to other array types, in practice yielding the poor de facto generalization described at length in the OP (bones/bzeros, dones/dzeros, spones/spzeros, etc).

  5. Other questionable use of ones: Likely by nature of its considerable mindshare and being the shortest Array constructor, ones sees use where it shouldn't (beyond simple ad hoc fills). A few examples from base: dc = d + im*convert(Vector{elty}, ones(n)) instead of e.g. dc = d .+ elty(1)im or dc = d .+ Complex{elty}(im) (ad hoc convert and broadcast). https://github.com/JuliaLang/julia/blob/8085047696e2bde1d36158d243a23451285f6dbc/test/linalg/diagonal.jl#L254 (1:size(A,1)).*ones(Int,size(A,2))' instead of repmat(1:size(A,1), 1, size(A,2)) (ad hoc repmat/repeat). https://github.com/JuliaLang/julia/blob/8085047696e2bde1d36158d243a23451285f6dbc/test/arrayops.jl#L1857 isequal(Array(sparse(complex.(ones(5,5), ones(5,5)))), complex.(ones(5,5), ones(5,5))) where introducing a name is better as e.g. in A = fill(1.0+im, 5, 5); isequal(Array(sparse(A)), A). https://github.com/JuliaLang/julia/blob/8085047696e2bde1d36158d243a23451285f6dbc/test/sparse/sparse.jl#L27 ones(2,3) * ones(2,3)' instead of fill(3., 2, 2) (perhaps an "unusual ad hoc fill"). https://github.com/JuliaLang/julia/blob/8085047696e2bde1d36158d243a23451285f6dbc/test/arrayops.jl#L1204

  6. Questionable use of zeros: While zeros certainly has legitimate uses, as @StefanKarpinski argues in https://github.com/JuliaLang/LinearAlgebra.jl/issues/484, due to its mindshare and brevity zeros sees use where something else would serve better.

  1. Non-orthogonality / overlap with ...

fill

Subsuming ones and zeros, and obviating the antipatterns and ad hoc methods described in shortcoming 3, fill is the missing abstraction mentioned explicitly in shortcomings 1-2 and implicitly in shortcoming 3. Moreover, fill possesses some distinct advantages over ones and zeros:

fill's primary perceived downside is length. Though for constructing Arrays filled with Float64(0) or Float64(1) fill is marginally longer than ones/zeros (e.g. fill(1., shape...) versus ones(shape...)), for other types admitting literals fill is usually shorter (e.g. fill(1f0, shape...) versus ones(Float32, shape...)). For types not admitting literals, fill and zeros/ones are comparable in length (e.g. fill(Float16(0), shape...) versus zeros(Float16, shape...)), with a slight edge to zeros/ones. But the edge of course goes to fill for values other than a zero or one.

At present ones/zeros seem to box fill out of mindshare in the wild.

Resolution Options

  1. Better specify ones/zeros. See challenges/pitfalls described above in shortcomings 1-2. Fails to address any of shortcomings 3-7.

  2. Move ones/zeros to MATLAB compat. Addresses shortcomings 1-4 and 7. Addresses shortcomings 5-6 insofar as those shortcomings do not or only partially shift to fill.

  3. Deprecate ones/zeros methods other than ones(shape...) and zeros(shape...). Reasoning: For Float64, the additive generator and multiplicative identity coincide, and the additive identity and multiplicative zero also coincide. So these methods do not suffer from shortcomings 1-2. Though shortcomings 3-7 remain, by reducing ones/zeros's scope: (a) awareness and use of fill may improve, perhaps mitigating shortcoming 3; (b) ones/zeros generalization pressure may decrease, perhaps mitigating shortcoming 4; and (c) ones/zeros may become less attractive for questionable use, perhaps mitigating shortcomings 5-6.

What about methods of ones/zeros accepting an array (instance, not type) first argument?

Triage (uniformly?) found these methods questionable. And the analysis in https://github.com/JuliaLang/LinearAlgebra.jl/issues/484 and similar analysis of base suggests these methods rarely appear in the wild. So independent of what happens with the more common ones/zeros methods, triage favors deprecating these methods.

Thanks for reading! :)

andyferris commented 7 years ago

First, thank you very much @Sacha0 for writing up such complete and well considered notes here - extremely useful, as it is warranted by the complexity.

Overall, I agree with the thrust and the razors and so-on (very handy set-up in the OP). I have a couple of orthogonal thoughts so I'll split them up into separate posts:

First, regarding convenience functions like ones and zeros and rand and fill:

rfourquet commented 7 years ago

what about rand(1:10, size...) -> fill(Rand(1:10), size...)

But this would create an array containing (aliases of) only one Rand(1:10) object, unless fill handles Rand specially, which I would dislike. I have often wanted an equivalent of C++'s std::generate, which fills a container by invoking a function, and started implementing a similar functionality in julia, until I realized that a mutating version of that can be achieved by broadcast (a=zeros(10); a .= rand.()), and a non-mutating version via Generator or list comprehensions: [rand(1:10) for x=..., y=...]. Unfortunately the latter is less performant (than rand(1:10, size...)), because it doesn't factor out a costly computation which happens within rand(1:10), so having an Iterators.randstream([rgn], [dims]) would make sense. But it's not immediately trivial to supress the rand(1:10, size...) API while retaining performance.

andyferris commented 7 years ago

It seems to me that some of the issues facing array constructors are also shared by similar, which I have been thinking about an awful lot lately. They share an ambiguity where the size might be confused for the keys or values of the output container.

I'll start with the signature similar(array, T, size...). It seems to me that size... is actually a placeholder for the indices and keys of the output array. For an offset vector, you would probably be best to specify the indices rather than just the size - such as similar(array, T, 0:(n-1)) for a zero-based vector of n elements. Thus I would assert that similar(array, T, 3) is just shorthand for similar(array, T, Base.OneTo(3)). For multidimensional arrays perhaps the third argument is just the keys of the array and similar(array, T, indices...) is shorthand for similar(array, T, cartesian_product(indices...)) where cartesian_product returns an AbstractCartesianRange of the indices.... (Going even further than this, I had been suggesting a version of similar for dictionaries such as similar(olddict, ValueType, new_keys) that returns a container with junk values and given keys).

I would also argue that a generic interface for array constructors should be accepting their keys - with indices... and size... simply as shorthand. (Similarly for fill and zeros and rand...) The advantages are that the expected constructor pattern will follow through to offset arrays (and potentially even to associatives). In fact, I'd probably make it one of your razors that the design should naturally work for offset arrays.

andyferris commented 7 years ago

I'm not sure if this is the best forum for brainstorming, but to me when a function argument is ambiguous then I would consider giving it a keyword argument (which hopefully will be type stable soon!). Assuming we want to deal with offset arrays and indices and keys here somewhere, compare the signatures in this gist.

Sacha0 commented 7 years ago

Much thanks for reading and contributing your thoughts! :)

Regarding generalization of rand/randn/randexp

The working plan is to follow the OP's second proposal, i.e. MyArray(randstream, otherspecs...) for randstream any random stream. (The above independent convergence on this idea is happily confidence inspiring :).)

Regarding keyword arguments

Usually being clearer, more explicit, and less ambiguous, keyword arguments are fantastic. I would be delighted to see the OP's second proposal support keyword argument signatures, particularly where positional arguments do not scale well or are ambiguous.

In those simple cases where positional arguments are clear / unambiguous, positional arguments' brevity makes them pleasant to work with (particularly when, for example, hacking at the REPL). The OP's third razor strongly suggests their inclusion in those cases.

Wonderfully, positional and keyword arguments need not be in tension! We can have the best of both worlds by supporting one, the other, or both as appropriate and to best effect. And fortunately we can expand the set of supported signatures at any time, for example in the 1.x series. So working through the details of when/where to support positional and/or keyword arguments is not pressing.

Regarding the nature of trailing arguments to array constructors

Concerning what precisely the trailing arguments to array constructors should be (that is, what otherspecs... in the OP's second proposal's MyArray[{...}](contentspec[, otherspecs...]) model should be), I intentionally left otherspecs... somewhat ambiguous and thus flexible. Working out what otherspecs... might be in general will require substantial time and experimentation I wager. But fortunately, for reasons analogous to those given at the end of the preceding subsection, working out the details on this front is not pressing.

Regarding similar, allocate, storage traits, and related topics

Overhauling similar, introducing allocate, exploring storage traits, and related proposals are important topics warranting extensive consideration and discussion. Such proposals have a long and broad design and implementation horizon. And as much as I look forward to that discussion and its potential fruits, I would like to keep this thread tightly focused on array constructors in the narrow sense, and particularly on associated near-term considerations. Discussion of these other topics perhaps would be best consolidated in JuliaLang/julia#18161.

This thread's focus

With time before 1.0 being scarce, I advocate we keep this thread tightly focused on what must happen by 1.0 to improve / enable future improvement of the design of array constructors in a narrow sense. So far, the relevant consensus action items seem to be:

Edit: More tasks:

Other consensus items? Thanks and best!

dlfivefifty commented 7 years ago

Ref https://github.com/JuliaArrays/FillArrays.jl which was created very recently to allow for constructors like

BandedMatrix(Zeros(5,5), 1,1)
Sacha0 commented 7 years ago

Cheers, that approach is quite similar to the OP's second proposal :). For the reasons given in that proposal, expressing e.g. BandedMatrix(Zeros(5,5), 1,1) as e.g. BandedMatrix(Rep(0), 1,1) or BandedMatrix(Fill(0), 1,1) (for Rep/Fill proposed shorthands for Iterators.Repeated) would be advantageous. Looks like now is the time to select and introduce one or the other shorthand? :) Best!

dlfivefifty commented 7 years ago

For completeness, FillArrays supports both Fill and Zeros:

Matrix(Fill(0.0, 5, 5))
Matrix(Zeros(5, 5))

The reason there are two types is Zeros is special: we know we can always convert a Zeros matrix to a sparse matrix without destroying sparsity, where Fill would default to a dense matrix.

It is important that this happens at compile time not run time: for example, imagine BandedMatrix on a GPU: one would not want to have to add a check F.x == 0 as that would need to be run on the CPU, where in theory GPUBandedMatrix(Zeros(5,5),3,3)) could be done completely on the GPU as determined at compile time.

dlfivefifty commented 7 years ago

PS: I think it would have to be BandedMatrix(Rep(0), (5,5), (1,1)) to indicate the size of the matrix.

Sacha0 commented 7 years ago

Cheers, with those expansions I think I follow now: The latter arguments in BandedMatrix(Zeros(5,5), 1,1) and BandedMatrix(Rep(0), (5,5), (1,1)) specify bandwidth rather than shape? If so, this approach fits well with the OP's second proposal (the Zeros(5,5)/Ones(5,5) being lazy, HasShape representations of the intended contents, and the trailing (1,1) argument being additional specs). Though to avoid eltype ambiguities, requiring eltype specification as in Zeros{T}(5,5)/Ones{T}(5,5) seems necessary? Additionally, constant propagation might be able to handle the compile- versus run-time consideration with Rep, and if not then some explicitly compile-time equivalent of Rep (think along the lines of Val) would do the trick. Best!

dlfivefifty commented 7 years ago

The latter arguments in BandedMatrix(Zeros(5,5), 1,1) and BandedMatrix(Rep(0), (5,5), (1,1)) specify bandwidth rather than shape.

Yes.

Though to avoid eltype ambiguities, requiring eltype specification as in Zeros{T}(5,5)/Ones{T}(5,5) seems necessary?

I followed zeros and default to Zeros{Float64}(5,5). Though this is debatable whether it is ideal behaviour.

Additionally, constant propagation might be able to handle the compile- versus run-time consideration with Rep.

That sounds fine, though it can't be like Val{x} for non-constant x as Fill(x, n, m) should not recompile for different values of x.

Sacha0 commented 7 years ago

That sounds fine, though it can't be like Val{x} for non-constant x as Fill(x, n, m) should not recompile for different values of x.

Could you expand a bit? I was thinking along the lines of BandedMatrix(ExplicitCompileTimeRep(0), shape, bandwidths), with ExplicitCompileTimeRep(0) lifting 0 to the type domain such that you can write, e.g., a BandedMatrix(::ExplicitCompleTimeRep{0}, shape, bandwidths) = ... method (as I imagine you would similarly handle the Zeros{Float64}(0, 0) approach). Best!

dlfivefifty commented 7 years ago

Sure, that's fine if there's both Rep and ExplicitCompileTimeRep. But you could already do that with a ExplicitCompileTimeRep{0} syntax.

There is still a role for FillArrays, which have dimensions and are AbstractArray.

dlfivefifty commented 7 years ago

A reason to have dimensioned Fill(x,n) (in addition to dimensionless Rep(x)) is to support allocation free concatenation:

[1; Zeros(100)]

This should work for free already since Fill <: AbstractArray .

Sacha0 commented 7 years ago

A reason to have dimensioned Fill(x,n) (in addition to dimensionless Rep(x)) is to support allocation free concatenation

Agreed, particularly given the following note from https://github.com/JuliaLang/julia/issues/24595#issuecomment-345416384 :) :

An observation related to [...] the analysis in https://github.com/JuliaLang/LinearAlgebra.jl/issues/484: Of the ~30% of ones calls in the wild that are reasonably semantically ones, an appreciable fraction of such calls appear in concatenations, where some lazy equivalent would be better. A similar statement holds for zeros.

Taking a step back, in BandedMatrix(Rep(v), (5, 5), (1, 1)), why would lifting v to the type domain be necessary? Independent of v you construct a 5x5 BandedMatrix{typeof(v)} with (1,1) bandwidths, and populate the stored entries of that BandedMatrix with v. No branching on (value) v is necessary? Best!

dlfivefifty commented 7 years ago

Hmm, maybe you are right: I was thinking BandedMatrix(Rep(1), (n,m), (l,u)) and BandedMatrix(Fill(1,n,m), (l,u)) should be errors since it can’t convert a dense matrix to a banded one.

Then I remembered that Diagonal, LowerTriangular and Symmetric just ignore the offending values.

Sacha0 commented 7 years ago

To clarify my comment above, for BandedMatrix(Fill(1, shape), bandwidths) either behavior you mention strikes me as reasonable, and for the stricter choice I agree that lifting the fill value into the type domain is useful. Rather, I have been thinking about BandedMatrix(Rep(1), shape, bandwidths) as conceptually distinct from BandedMatrix(Fill(1, shape), bandwidths): For arbitrary shapeless iterator iter, I interpret StructuredMatrixType(iter, shape, bandwidths) as "build a StructuredMatrixType with shape shape and bandwidths bandwidths, populating its mutable slots from iter".

This discussion perhaps suggests that having conceptually distinct Rep and Fill could be great, the former with the immediately preceding interpretation in constructors and the latter interpreted as a general lazy equivalent of fill. Thoughts? Thanks for the great discussion @dlfivefifty! :)

dlfivefifty commented 7 years ago

We have these three versions:

MyMatrix(A, pars)
MyMatrix(I, dims, pars)
MyMatrix(Rep(x), dims, pars)

I think for the first constructor, where A is an AbstractMatrix, this should always be a projection: that is, if A has the same entries as some MyMatrix with parameters pars then MyMatrix(A, pars) should return a MyMatrix with the same entries as A.

I then think there should be consistency between MyMatrix(I, dims, pars) and MyMatrix(eye(dims...), pars), and similarly between fill and Rep .

Sacha0 commented 7 years ago

I think for the first constructor, where A is an AbstractMatrix, this should always be a projection: that is, if A has the same entries as some MyMatrix with parameters pars then MyMatrix(A, pars) should return a MyMatrix with the same entries as A.

Agreed, that seems reasonable.

I then think there should be consistency between MyMatrix(I, dims, pars) and MyMatrix(eye(dims...), pars), and similarly between fill and Rep .

The former seems reasonable, but how do you conclude the latter? Rep(v) is not an AbstractMatrix; Fill(v, shape...) would be. Best!

fredrikekre commented 6 years ago

trues and falses should really be deprecated in favor of BitArray constructors. falses((m,n)) -> BitMatrix(false, (m,n)) isn't so bad, and it is much more clear what is going on, and what you will get back.

Until recently I assumed that falses/trues would give me an Array{Bool}, which is the logical extrapolation from learning what the zeros and ones functions does. To be fair I have never used falses/trues but I think this illustrates the problem with convenience constructors. It is very confusing when different value specific fill methods (ones, zeros, trues, falses) return different things.

fredrikekre commented 6 years ago

Some more morning coffee thoughts: Is there anything that hinders something like this: Array(x, (n...,))/Array{T}(x, (n...,)) which results in a n₁ × n₂ ... Array{typeof(x)}/Array{T} filled with x. It avoids the proposed (somewhat ugly) Rep(x) and should work for immutable x IIUC. Something like Rep(x) could then be used essentially to say "I want a copy of x at each entry of the array".

If this is possible, then we might as well just deprecate fill(x, (n...,)) in favor of Matrix(x, (n...,)) at which point we have reached array constructor paradise.

andyferris commented 6 years ago

I like this. I was just going to suggest we generalise the constructors to a broadcast rule, so that Array{T}(in, size) makes a size-sized Array{T} called out followed by out .= in.

And size could default to size(in) if in has a size and (length(in),) if in has no size but has a length. If it has neither than it needs to be specified.

This seems like a constructor pattern/semantic that works for many inputs (including Generators or simply making a copy of another array) and many types of arrays (including immutable arrays and StaticArrays). (We can generalize the size to índices or keys also).

Ticks a lot of boxes that I’ve been searching for!

dlfivefifty commented 6 years ago

I think it's a bad idea to have it depend on whether x has a size because then it is semantically inconsistent whether one wants to convert x to be an Array{eltype(x)} vs wants to create an Array{typeof(x)} filled with x. I would suggest the following behaviour:

A = rand(3,3)  # or a MyArray
x = 1
Array(A)      # Returns an `Matrix{Float64}` whose entries are specified by `A[k,j]`
Array{T}(A) # Returns an `Array{T}` whose entries are specified by `convert(T, A[k,j])` 
Array(x)      # Errors
Array{T}(x) # Returns a `Vector{T}` of length `x`
Array(A, (n,)) # Returns a `Vector{Matrix{Float64}}` filled with `A`
Array(x, (n,)) # Returns a `Vector{Int}` filled with `A`
Array(x, n)    # Errors
Array{T}(x, n)    # Returns a `x` by `n` `Matrix{T}`

So the only point of possible confusion is Array{T}(::AbstractArray) vs Array{T}(::Int).

carlobaldassi commented 6 years ago

trues and falses should really be deprecated in favor of BitArray constructors.

Probably unsurprisingly, I'm quite strongly against this. There really is no point to doing that. The argument that it is unclear what they return is not tenable, it's clearly documented and if you enter them in the REPL the container type will be spelled out. There's no ambiguity and they pose no issues. Also, BitArrays behave like Array{Bool}s, except for performance, so even if you assumed that trues/falses return an Array{Bool} it's not like it would be particularly troublesome. Also, I use these constructors all the time.

fredrikekre commented 6 years ago

The argument that it is unclear what they return is not tenable, it's clearly documented and if you enter them in the REPL the container type will be spelled out.

That's not the point, I just meant to illustrate that it is confusing that some value-specific fill-methods return Array and some BitArray.

There's no ambiguity and they pose no issues.

IMO they suffer from the same problems as zeros and ones. What if I really want an Array{Bool} or a MyArray{Bool}? Then you have to use something else anyway since falses/trues don't generalize to other container types. Right now the best way to get an Array{Bool} is to use something like fill(true, (n,n)), great, thats not so bad, but that doesn't generalize to MyArray. I just think the symmetry of

BitArray(true, (n,m))
Array(true, (n,m))
MyArray(true, (n,m))

is way to elegant and generalizes great to other array types. Compare this to the current state

trues((n,m))
fill(true, (n,m))
MyArray(???)

which is not very nice. What to do in the last case? Perhaps something like MyArray(trues((n,m))) but perhaps MyArray does not need to allocate the full BitMatrix that this generates, then you have to come up with your own value-specific filler method just for your array type (compare for instance the sp- mess, where spzeros is needed since something like SparseArray(zeros(n,n)) would be a huge waste of memory or even be infeasible in most cases involving sparse matrices).

carlobaldassi commented 6 years ago

IMO they suffer from the same problems as zeros and ones.

These two will not entirely be deprecated, the basic forms will be kept and they will construct Array{Float64}s, because it's a common case and they're convenient, and they are not ambiguous in that case. It's not written anywhere that convenience constructor must always be generalizable to arbitrary cases; there are special and frequently used cases for which it's just nice to have a shorthand. The argument that you could write things in a more "elegant" or "nice" way (in the sense of syntactic consistency) doesn't mean that you should necessarily delete the shorthands. (By the way, I'm not exceedingly thrilled by fill either; that's not the point though.)

JeffBezanson commented 6 years ago

I do like the idea of tying this to the behavior of .=. My main worry is that some strange things lurk in there, and that using iterators is much simpler. For example it seems hard to predict what will happen when making an Any array.

andyferris commented 6 years ago

For example it seems hard to predict what will happen when making an Any array.

@JeffBezanson I assume this refers to Array(in, size) and not Array{Any}(in, size)? (I guess we write this as <:Any vs Any?) For the former, we could try eltype(in) for the element type - would that be tricky with Generators?

I think it's a bad idea to have it depend on whether x has a size because then it is semantically inconsistent whether one wants to convert x to be an Array{eltype(x)} vs wants to create an Array{typeof(x)} filled with x.

@dlfivefifty Yes, I think it's important to get this right.

The main ambiguous case your raise should be when x is an AbstractArray, in which case I'd say the operation is a copy (like the equivalent convert except never lazy). Just like for broadcast, you'd want to use something like Scalar (from StaticArrays.jl - it's a single element, zero-dimensional array) to indicate when you want the array x broadcasted to each output element. (Like Vector(Scalar([1,2,3]), 100) makes a length-100 vector filled with [1,2,3]).

Another ambiguity is whether x is a size or the object to fill the output with. We have Vector{T}(3) creating a length-3 vector, but 3 has length of 1 so the result could also be [3]. Having a special rule when the "fill value" is an integer (or tuple of integers) seems unfortunate.

I have yet to see a good solution for a wide range of arrays including constructing arrays who's elements can't be mutated after construction, so I feel it is worth exploring if these ambiguities can be overcome with some semantically clear and simple rules.

andyferris commented 6 years ago

Another ambiguity is whether x is a size or the object to fill the output with. We have Vector{T}(3) creating a length-3 vector, but 3 has length of 1 so the result could also be [3]. Having a special rule when the "fill value" is an integer (or tuple of integers) seems unfortunate.

Actually deprecating Vector{T}(3) to create a length-3 vector is a goal of this issue (https://github.com/JuliaLang/julia/issues/24595#issuecomment-345468996), so scratch this I guess.

fredrikekre commented 6 years ago

Another ambiguity is whether x is a size or the object to fill the output with. We have Vector{T}(3) creating a length-3 vector, but 3 has length of 1 so the result could also be [3]. Having a special rule when the "fill value" is an integer (or tuple of integers) seems unfortunate.

This shouldn't be ambiguous anymore since we require uninitialized to obtain an uninitialized array. So AFAICT we can always consider the first element as the filler?

andyferris commented 6 years ago

Yep just realized :)

dlfivefifty commented 6 years ago

Like Vector(Scalar([1,2,3]), 100) makes a length-100 vector

I think it's never a good design where the type of the data indicates behaviour, because one ends up with code like Vector(x, n) where it's not always clear what type x is.

In the case of Vector([1,2,3], 100), there is no ambiguity: one doesn't specify a dimension when asking for a copy, hence the only way it makes sense is as a vector filled with [1,2,3] 100 times.

With Matrix{Float64}(::Int, ::Int) deprecated in favour of Matrix{Float64}(uninitialized, ::Int, ::Int), that makes things a lot more consistent (in fact, if the first argument is a fill argument, it makes a lot of sense for "filling" with unitialized to be the uninitialized matrix). So here is my updated suggestion:

A = rand(3,3)  # or a MyArray
x = 1
Array(A)      # Returns an `Matrix{Float64} == Array{typeof(x), ndims(x)}` whose entries are specified by `A[k,j]`
Array{T}(A) # Returns an `Matrix{T}  == Array{T, ndims(x)}` whose entries are specified by `convert(T, A[k,j])` 
Array(x)      # Returns an `Array{Int,0} == Array{typeof(x), ndims(x)}` with single entry `x`
Array{T}(x)    # Returns an `Array{T,0} == Array{T, ndims(x)}` with single entry `convert(T, x)`
Array{T}(unitialized, x)   # Returns a `Vector{T}` of length x 
Array(A, (n,))    # Returns a `Vector{Matrix{Float64}}` filled with `A` `n` times
Array(x, (n,))   # Returns a `Vector{Int}` filled with `x` `n` times
Array(x, n)   # Returns a `Vector{Int}` filled with `x` `n` times
Array{T}(unitialized, x, n)    # Returns an `x` by `n` `Matrix{T}`
Sacha0 commented 6 years ago

Thanks for the great discussion all!

@dlfivefifty, these behaviors

A = rand(3,3)  # or a MyArray
x = 1
Array(A)      # Returns an `Matrix{Float64} == Array{typeof(x), ndims(x)}` whose entries are specified by `A[k,j]`
Array{T}(A) # Returns an `Matrix{T}  == Array{T, ndims(x)}` whose entries are specified by `convert(T, A[k,j])` 
Array(x)      # Returns an `Array{Int,0} == Array{typeof(x), ndims(x)}` with single entry `x`
Array{T}(x)    # Returns an `Array{T,0} == Array{T, ndims(x)}` with single entry `convert(T, x)`
Array{T}(unitialized, x)   # Returns a `Vector{T}` of length x
...
Array{T}(unitialized, x, n)    # Returns an `x` by `n` `Matrix{T}`

are consistent with the general idea of interpreting the first argument as an iterable from which to populate the result (the working proposal). The following behaviors

Array(A, (n,))    # Returns a `Vector{Matrix{Float64}}` filled with `A` `n` times
Array(x, (n,))   # Returns a `Vector{Int}` filled with `x` `n` times
Array(x, n)   # Returns a `Vector{Int}` filled with `x` `n` times

are a bit trickier. Interpreting the first argument as an iterable from which to populate the result, Array(A, (n,)) should yield a Vector{eltype(A)} of length n populated from iterating A (which makes sense where n <= length(A)), and Array(x, (n,))/Array(x, n) should yield a Vector{eltype(x)} of length n populated from iterating x (which makes sense where n <= length(x) == 1). Best!

dlfivefifty commented 6 years ago

Hmm, I think I agree. If keywords become fast, how about

Array(n; fill=x) # returns a length `n` `Vector{typeof(x)}` with entries `x`
dlfivefifty commented 6 years ago

I've noticed the following deprecation:

zeros(a::AbstractArray) is deprecated, use fill!(similar(a), 0) instead

I don't think the replacement for zeros(a) is compatible with working with other array types. In an ApproxFun.jl branch, I went with Array(Zeros, A) and BandedMatrix(Zeros, A), with the latter automatically getting its size and bandwidths from A whenever A implements the banded matrix interface. But I'm not sure that's a particularly good solution.

Perhaps if A::AbstractArray we could deprecate zeros(A) with

Array(A; fill=0)

Then BandedMatrix(A; fill=0) would get both its size and bandwidths from A.

Sacha0 commented 6 years ago

@dlfivefifty, I would expect similar(A::BandedMatrix) to return a BandedMatrix similar to A in all respects but for data, including shape and bandwidths. The banded matrix types in Base (namely Diagonal, Bidiagonal, Tridiagonal, and SymTridiagonal) behave in this manner, and with that behavior fill!(similar(A), 0) serves as a direct replacement for zeros(A::AbstractArray) on these types:

julia> fill!(similar(Tridiagonal([1,2,3], [1,2,3,4], [1,2,3])), 0)
4×4 Tridiagonal{Int64,Array{Int64,1}}:
 0  0  ⋅  ⋅
 0  0  0  ⋅
 ⋅  0  0  0
 ⋅  ⋅  0  0

How does similar(A::BandedMatrix) behave?

Alternative rewrites for zeros(A::AbstractArray) exist and may serve better in various situations, for example zero(A), fill!(copy(A), 0), or (forthcoming) fillstored!([copy|similar](A), 0). zeros(A::AbstractArray) rewrite questions having popped up a couple times now (https://github.com/JuliaLang/LinearAlgebra.jl/issues/484, https://github.com/JuliaLang/julia/pull/24656), so I have made a note to expand the deprecation warnings with additional rewrites and explanation.

Note that this discussion highlights the ambiguity in zeros(A::AbstractArray), and that zeros(A::AbstractArray, ...) = fill!(similar(A, ...), 0) was how this function was defined in Base :). Thanks and best!

dlfivefifty commented 6 years ago

Sorry, let me clarify: sometimes when I call zeros(A::BandedMatrix) I’m actually after a Matrix of the same dimensions as A. But sometimes I want a BandedMatrix with the same dimensions and bandwidths as A. fill!(similar(A), 0) only gives me the latter.

Sacha0 commented 6 years ago

Ah, I see :). fill!(Matrix(A), 0) for the latter operation? Best!

dlfivefifty commented 6 years ago

No because that copies all the entries first

Sacha0 commented 6 years ago

You mean fill!(Matrix{eltype(A)}(size(A)), 0) (on stable, and with uninitialized in the Matrix constructor on master)?

dlfivefifty commented 6 years ago

That's fine for Matrix, though very long, but not for BandedMatrix where one would need to test if A satisfies the banded matrix interface (isbanded(A)) to decide whether to call `bandwidth or not.

And in ApproxFun I'm doing code like

for TYP in (:Matrix, :BandedMatrix, :RaggedMatrix, :BlockMatrix, :BandedBlockBandedMatrix, :BlockBandedMatrix, :AlmostBandedMatrix)
    @eval function convert(::Type{$TYP}, A::YetAnotherType)
          ret = $TYP(Zeros, A)
          # populate ret
          ret
    end
end

I see no way to do this general pattern where each matrix type has different parameters using something like fill!(Matrix{eltype(A)}(size(A)), 0).

Sacha0 commented 6 years ago

Thanks for the clarifying code! I think I see what you are after now; I touched on an extension to the OP's second proposal in https://github.com/JuliaLang/julia/issues/11557#issuecomment-341512515 that is quite similar to what you are looking for and I agree could be lovely and useful. These ideas also head into JuliaLang/julia#18161 territory. Note that the operations you are after were not covered by zeros(A::AbstractArray)'s contract. Best!

mschauer commented 6 years ago

In https://github.com/JuliaLang/julia/issues/16029#issuecomment-342577866, @Nalimilan proposed to capture the cases where collect(X) could possibly return something else than an Array depending on X by AbstractArray(X), which is understood to collect to the most appropriate mutable container type (with Array as a fallback).

There is a similar subproposal above for e.g. SomeArray(Rep(NaN), A). The difference is that SomeArray(...) collects to SomeArrays and AbstractArray(...) collects to whatever it can infer from the argument or Iterator-traits. A possible use case is

AbstractArray(NaN for a in A)

which can return a MVector filled with NaNs (e.g.) when A is a static vector and this is known via a trait.

The introduction of AbstractArray(...) with this meaning allows to recover the generality which is lost by replacing collect with Vector as non-breaking change even later on, so one can proceed with JuliaLang/julia#16029 .

Sacha0 commented 6 years ago

There is a similar subproposal above for e.g. SomeArray(Rep(NaN), A). The difference is that SomeArray(...) collects to SomeArrays and AbstractArray(...) collects to whatever it can infer from the argument or Iterator-traits.

Please note that AbstractArray(Rep(NaN), A) in that subproposal provides precisely the behavior you seek in AbstractArray(NaN for a in A) :).

mschauer commented 6 years ago

They go together, like Array(it, shape...) and Array(it) have both their place.

RalphAS commented 6 years ago

A recent discussion on discourse shows that many in the user community want a convenience constructor for the uninitialized case. This is qualitatively different from ones() etc., and the third razor in the OP seems to imply that it should be done, properly, in Base. Is there some reason it couldn't simply be a callable instance?

(uninitialized)(T::DataType,dims...)=Array{T}(uninitialized,dims)