JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.75k stars 5.49k forks source link

design of array constructors #24595

Open Sacha0 opened 7 years ago

Sacha0 commented 7 years ago

Intro

This post is an attempt to consolidate/review/analyze the several ongoing, disjoint conversations on array construction. The common heart of these discussions is that the existing patchwork of array construction methods leaves something to be desired. A more coherent model that is pleasant to use, adequately general and flexible, largely consistent both within and across array types, and consists of orthogonal, composable parts would be fantastic.

Some particular design objectives / razors might include:

(*Common operations include constructing: (1) an array uniformly initialized from a value; (2) an array filled from an iterable, or from a similar object defining the array's contents such as I; (3) one array from another; and (4) an uninitialized array.)

Via a tour of the relevant issues and with the above in mind, let's explore the design space.

Tour of the issues

Let's start with...

The prevailing idea for fixing the preceding issue is to: (1) deprecate uninitialized-array constructors that accept (solely) a tuple or series of integer arguments as shape, removing the method signature collision; and (2) replace those uninitialized-array constructors with something else. Two broad replacement proposals exist:

  1. Introduce a new generic function, say (*modulo spelling) blah(T, shape...) for type T and tuple or series of integers shape, that returns an uninitialized Array with element type T and shape shape. This approach is an extension of the existing collection of Array convenience constructors inherited from other languages including ones, zeros, eye, rand, and randn.

(* Please note that blah is merely a short placeholder for whatever name comes out of the relevant ongoing bikeshed. The eventual name is not important here :).)

  1. Introduce Array{T}(blah, shape...) constructors where blah signals that the caller does not care what the return's contents are. These constructors would be specific instances of a more general model that extends and unifies the existing constructor model. That more general model is discussed further below.

The first proposal

The first proposal leads us to...

The de facto approach is introduction of ad hoc perturbations on these function names for each new array type: Devise an obscure prefix associated with your array type, and introduce *ones, *zeros, *eye, *rand, *randn, and hypothetically *blah functions with * your prefix. This approach fails all three razors above: Failing the first razor, to construct an instance of an array type that follows this approach, you have to discover that the array type takes this approach, figure out the associated prefix, and then hope the methods you find do what you expect. Failing the second razor, when you encounter the unfamiliar bones function in code, you might guess that function either carries out spooky divination rituals, or constructs a b full of ones (whatever b refers to). Along similar lines, does spones populate all entries in a sparse matrix with ones, or only some set of stored/nonzero entries (and if so which)? Failing the third razor, the very nature of this approach is proliferation of ad hoc convenience functions and is itself an antipattern. On the other hand, this approach's upside is that it sometimes involves a bit less typing (though often also not, see below). Nonetheless, this approach is fraught.

So what's the other approach? #11557 started off by discussing that other approach: ones, zeros, eye, rand, and randn typically accept a result element type as either first or second argument, for example ones(Int, (3, 3)) and rand(MersenneTwister(), Int, (3, 3)). That argument could instead be an array type, for example ones(MyArray{Int}, (3, 3)) and rand(MersenneTwister(), MyArray{Int}, (3, 3)). This approach is enormously better than the last: It could mostly pass the first and second razors above. But it nonetheless fails the third razor, and exhibits other shortcomings (mostly inherited from the existing convenience constructors). Let's look at some of those shortcomings:

Each of these shortcomings is perhaps acceptable considered in isolation. But considering these shortcomings simultaneously, this approach becomes a shaky foundation on which to build a significant component of the language.

In part motivated by these and other considerations, #11557 and concurrent discussion turned to...

The second proposal

... which is to introduce (modulo spelling of blah, please see above) Array{T}(blah, shape...) constructors, where blah indicates the caller does not care what the return's contents are. These constructors immediately generalize to arbitrary array types as in MyArray{T}(blah, shape_etc...), and would be a specific instance of a more general model that extends the existing constructor model:

The existing constructor model allows you to write, for example, Vector(x) for x any of 1:4, Base.OneTo(4), or [1, 2, 3, 4] (to construct the Vector{Int} [1, 2, 3, 4]), or similarly SparseVector(x) (to build the equivalent SparseVector). To the limited degree this presently works broadly, the model is MyArray[{...}](contentspec) where contentspec, for example some other array, iterable, or similar object, defines the resulting array's contents.

The more general extension of this model is MyArray[{...}](contentspec[, modifierspec...]). Roughly, contentspec defines the result's contents, while modifierspec... (if given) provides qualifications, e.g. shape.

What does this look like in practice?

For the most part you would use constructors as you do now, with few exceptions. Let's go through the common construction operations mentioned above:

  1. (Constructing uninitialized arrays.) To build an uninitialized MyArray{T}, where now you write e.g. MyArray{T}(shape...), instead you would write MyArray{T}(blah, shape...). (#24400 explored this possibity for Arrays, and inevitably became a bikeshed of the spelling of blah :).)

  2. (Constructing one array from another.) Constructing one array from another, as in e.g. Vector(x) or SparseVector(x) for x being [1, 2, 3, 4], would work just as before.

  3. (Constructing an array filled from an iterable, or from a similar object defining the array's contents such as I.) What is possible now, for example Vector(x) for x either 1:4 or Base.One(4), would work as before. But where e.g. Array[{T,N}](tuple) now fails or produces an uninitialized array depending on T, N, and tuple, such signatures could work as for any other iterable. And additional possibilities become natural: Constructing Arrays from HasShape generators is one nice example. Another, already on master (#24372), is Matrix[{T}](I, m, n) (alternatively Matrix[{T}](I, (m, n))), which constructs a Matrix[{T}] of shape (m, n) containing the identity, and is equivalent to eye([T, ]m[, n]) with fewer ambiguities.

Great so far. Now what about perhaps the most common operation, i.e. constructing an array uniformly initialized from a value? Under the general model above, this operation should of course roughly be MyArray[{T}](it, shape...) where it is an iterable repeating the desired value. But this incantation should: (a) be fairly short and pleasant to type, lest ad hoc constructors for particular array types and values proliferate to avoid using the general model; and ideally (b) mesh naturally with convenience constructors for Arrays.

Triage came up with two broad spelling possibilities. The first spelling possibility led to...

The second spelling possibility is MyArray(Rep(v), shape...) modulo spelling of Rep(v), where Rep(v) is some convenient alias for Iterators.Repeated(v) with v any desired value. (Another possible spelling of Rep(v) discussed in triage is Fill(v), which dovetails beautifully with the fill convenience constructor for the same purpose specific to Arrays. Independent of the iterator's name, this spelling is a clean generalization of fill from Arrays to arrays generally.) In practice this would look like MyArray(Rep(1), shape...) (instead of MyArray{Int}(ones, shape...)). This spelling possesses some distinct advantages:

Great. With this latter spelling, overall this second proposal appears to satisfy both the broad design objectives and three razors at the top, and avoids the shortcomings of the first proposal.

What else? Convenience constructors

Convenience constructor are an important part of this discussion and about which there is much to consider. But that topic I will leave for another post. Thanks for reading! :)

JeffBezanson commented 6 years ago

I would also accept renaming it to uninit. That is one long word.

andyferris commented 6 years ago

c.f. #undef - if we're looking for a new name for these things, it could be nice to normalize terminology.

carstenbauer commented 6 years ago

But uninit and #undef mean different things, right? AFAIU uninit means I don't care about the initialization whereas #undef represents not filled at all (truly empty array).

andyferris commented 6 years ago

#undefs can only occur as uninitialized values of Arrays (or fields of structs), which AFAICT are set to zero pointers by the compiler (zero pointers could also come from structs and arrays returned by foreign calls, I suppose). The main difference is for isbits types whose uninitialised value is whatever RAM happens to be there, since no pointer is involved. It may well be that a large enough difference exists such that two terminologies is better than one, but maybe it’s worth thinking about.

I was just thinking it might be nice to be able to use something like uninitialized in struct inner constructors (i.e. Expr(:new, ...) expressions where I can indicate exactly which fields to leave uninitialized - but then maybe that would need to be a language keyword or something since if it is an object I should be able to throw it inside a struct?).

ChrisRackauckas commented 6 years ago

Bringing up https://github.com/JuliaLang/julia/issues/25107 since it seems a lot of this proposal mentions shape, when shape isn't enough information to fully characterize the structure of many abstract arrays.

AzamatB commented 5 years ago

I often find myself wanting to generate random Symmetric PSD Matrix, so end up writing something like

julia> S = rand(4,4); S = S'S + I

It would be nice to abstract this away by adding constructors for generating random structured matrix types (like Symmetric, Diagonal, Orthogonal, Orthonormal, etc.) maybe after #8240 is settled.