implement `@fill([0, 0], 2)`

CameronBieganek commented 3 years ago

Original Title: Change fill([0, 0], 2) behavior for 2.0

The behavior of fill when its first argument is an array is to create an array of arrays, where each of the elements is a reference to the exact same array in memory. This is counterintuitive to many users (see Discourse links below), and it seems kind of pointless. I propose that in 2.0 fill with an array input should fill the resulting array with copies of the input array.

Below are some examples where this came up on Discourse. There may be more.

https://discourse.julialang.org/t/simple-question-about-assignment-to-a-vector-of-vectors/62744 https://discourse.julialang.org/t/fill-anarray-2-behaviour/22429 https://discourse.julialang.org/t/initialization-of-array-of-arrays-with-fill-ones-1-2-2-only-one-vector-is-created/48048 https://discourse.julialang.org/t/how-can-i-fill-an-array-with-empty-2d-arrays/60895

simeonschaub commented 3 years ago

Copying by default seems even more confusing to me. This would cause lots of allocations any time an array is filled with mutable objects. There are many situations where you actually want to fill an array with references to the same mutable object, for example pretty much any time you don't plan to mutate the items in an array themselves, which is a frequent use case when working with DataFrames. Implicit copies can also lead to wrong results in some cases, for example Measurements.jl uses references to other measurements to keep track of correlations. It is also not clear how this should behave for recursively mutable objects: if it's just the default copy behavior, you would run into exactly the same issues when filling an array with arrays of arrays, you could use deepcopy, but that can be super slow and can easily blow up allocations for more complex objects.

I think we should just document the current behavior better. Once you have a basic understanding how memory and references work, the current behavior makes a lot of sense, so I don't think that needs to be changed. One thing we could do is add a copy keyword argument to fill though, so if someone wants the copying behavior, it's really easy to get.

JeffBezanson commented 3 years ago

Fully agree with Simeon. Calling copy internally is a hack. I think what you really want here is to evaluate the argument expression for each element, not just get a copy, e.g. consider something like fill(rand(), n). Then the copy behavior, if you want it, is an easy special case. The easiest way to do this now is with a comprehension, but the syntax is unfortunately very different from fill. Maybe a @fill(expr, ...) that does repeated evaluation?

simeonschaub commented 3 years ago

Maybe a @fill(expr, ...) that does repeated evaluation?

Does that really add something over just using an array comprehension though?

JeffBezanson commented 3 years ago

Not really, it's just shorter.

PaulSoderlind commented 3 years ago

I believe those discourse threads signal that the documentation could be improved. If x is an object reference, all elements will refer to the same object: is somewhat opaque. Adding the example (done in https://github.com/JuliaLang/julia/pull/35683) was a step in the right direction, but perhaps not enough.

In fact, the expression object reference does not seem to be widely used in the docs. Searching the pdf gives 4 hits: 2 on unsafe_pointer... and 2 on fill/fill!. Is there a better (simpler) terminology?

goretkin commented 3 years ago

Is it wrong to say if x is mutable (ismutable(x)) instead of if x is an object reference? (yes, because of https://github.com/JuliaLang/julia/issues/30210 , but otherwise?)

PaulSoderlind commented 3 years ago

@goretkin

Could I ask you submit a PR - and then we see how this plays out? Maybe we could try If x is mutable (ismutable(x)), for instance, an Array, then all elements will refer to the same object:?

simeonschaub commented 3 years ago

ismutable(x) here is not quite correct, since for example:

julia> ismutable(1 => Ref(2))
false

while this will still run into the same issues as mentioned above.

PaulSoderlind commented 3 years ago

Thanks. I (and several others, it seems) am still struggling with If x is an object reference, all elements will refer to the same object. I guess just changing to If x is an object reference (for instance, an array), all elements will refer to the same object would be a step forward, but maybe you have a better idea?

mbauman commented 3 years ago

There's no if. All elements always are the same object. The only time it's observable, though, is if there's something mutable.

CameronBieganek commented 3 years ago

I concede that there isn't really a sensible way to do this. We would need to use either a macro or a function myfill whose first argument is a function (the name of the function would have to be different from fill, since the first argument of fill is generic for the existing method definitions, unless we want to use the f::Union{Function, Type} hack). At that point it makes more sense to just use a comprehension or a generator.

I'm not too invested in this anyways---I just noticed that this question came up a lot on Discourse.

CameronBieganek commented 3 years ago

On second thought, could we branch on isbits? If an object is not isbits, then we do a deepcopy?

julia> myismutable = !isbits
#76 (generic function with 1 method)

julia> myismutable([1, 2])
true

julia> myismutable(1 => Ref(2))
true

julia> struct A
           x::Vector{Int}
       end

julia> myismutable(A([1, 2]))
true

simeonschaub commented 3 years ago

That wouldn't make any difference, since deepcopy is just a no-op for isbits objects.

CameronBieganek commented 3 years ago

Yeah, I just noticed that deepcopy has

isbitstype(typeof(x)) && return x

for its first line.

That wouldn't make any difference, since deepcopy is just a no-op for isbits objects.

But that's fine. We want it to be a no-op for isbits objects! The point is that its not a no-op if the object is not isbits.

simeonschaub commented 3 years ago

That wouldn't address any of the issues raised by Jeff and myself above though.

goretkin commented 3 years ago

ismutable(x) here is not quite correct, since for example:
julia> ismutable(1 => Ref(2))
false
while this will still run into the same issues as mentioned above.

@simeonschaub , right, good. Is 1 => Ref(2) a so-called "object reference"? Is there a programmatic to check this property, whatever we want to call it? A deepismutable, if you will. A non-executable definition that I suspect is nonetheless accurate, is that isequal(a, b) implies a === b

CameronBieganek commented 3 years ago

There's clearly some demand for the copying behavior. In order to keep the current fill behavior available, the new behavior could be a different function, like copyfill or fillcopy, or it could be available via a copy keyword argument as you suggested.

Though I don't have any data to back up this hypothesis, I would guess that most people who use fill([0, 0], 2) are looking for the copying behavior rather than the current behavior.

simeonschaub commented 3 years ago

Is 1 => Ref(2) a so-called "object reference"?

That's a good question. I don't think the current wording is very accurate, it should probably say "object contains any references" rather than "object is a reference". I also agree with Matt that "all elements will refer to the same object" is not really different between objects with and without references, we should clarify that the only difference is that this is directly observable with references, while you wouldn't be able to tell for immutable objects.

goretkin commented 3 years ago

the new behavior could be a different function, like copyfill or fillcopy

My inclination (as always) in these situations is to introduce a new type name and rely on multiple dispatch, as opposed to introducing a new function name. The call would then be something like fill(Copier([0, 0]), 2). The name Copier is just an example. Ideally this type would be beneficial in other interfaces too.

The benefit over fill(Copier(... over fillcopy is "orthogonality". A method can be written in terms of fill(x, ...), and if x comes from the caller of that method, then both options are possible, without making a commitment between fill and fillcopy.

A downside has something to do with a proliferation of wrapper types and not a lot of great idioms for dealing with them, especially when there are multiple wrappers. [EDIT] another downside is that now you can't fill an array with Copier objects without introducing some sort of escape mechanism. I think an array-equivalent of ntuple that takes a function of an index is the way to go.

CameronBieganek commented 3 years ago

@goretkin

A deepismutable, if you will.

As far as I can tell, you could define deepismutable(x) = !isbits(x). But maybe I'm missing some corner cases.

CameronBieganek commented 3 years ago

I also think that just updating the documentation a bit would be fine. We could add a sentence that suggests using a comprehension if you want to make copies of a non-isbits object. Something like this:

Add this to `fill` docstring:

If you want to fill an array with copies of a non-isbits object, use map or a comprehension:

julia> struct A
           x::Vector{Int}
       end

julia> map(_ -> A([1, 2]), 1:3)
3-element Vector{A}:
 A([1, 2])
 A([1, 2])
 A([1, 2])

julia> [A([1, 2]) for _ in 1:3]
3-element Vector{A}:
 A([1, 2])
 A([1, 2])
 A([1, 2])

goretkin commented 3 years ago

@goretkin

A deepismutable, if you will.

As far as I can tell, you could define deepismutable(x) = !isbits(x). But maybe I'm missing some corner cases.

That seems right! I already mentioned strings earlier. I am pretty confused about strings with respect to mutability. (Because String is "semantically" immutable, but for implementation reasons are "internally" mutable, for now). I wonder if that's a corner case here too:

julia> isbits("hey")
false

though I don't think there is any non-unsafe way to wreak havoc if a user does e.g. fill("hey", 2)

mbauman commented 3 years ago

See if the my suggested changes to fill in #41340 are any easier to understand. This is a very common conceptual misunderstanding and is fundamental to understanding Julia, but can be difficult to succinctly and accurately describe. That means, though that it's not just fill. Understand this, conquer ~~the world~~ Julia.

CameronBieganek commented 3 years ago

This is a very common conceptual misunderstanding and is fundamental to understanding Julia, but can be difficult to succinctly and accurately describe. That means, though that it's not just fill. Understand this, conquer ~the world~ Julia.

To be clear, I already understood that. I just noticed that the fill question came up a lot on Discourse. And filling an array with references to the same value did not seem very useful to me.

BioTurboNick commented 2 years ago

I'll just put in a vote here for making a copying fill (just the top level, not a deep copy) the default in 2.0 and a non-copying fill the "advanced" version. Whether that's a kwarg switch or another function, could go either way.

Though if someone can show there's a really common case that would get slowed down and/or become annoying because they keep having to add a new argument to avoid copying, that would be a good reason to not do this.

KristofferC commented 2 years ago

This seems simple to me... One just introduces a new function, say fillf, with the desired semantics. It takes a function as the first argument and calls that function for every element.

fillf(rand, 5, 5)
fillf(() -> zero(2,2), 3, 3)

etc. And (which has been mentioned) there could be a @fill(expr, ...) that transforms into fillf(() -> expr, ...).

goretkin commented 2 years ago

e.g.

fillf(f, args...) = [f() for k in keys(fill(nothing, args...))]

It seems like you might as well pass the indices to f, analogous to ntuple, but then you have to decide how to pass it (a CartesianIndex is inconvenient) and if you want to ignore it, you'll have to include a dummy argument (like _) in the anonymous function.

mbauman commented 2 years ago

Though if someone can show there's a really common case that would get slowed down and/or become annoying because they keep having to add a new argument to avoid copying, that would be a good reason to not do this.

Simeon listed a number of these in this comment above: https://github.com/JuliaLang/julia/issues/41209#issuecomment-860201785. It's not about performance or annoyance — it's about semantics and what you should expect from the language. fill(x, size...) just puts that x everywhere.

At its heart, I think this is a natural language/computer language mismatch. I'm hopeful that the new documentation in v1.8 will help some. Maybe there'd be a different English-language verb that would more clearly express this behavior, but I'm doubtful we could find anything dramatically clearer; in the natural world we typically can't put the same thing in multiple places at once.

I don't think a higher order function like fillf or macro like @fill would be helpful to the folks that stumble on this.

PaulSoderlind commented 2 years ago

at the new documentation in v1.8

yes, that will help. Much appreciated.

I don't think a higher order function like fillf or macro like @fill would be helpful

I disagree here. Some sort of convenience function for pre-allocating an array of arrays could be very helpful.

goretkin commented 2 years ago

I disagree here. Some sort of convenience function for pre-allocating an array of arrays could be very helpful.

I think @mbauman is saying that if a naïve user knows that they can do fill(0, ...) to get an array of 0s, then they will also do fill([0], ...) to get an array of arrays. The existence of fillf or @fill won't automatically get them to use it if they don't already know better, which they don't, since they are stumbling upon this issue.

It could still be worth having a function like fillf to give an alternative spelling for an array comprehension. But really that is the job of map.

PaulSoderlind commented 2 years ago

OK, fair enough. I think there are two points of having a fillf: (1) it is convenient; (2) it highlights the potential pitfall of fill: the documentation of the latter could read like "To create an array of many independent inner arrays, use a comprehension or fillf()" .

StefanKarpinski commented 2 years ago

The main advantage I can see for @fill is that if you have a fill([], n) or whatever and you realize it's wrong, you can fix it very simply by changing it to @fill([], n) which is appealing. I still strongly feel that changing fill to copy its argument is a bad idea that just makes it harder for people to understand and use the language. With copying they do fill([], n) and it works, making them think that it has the same semantics as [[] for _ = 1:n]. Later they do fill(rand(), n) and find that it produces an array of n copies of the same random number. The copying behavior has misled them about the basic semantics of the language and caused confusion rather than really helping. The sooner someone understands that a function argument is only evaluated once, the better off they are.

mbauman commented 2 years ago

Yeah, my reasoning behind saying that fillf and @fill are unhelpful is that:

Higher order functions and macros are more advanced techniques, whereas this is often stumbled upon by language learners.
They're fiddly variations on the same name... and the name itself doesn't actually give any indication about why they're different to someone who doesn't understand what fill itself is doing. In the documentation for fill, we can say "if you don't want this behavior, use @fill", but I'd rather say "if you don't want this behavior use a comprehension." The former is magic, the latter is generalizable.

I'm not averse to a more succinct comprehension-like syntax, but I'd want something whose name is a little more clearly distinct from fill... because the behavior is quite distinct.

StefanKarpinski commented 2 years ago

I'm not that worried about beginners—they're going to be confused at some point and need to learn this either way and as you say, fill versus @fill isn't that enlightening unless you already understand this. The reason I think @fill would be good is for users who understand this to be able to easily switch between the behaviors. Yes, you can use a comprehension, but if you have already written fill([], n) and realize you need different arrays, changing it to @fill([], n) is much easier, and shorter than writing [[] for _ = 1:n].

CameronBieganek commented 2 years ago

Given the change in the name of this issue, can we remove the "breaking" label?

mbauman commented 3 months ago

I wonder if VSCode/Lint could help flag places where any value in a fill is mutated? It's probably always a mistake to do so!

JuliaLang / julia