JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.91k stars 5.49k forks source link

Should `copy` be renamed `shallowcopy`? #42796

Open timholy opened 3 years ago

timholy commented 3 years ago

This is a potential Julia 2.0 change that arose while teaching newbies. One student got bit by copy's shallow behavior; this is also a common issue on discourse and stackoverflow. (Those are examples, and there are duplicates.) A potential solution is to not have copy at all, instead offering a choice between shallowcopy or deepcopy.

This has downsides: (1) shallowcopy is longer to type, (2) other languages (e.g., Python) make the same naming choices we're currently making, and (3) it might cause more people to use deepcopy out of safety in cases where it's not actually necessary (deepcopy is much slower).

The main upside is that by naming it shallowcopy we encourage people to understand what it's doing and thus to think about whether they want shallow or deep in this particular instance. Bugs arising from a shallow copy can be quite difficult to diagnose, typically costing orders-of-magnitude more time than typing a few extra characters.

A downside-to-the-upside is that this isn't unique to copy: fill, pass-by-reference, and other circumstances can trigger what is essentially the same confusion, and I don't see as simple a solution for these.

mgkuhn commented 3 years ago

Perhaps adding a “See also: deepcopy” reference (or an explanatory sentence for when to use it instead) to the docstring of copy would help to address the main concern, namely that beginners don't discover deepcopy?

I know, these doc-strings appear in the Base reference manual right next to each other, but that is of no help to a user of the REPL or GUI help functions, which show these docstrings in isolation.

Perhaps even add an example?

julia> a = [1, [2,3]]; b = a; b[1]=4; b[2][1]=5; a
2-element Vector{Any}:
 4
  [5, 3]

julia> a = [1, [2,3]]; b = copy(a); b[1]=4; b[2][1]=5; a
2-element Vector{Any}:
 1
  [5, 3]

julia> a = [1, [2,3]]; b = deepcopy(a); b[1]=4; b[2][1]=5; a
2-element Vector{Any}:
 1
  [2, 3]
JeffBezanson commented 3 years ago

We should not encourage anybody to call deepcopy. It seems like the right thing if you're just experimenting around with nested arrays and such, but it is almost never the right choice in real code, and using it is a sign that the design and data model needs to be thought through more. So really there is only copy, and the way to think about it is that it operates only on the object you pass it. deepcopy is really a very different function; it almost belongs in the Serialization package or something of that sort.

timholy commented 3 years ago

Given that this agrees with my own coding practice, can't say I disagree. It provokes the question of how widely it is used: on my ~/.julia/dev directory (about 430 packages), I get

tim@diva:~/.julia/dev$ grep -Rc deepcopy | grep -v ":0$" | grep "test/" | wc
     40      40    1528
tim@diva:~/.julia/dev$ grep -Rc deepcopy | grep -v ":0$" | grep "src/" | wc
     90      90    3194

That's a very small fraction of all source files, but to my surprise it's more widely used in src/ code than test/ code (deepcopy arguably has much more place in test/ than src/). I guess most packages have more code in src than test so maybe that ratio isn't too crazy. But if you look up some of the individual files (just leave off the final | wc) it's a bit surprising to see which ones they are (some pretty mainline packages as well as more niche ones).

The question remains, though: what more can we do to clarify how copy behaves? The current docstring https://github.com/JuliaLang/julia/blob/3d4b213a6d751e07fd9becbc17221868af638117/base/array.jl#L358-L366 probably can be called "non-newbie friendly" since it relies on understanding the implications of "shallow" and "identically-same." Those are great if you already understand the distinction between == and === or the jargon of shallow vs deep, but I can attest that's something that a lot of capable users from non-CS fields don't know coming into Julia. Of course, my proposal to rename this shallowcopy is not successful in that either, other than directing attention to the fact that this may be more subtle than a totally-naive perspective might imagine.

Let me see if I can come up with something clearer (in a few days), and others can chime in.

antoine-levitt commented 3 years ago

I think the first sentence is fine, it's just the second that's not very exemple-y. Maybe just "For example, copy([[1]])[1][1] = 2 also modifies the original inner array"?

mgkuhn commented 3 years ago

While we are improving the docstring of copy, we should also do so for deepcopy. Both docstrings currently fall well short of giving a precise and unambiguous specification of what exactly the user can expect these functions to do. For example, in the case of deepcopy, there is no mention of what happens if the data structure is a DAG or even a graph with loops. Does deepcopy deduplicate references, or do DAGs get turned into trees and loops cause endless recursion? All legitimate questions that the docstring should provide clear answers for. The current docstrings still read a bit like temporary placeholders, to be replaced with the real documentation later.

Jargon such as “shallow” and “deep” should be introduced, not assumed. The documentation of these functions is also complicated by the fact that Julia's memory model is not at all obvious from its syntax, e.g. mutable and immutable structs look exactly the same except for a keyword, but may be treated quite differently in a “shallow” copy. I wonder if it wouldn't be useful to add a visualization tool that shows what copy actually would copy, e.g. a macro to display a data structure that colors the “shallow” and “deep” information in it differently (a bit like the much loved @code_warntype), to explain which parts get copied by assignment, by copy, or only by deepcopy.

mgkuhn commented 3 years ago

copy([[1]])[1][1] = 2

You need to bind these values to names, such that you can reference them. Otherwise the example does not have any observable outcomes (and in fact the optimizer would be entitled to eliminate it). So you can't make that example much shorter than

julia> a=[[1]];copy(a)[1][1]=2;show(a)
[[2]]
timholy commented 3 years ago

While it doesn't yet address the docstring issue, I wrote a lengthy explanation in my course documents: https://github.com/timholy/AdvancedScientificComputing/blob/main/homeworks/learning_julia2.md#distinguishing-identity-and-equality-and-containers-and-their-collection-of-items

I'd be happy to move that to Julia's docs somewhere if that seems desirable. Maybe the FAQ? Do we want an "Explanations" section that covers basic concepts of computing using Julia as the language? As I start thinking about teaching Julia even to bare-beginners, referencing C documentation or even Python to explain these core concepts starts becoming less and less satisfying. OTOH, one might say that this material should really just go in textbooks and doesn't belong anywhere in this repo.

kruxigt commented 3 years ago

We should not encourage anybody to call deepcopy. It seems like the right thing if you're just experimenting around with nested arrays and such, but it is almost never the right choice in real code, and using it is a sign that the design and data model needs to be thought through more. So really there is only copy, and the way to think about it is that it operates only on the object you pass it. deepcopy is really a very different function; it almost belongs in the Serialization package or something of that sort.

Noob reporting in! Just out of curiosity, if I have a struct that is somewhat deep and complex but with no loopy data structure and I want to make a copy of it to have two separate copies to manipulate in different ways, if not just copying it with deepcopy, towards what kind of thinking would you point me?

timholy commented 3 years ago

You can create custom copy methods for specific types. There are cases where there is no sensible way to copy something without making modifications. For example,

struct MyStruct
     uuid::UUID   # the unique identifier for this particular instance
     ...
end

might need to specialize copy to generate a new uuid for the copy, plus a special == that ignores the uuid and just compares the rest of the fields. (=== would continue to distinguish an item from its copy.)

Whether it's OK to specialize copy to nested types is a little more confusing. Personally I think we're a bit inconsistent about the shallow issue, for example copy(::Expr) makes a deep, not shallow, copy: https://github.com/JuliaLang/julia/blob/c762f1050240aae2fa3fc90715c54daba8bb8b4c/base/expr.jl#L37-L64

kruxigt commented 3 years ago

Whether it's OK to specialize copy to nested types is a little more confusing. Personally I think we're a bit inconsistent about the shallow issue, for example copy(::Expr) makes a deep, not shallow, copy:

I'm generally a bit restrictive about creating versions of methods from Julia Base for my own types since I've noticed it can come back to bite me. Sure getindex, show and iterators etc are just too handy to leave on the table but in the case of a custom deepcopy for example I have chosen to give that a new name. But this is a more general discussion in Julia land i feel.

krcools commented 3 years ago

While on the topic, could you comment on whether or not the intended meaning of copy agrees with its use in

T = copy(transpose(A))

To me at least this is surprising; I expect copy to keep the object type invariant.