Open timholy opened 3 years ago
Perhaps adding a “See also: deepcopy
” reference (or an explanatory sentence for when to use it instead) to the docstring of copy
would help to address the main concern, namely that beginners don't discover deepcopy
?
I know, these doc-strings appear in the Base
reference manual right next to each other, but that is of no help to a user of the REPL or GUI help functions, which show these docstrings in isolation.
Perhaps even add an example?
julia> a = [1, [2,3]]; b = a; b[1]=4; b[2][1]=5; a
2-element Vector{Any}:
4
[5, 3]
julia> a = [1, [2,3]]; b = copy(a); b[1]=4; b[2][1]=5; a
2-element Vector{Any}:
1
[5, 3]
julia> a = [1, [2,3]]; b = deepcopy(a); b[1]=4; b[2][1]=5; a
2-element Vector{Any}:
1
[2, 3]
We should not encourage anybody to call deepcopy
. It seems like the right thing if you're just experimenting around with nested arrays and such, but it is almost never the right choice in real code, and using it is a sign that the design and data model needs to be thought through more. So really there is only copy
, and the way to think about it is that it operates only on the object you pass it. deepcopy
is really a very different function; it almost belongs in the Serialization package or something of that sort.
Given that this agrees with my own coding practice, can't say I disagree. It provokes the question of how widely it is used: on my ~/.julia/dev
directory (about 430 packages), I get
tim@diva:~/.julia/dev$ grep -Rc deepcopy | grep -v ":0$" | grep "test/" | wc
40 40 1528
tim@diva:~/.julia/dev$ grep -Rc deepcopy | grep -v ":0$" | grep "src/" | wc
90 90 3194
That's a very small fraction of all source files, but to my surprise it's more widely used in src/
code than test/
code (deepcopy
arguably has much more place in test/
than src/
). I guess most packages have more code in src
than test
so maybe that ratio isn't too crazy. But if you look up some of the individual files (just leave off the final | wc
) it's a bit surprising to see which ones they are (some pretty mainline packages as well as more niche ones).
The question remains, though: what more can we do to clarify how copy
behaves? The current docstring
https://github.com/JuliaLang/julia/blob/3d4b213a6d751e07fd9becbc17221868af638117/base/array.jl#L358-L366
probably can be called "non-newbie friendly" since it relies on understanding the implications of "shallow" and "identically-same." Those are great if you already understand the distinction between ==
and ===
or the jargon of shallow vs deep, but I can attest that's something that a lot of capable users from non-CS fields don't know coming into Julia. Of course, my proposal to rename this shallowcopy
is not successful in that either, other than directing attention to the fact that this may be more subtle than a totally-naive perspective might imagine.
Let me see if I can come up with something clearer (in a few days), and others can chime in.
I think the first sentence is fine, it's just the second that's not very exemple-y. Maybe just "For example, copy([[1]])[1][1] = 2
also modifies the original inner array"?
While we are improving the docstring of copy
, we should also do so for deepcopy
. Both docstrings currently fall well short of giving a precise and unambiguous specification of what exactly the user can expect these functions to do. For example, in the case of deepcopy
, there is no mention of what happens if the data structure is a DAG or even a graph with loops. Does deepcopy deduplicate references, or do DAGs get turned into trees and loops cause endless recursion? All legitimate questions that the docstring should provide clear answers for. The current docstrings still read a bit like temporary placeholders, to be replaced with the real documentation later.
Jargon such as “shallow” and “deep” should be introduced, not assumed. The documentation of these functions is also complicated by the fact that Julia's memory model is not at all obvious from its syntax, e.g. mutable and immutable structs look exactly the same except for a keyword, but may be treated quite differently in a “shallow” copy. I wonder if it wouldn't be useful to add a visualization tool that shows what copy
actually would copy, e.g. a macro to display a data structure that colors the “shallow” and “deep” information in it differently (a bit like the much loved @code_warntype
), to explain which parts get copied by assignment, by copy
, or only by deepcopy
.
copy([[1]])[1][1] = 2
You need to bind these values to names, such that you can reference them. Otherwise the example does not have any observable outcomes (and in fact the optimizer would be entitled to eliminate it). So you can't make that example much shorter than
julia> a=[[1]];copy(a)[1][1]=2;show(a)
[[2]]
While it doesn't yet address the docstring issue, I wrote a lengthy explanation in my course documents: https://github.com/timholy/AdvancedScientificComputing/blob/main/homeworks/learning_julia2.md#distinguishing-identity-and-equality-and-containers-and-their-collection-of-items
I'd be happy to move that to Julia's docs somewhere if that seems desirable. Maybe the FAQ? Do we want an "Explanations" section that covers basic concepts of computing using Julia as the language? As I start thinking about teaching Julia even to bare-beginners, referencing C documentation or even Python to explain these core concepts starts becoming less and less satisfying. OTOH, one might say that this material should really just go in textbooks and doesn't belong anywhere in this repo.
We should not encourage anybody to call
deepcopy
. It seems like the right thing if you're just experimenting around with nested arrays and such, but it is almost never the right choice in real code, and using it is a sign that the design and data model needs to be thought through more. So really there is onlycopy
, and the way to think about it is that it operates only on the object you pass it.deepcopy
is really a very different function; it almost belongs in the Serialization package or something of that sort.
Noob reporting in! Just out of curiosity, if I have a struct that is somewhat deep and complex but with no loopy data structure and I want to make a copy of it to have two separate copies to manipulate in different ways, if not just copying it with deepcopy, towards what kind of thinking would you point me?
You can create custom copy
methods for specific types. There are cases where there is no sensible way to copy something without making modifications. For example,
struct MyStruct
uuid::UUID # the unique identifier for this particular instance
...
end
might need to specialize copy
to generate a new uuid
for the copy, plus a special ==
that ignores the uuid
and just compares the rest of the fields. (===
would continue to distinguish an item from its copy.)
Whether it's OK to specialize copy to nested types is a little more confusing. Personally I think we're a bit inconsistent about the shallow issue, for example copy(::Expr)
makes a deep, not shallow, copy: https://github.com/JuliaLang/julia/blob/c762f1050240aae2fa3fc90715c54daba8bb8b4c/base/expr.jl#L37-L64
Whether it's OK to specialize copy to nested types is a little more confusing. Personally I think we're a bit inconsistent about the shallow issue, for example
copy(::Expr)
makes a deep, not shallow, copy:
I'm generally a bit restrictive about creating versions of methods from Julia Base for my own types since I've noticed it can come back to bite me. Sure getindex, show and iterators etc are just too handy to leave on the table but in the case of a custom deepcopy for example I have chosen to give that a new name. But this is a more general discussion in Julia land i feel.
While on the topic, could you comment on whether or not the intended meaning of copy agrees with its use in
T = copy(transpose(A))
To me at least this is surprising; I expect copy to keep the object type invariant.
This is a potential Julia 2.0 change that arose while teaching newbies. One student got bit by
copy
's shallow behavior; this is also a common issue on discourse and stackoverflow. (Those are examples, and there are duplicates.) A potential solution is to not havecopy
at all, instead offering a choice betweenshallowcopy
ordeepcopy
.This has downsides: (1)
shallowcopy
is longer to type, (2) other languages (e.g., Python) make the same naming choices we're currently making, and (3) it might cause more people to usedeepcopy
out of safety in cases where it's not actually necessary (deepcopy
is much slower).The main upside is that by naming it
shallowcopy
we encourage people to understand what it's doing and thus to think about whether they want shallow or deep in this particular instance. Bugs arising from a shallow copy can be quite difficult to diagnose, typically costing orders-of-magnitude more time than typing a few extra characters.A downside-to-the-upside is that this isn't unique to
copy
:fill
, pass-by-reference, and other circumstances can trigger what is essentially the same confusion, and I don't see as simple a solution for these.