Open StefanKarpinski opened 10 years ago
Instead of initializing them to zero, we could initialize them to a large, nonsensical value to help catch access of uninitialized fields.
I guess, but this only applies to pure data types like Int or Float64. Frankly, I'm still not entirely convinced that this is a real usability problem, but unlike arrays, there's no good performance reason not to guarantee zeroed memory.
The zeroing could take a lot of time if you are inserting lots of things into a large tree or a list, or even an array of composite types.
Like Stefan, I'm not too worried about performance here. One reason is that uninitialized object fields are relatively rare. `new` calls that pass all arguments would be unaffected. LLVM might also be able to remove the extra store in `x = new(); x.fld = 1`. And if the object is heap allocated, the overhead of an extra store would be comparatively small.
One corner case that could cause problems is uninitialized bits fields in immutable types. It's a gotcha if they are zeroed when allocating individual objects, but not when allocating an array of them. Right now we consistently say "bits aren't automatically zeroed". If you like automatic zeroing, you want it everywhere, and doing it sometimes is arguably worse than doing it never.
One way out of that corner case is to disallow explicitly-uninitialized bits fields in immutables. Uninitialized references in immutables have uses (e.g. `Nullable{BigMutableThing}`), but uninitialized bits fields are less reasonable.
Frankly, I'd rather leave it as-is, or zero everything. For small arrays we can just pay the price, and allocate big arrays directly with mmap. Might not be so bad.
I'd be in favour of initializing everything. If this turns out to be a bottleneck, as measured in a validated benchmark, then we can see whether introducing `ccall(:jl_allocate_uninitialized_array)` for a few special cases wouldn't do the trick.
Regarding zeroing everything, it wouldn't be too hard to change our `malloc_a16` function (defined in `gc.c` and used to allocate arrays) into a `calloc_a16` function that calls `calloc`, shifts the pointer, and stores the original pointer before the pointed-to data. This is how the `_aligned_malloc` function works on Windows, and how we defined a 16-byte (or 32-byte) aligned `malloc` for FFTW (which is so trivial I don't think relicensing would be an issue).
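The scheme described above (call `calloc`, shift the pointer, stash the original pointer just before the data) can be sketched as follows. This is only an illustration under stated assumptions, not the actual `gc.c` code: the name `calloc_a16` is taken from the comment, while `free_a16` and the `ALIGNMENT` macro are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch, not the real gc.c implementation. */
#define ALIGNMENT 16

/* Return nbytes of zeroed memory whose address is a multiple of 16. */
void *calloc_a16(size_t nbytes)
{
    /* Over-allocate: one slot for the saved pointer plus worst-case slack. */
    size_t extra = sizeof(void *) + (ALIGNMENT - 1);
    if (nbytes > SIZE_MAX - extra)
        return NULL;
    void *raw = calloc(1, nbytes + extra);
    if (raw == NULL)
        return NULL;

    /* Skip the slot for the saved pointer, then round up to the alignment. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + (ALIGNMENT - 1)) & ~(uintptr_t)(ALIGNMENT - 1);

    /* Store the original pointer immediately before the aligned data. */
    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Release memory obtained from calloc_a16 by recovering the original pointer. */
void free_a16(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

Memory from this sketch must be released with `free_a16`, never with plain `free`, because the pointer handed to the caller is not the one `calloc` returned.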
I would prefer zeroing everything rather than only in some places, or using a specific byte pattern. I guess we can start by using calloc and validate through a benchmark as suggested.
Also this is something we can presumably backport to 0.3 for Ron's class, if it all works out.
Which would be a pretty drastic semantic change within one version number...
Regarding the actual issue: I think it is a good idea to initialize with zero if the performance degradation is negligible. It would still be good to have a flag to get the malloced array if desired.
> Which would be a pretty drastic semantic change within one version number...
It's a safe change though since this is not a behavior anyone could reasonably rely on.
(@tknopp, you can always call `pointer_to_array(convert(Ptr{T}, c_malloc(sizeof(T) * n)), n, true)` or similar to get a malloced array, so I don't think we necessarily need a flag. Assuming the overhead of `calloc` is normally negligible, anyone needing an uninitialized array will be working at such a low level that calling `c_malloc` won't be unreasonable.)
I tend to agree that people shouldn't rely on this behavior and it probably shouldn't even be documented; they should use `zeros(...)` if they want guaranteed zero initialization. (Of course, the implementation of `zeros` in Base can take advantage of it.)
@StefanKarpinski: Indeed. Still, I am not sure if backporting features or semantic language changes is a good idea. It's hard to keep track of which version a feature landed in. Or one might even have to distinguish minor version numbers (e.g. 0.33 and 0.34) when a new feature lands in 0.34. This then has an impact on all packages...
@stevengj: While I use `ones` and `zeros` myself when initializing an array, I think the Array constructor should be a valid way to initialize an array. Currently I am not using it because I want zero initialization. If the constructor initialized with zero, it would IMHO be the more logical way to create an array. For every other data structure I also use the constructor.
@tknopp, I'm not saying you shouldn't use the constructor. I'm saying that if a `calloc` version is fast enough then we need not provide a high-level uninitialized-array constructor (nor "a flag" for this).
I made an experimental branch that uses `calloc` instead of `malloc`, and so far I haven't been able to detect any performance differences (all the differences are swamped by the measurement noise) on MacOS.
Interesting and tangentially related: http://research.swtch.com/sparse
Do you want users to rely on zero initialization? If yes, best implement and document it so everyone's on the same page. If no, use some nonzero filler like 0b10101010 or just leave it uninitialized like it is today. Facts of life: if you implement zero initialization, users will rely on it, documented or not, whether you want them to or not. Either way, there should be some easy way to get uninitialized memory, as NumPy does with `empty()` in addition to `zeros()` and `ones()`, for when you want performance.
@RauliRuohonen in the absence of explicit documentation to the contrary (and even then, not guaranteed), users will default to assuming zero initialization. This is the case in Graphs.jl, where `dijkstra_shortest_paths` can return uninitialized memory (see https://github.com/JuliaLang/Graphs.jl/issues/140 for an example).
This newbie's vote is for zero-by-default, and the sooner it's implemented, the better.
I personally would prefer a byte pattern if we were to do this.
Also it is quite safe to do this by default, and in the few performance sensitive cases, have a way to get uninitialized memory.
> I personally would prefer a byte pattern if we were to do this.
I'm genuinely curious - why would a byte pattern be preferable to zeros, especially when new pages are supposedly zeroed by the OS by default?
A byte pattern makes it easier to find uses of uninitialized values. The implication is that people must make sure to manually initialize everything, or else they will get some big useless value which at least makes it easier to find the bug.
However, this strikes me as going out of our way to slap people on the wrist. If we are going to put in the effort to guarantee initialization, I'd rather do people a favor and initialize with a likely-useful value (zero). You'd never need to write `Foo(x,y) = new(0,0)`. And given `calloc`, there might be a performance advantage.
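To make the two options concrete, here is a small sketch of both fill strategies. It is an illustration only: the helper names `alloc_poisoned`, `alloc_zeroed`, and `looks_uninitialized` are hypothetical, and the filler byte 0xAA is simply the 0b10101010 pattern suggested earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define POISON_BYTE 0xAA  /* 0b10101010, an assumed filler value */

/* Option A: fill fresh memory with a recognizable nonsense pattern. */
void *alloc_poisoned(size_t nbytes)
{
    void *p = malloc(nbytes);
    if (p != NULL)
        memset(p, POISON_BYTE, nbytes);
    return p;
}

/* Option B: zero-fill, delegating the work to calloc. */
void *alloc_zeroed(size_t nbytes)
{
    return calloc(1, nbytes);
}

/* Returns 1 if all eight bytes of the value still hold the poison pattern. */
int looks_uninitialized(int64_t value)
{
    return (uint64_t)value == UINT64_C(0xAAAAAAAAAAAAAAAA);
}
```

Read as a signed 64-bit integer, the poisoned slot is a large negative number, which is what makes an accidental read stand out. The check is only a heuristic, though: a legitimately stored value that happens to equal the pattern would be flagged as well.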
> they will get some big useless value which at least makes it easier to find the bug
Or, in a worst case, they will get a big useless value that is close enough to an expected value that it slips through, and causes some catastrophic failure down the line?
Unless Julia's going to explicitly test and warn on uninitialized values using this byte pattern (thereby voiding any legitimate uses of that particular pattern), I don't see the advantage, and I see two disadvantages: 1) as you said, calloc() provides an optimized zero, and writing a specific byte pattern might result in poorer performance; and 2) the principle of "do[ing] what is expected" seems to favor zeros.
I think that either doing what we do now or initializing with zeros and having that be a specified, reliable behavior are the two best options.
I think initializing to zeros is really the way to go unless there's a serious performance cost. It simplifies the mental model of how memory allocation works and provides a lot more security.
Proposal: zero-fill by default; provide a named parameter for an option to use "raw" malloc for when performance is über-critical.
The security issue is nothing to sneeze at, especially, for example, when building out web services with authenticated sessions. Also, it would make auditing things like Crypto.jl that much more complex.
fwiw, we appear to have some bugs in pcre.jl related to the unintentional use of undefined values from an `Array(Ptr{T}, x)`, but `zeros(Ptr{T}, x)` doesn't work anymore (it's deprecated)
Hmm.. That seems like a reasonable usage of `zeros(Ptr{T}, x)`. Maybe we should have changed the documentation, instead of deprecating the method?
and there's another one in socket.jl (I changed some local behavior of ccall that is causing these to become more visible, as segfaults)
Thinking more about this, I have come to believe that not zeroing the memory in `Array(Ptr{T}, x)` is a mistake, and should be fixed in the array constructor rather than in a separate `zeros` method.
I think `zero(::Ptr)` and thus `zeros(::Array{Ptr})` were not considered correct because `C_NULL` is not the additive identity for pointers.
What about using fill(Ptr{T}(0), n) here?
The consensus that seems to be forming is that initializing newly allocated memory should be the default, for security reasons. (There should also be a sufficiently obscure escape hatch for well-tested low-level library functions.) There's nothing wrong with `zeros` or `fill`; the problem is the `Array` constructor, which should choose safety by default.
+1
I even agree with @stevengj's original point: those who need `malloc` dirty memory can just use `ccall`.
@eschnett My point was not about the behavior of `Array`, but about `zeros`. I don't think `zeros` should be used when constructing arrays of null pointers. Others are better placed to decide on the default behavior of `Array`.
@andreasnoack I agree.
`fill(C_NULL, n)` would be my favorite way to get an array of null pointers. But yes, `Array` should zero-fill as well.
Just checking: has this been implemented recently? I notice that newly-allocated arrays (from yesterday's master) are all getting zeros:
julia> a = Array(Float64,(6,6))
6x6 Array{Float64,2}:
0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0
or very close to it:
julia> a = Array(Float64,(6,6))
6x6 Array{Float64,2}:
0.0 9.88131e-324 1.4822e-323 1.97626e-323 2.96439e-323 3.45846e-323
4.94066e-324 1.4822e-323 1.4822e-323 1.97626e-323 2.96439e-323 3.45846e-323
9.88131e-324 9.88131e-324 1.4822e-323 2.47033e-323 2.96439e-323 3.95253e-323
1.4822e-323 9.88131e-324 1.4822e-323 2.47033e-323 2.96439e-323 3.45846e-323
4.94066e-324 9.88131e-324 1.97626e-323 2.47033e-323 3.45846e-323 3.95253e-323
4.94066e-324 1.4822e-323 1.97626e-323 2.47033e-323 3.45846e-323 3.95253e-323
No, not implemented yet. Close doesn't count! Very often you'll get zeros purely by accident since new pages from the OS are already zeroed.
Ah, ok. Thanks for the update :) It was weird that I was getting values < eps(Float64).
Hi all,
Given the proposed feature freeze (https://groups.google.com/d/msg/julia-dev/s2-Zj3acL_g/Nw7MV8MT3QwJ), could I suggest that we get this in prior to 0.4? Thanks.
i've put a milestone target on this so it doesn't get lost. if there's a PR sooner, then I don't see why it couldn't be added to v0.4 (or perhaps even v0.4.x)
@vtjnash @stevengj
https://github.com/JuliaLang/julia/issues/9147#issuecomment-64924076
> I made an experimental branch that uses calloc instead of malloc, and so far I haven't been able to detect any performance differences (all the differences are swamped by the measurement noise) on MacOS.
Is this branch still available? If so, could it form the basis of the PR?
i had looked that over briefly. i think it was based on the old GC and was more a proof of concept than a full analysis and implementation.
I'm willing to put this in 0.4 if a PR arises.
I probably don't understand all the cases well enough, but is using calloc sufficient? I thought that memory that gets GC'ed will also need reinitialization. Stating the obvious, but it seems like if we do this, we should do it across the board.
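The concern about recycled memory can be illustrated with a toy fixed-size pool. Fresh cells are already zero, but a cell coming back from the free list carries stale data and has to be cleared explicitly. This is a hypothetical sketch, not how Julia's GC is organized; the names `pool_alloc`, `pool_free`, `CELL_SIZE`, and `NUM_CELLS` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CELL_SIZE 64
#define NUM_CELLS 4

/* A cell is either a free-list link or a payload, never both at once. */
typedef union cell {
    union cell *next;               /* link while the cell is on the free list */
    unsigned char data[CELL_SIZE];  /* payload while the cell is in use */
} cell_t;

/* Static storage starts out zeroed, standing in for fresh OS pages. */
static cell_t pool[NUM_CELLS];
static cell_t *free_list = NULL;
static size_t next_fresh = 0;

void *pool_alloc(void)
{
    if (free_list != NULL) {
        /* Recycled cell: it holds stale data, so clear it explicitly. */
        cell_t *c = free_list;
        free_list = c->next;
        memset(c, 0, sizeof *c);
        return c;
    }
    if (next_fresh < NUM_CELLS)
        return &pool[next_fresh++];  /* never used before, still all zero */
    return NULL;                     /* pool exhausted */
}

void pool_free(void *p)
{
    cell_t *c = p;
    c->next = free_list;
    free_list = c;
}
```

Without the `memset` in the recycled branch, the zero guarantee would hold only for the first use of each cell, which is the "sometimes zeroed" situation the thread wants to avoid.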
The change shouldn't be too bad:
:+1: to doing this for 0.4, but maybe just for scalar values... I'd be concerned about large arrays, esp. when they get totally filled up immediately after getting allocated (as in the string conversion functions).
Ah, spoke too soon :sad: @sbromberger's idea is exactly what I'd want... some way of telling Julia that the `Array{Uint8,1}(100000000)` I just allocated is "raw".
@ScottPJones, for large arrays, I couldn't measure any performance penalty to `calloc` (any difference from `malloc` was in the noise). As I understand it, modern `calloc` implementations don't actually `memset` the memory to zero, they just generate copy-on-write references to a special pre-allocated page of zeros. Do you have any data to indicate there is ever a significant penalty on modern systems?
@stevengj My data was probably seriously out of date :wink: I'd benchmarked exactly this issue over 20 years ago... had serious slowdowns when doing large allocations... Keyword in what you said is "modern"... If that's going to be the case even for smaller array allocations (say for a 512 byte-64K string), I'm happy. [too much of my experience goes back to the dawn of time... it's always useful to retest your performance assumptions every few years...]
Related thought: What about calling `resize!` on a `Vector`? Currently this appends uninitialized values if needed, but if the current issue is fixed as planned I think it might make more sense for it to append zeros (on types where that is possible).
Raised here: https://groups.google.com/forum/#!topic/julia-users/O5S8pPav5Ks.
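At the C level, the zero-appending behavior suggested for `resize!` amounts to clearing only the newly added tail after a `realloc`. The sketch below is an illustration under that assumption and is not Julia's implementation; `resize_zeroed` is a hypothetical name, and the element type is fixed to `double` for simplicity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Resize a buffer of doubles from old_len to new_len elements.
   Existing elements are preserved; any newly added elements are zeroed.
   Returns the new pointer, or NULL on failure (the old buffer stays valid). */
double *resize_zeroed(double *buf, size_t old_len, size_t new_len)
{
    if (new_len > SIZE_MAX / sizeof(double))
        return NULL;  /* byte count would overflow */
    size_t nbytes = new_len * sizeof(double);
    double *p = realloc(buf, nbytes ? nbytes : 1);  /* avoid realloc(ptr, 0) */
    if (p == NULL)
        return NULL;
    if (new_len > old_len)
        memset(p + old_len, 0, (new_len - old_len) * sizeof(double));
    return p;
}
```

Only the tail is written, so growing a large vector by a few elements stays cheap. The sketch relies on all-zero bytes representing 0.0, which holds for IEEE 754 doubles.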