zero out memory of uninitialized fields

StefanKarpinski commented 10 years ago

Raised here: https://groups.google.com/forum/#!topic/julia-users/O5S8pPav5Ks.

eschnett commented 10 years ago

Instead of initializing them to zero, we could initialize them to a large, nonsensical value to help catch access of uninitialized fields.

StefanKarpinski commented 10 years ago

I guess, but this only applies to pure data types like Int or Float64. Frankly, I'm still not entirely convinced that this is a real usability problem, but unlike arrays, there's no good performance reason not to guarantee zeroed memory.

ViralBShah commented 10 years ago

The zeroing could take a lot of time if you are inserting lots of things into a large tree or a list, or even an array of composite types.

JeffBezanson commented 10 years ago

Like Stefan, I'm not too worried about performance here. One reason is that uninitialized object fields are relatively rare. new calls that pass all arguments would be unaffected. LLVM might also be able to remove the extra store in x = new(); x.fld = 1. And if the object is heap allocated, the overhead of an extra store would be comparatively small.

One corner case that could cause problems is uninitialized bits fields in immutable types. It's a gotcha if they are zeroed when allocating individual objects, but not when allocating an array of them. Right now we consistently say "bits aren't automatically zeroed". If you like automatic zeroing, you want it everywhere, and doing it sometimes is arguably worse than doing it never.

One way out of that corner case is to disallow explicitly-uninitialized bits fields in immutables. Uninitialized references in immutables have uses (e.g. Nullable{BigMutableThing}), but uninitialized bits fields are less reasonable.

Frankly, I'd rather leave it as-is, or zero everything. For small arrays we can just pay the price, and allocate big arrays directly with mmap. Might not be so bad.

eschnett commented 10 years ago

I'd be in favour of initializing everything. If this turns out to be a bottleneck, as measured in a validated benchmark, then we can see whether introducing ccall(:jl_allocate_uninitialized_array) for a few special cases wouldn't do the trick.

stevengj commented 10 years ago

Regarding zeroing everything, it wouldn't be too hard to change our malloc_a16 function (in gc.c, which is used to allocate arrays) to a calloc_a16 function, which calls calloc, shifts the pointer, and stores the original pointer before the pointed-to data. This is how the _aligned_malloc function works on Windows, and how we defined a 16-byte (or 32-byte) aligned malloc for FFTW (which is so trivial I don't think relicensing would be an issue).
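
For illustration, the over-allocate-and-shift scheme described above could look roughly like this in C. This is a minimal sketch, not the code from gc.c or FFTW; the names `calloc_aligned` and `free_aligned` are hypothetical, and the alignment is assumed to be a power of two no smaller than `sizeof(void *)`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not the actual Julia gc.c code: returns `size` zeroed
 * bytes aligned to `align` (a power of two, at least sizeof(void *)). */
static void *calloc_aligned(size_t size, size_t align)
{
    assert(align >= sizeof(void *) && (align & (align - 1)) == 0);
    /* Guard against overflow in the over-allocation below. */
    if (size > SIZE_MAX - align - sizeof(void *))
        return NULL;
    /* Over-allocate: room for the stashed pointer plus worst-case padding. */
    void *raw = calloc(1, size + align + sizeof(void *));
    if (raw == NULL)
        return NULL;
    /* Skip past the slot for the stashed pointer, then round up to `align`. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + align - 1) & ~(uintptr_t)(align - 1);
    /* Store the original pointer just before the aligned block. */
    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Releases a block obtained from calloc_aligned. */
static void free_aligned(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

A caller would pair `calloc_aligned(n, 16)` with `free_aligned`, never with plain `free`, since the returned pointer is not the one `calloc` handed out.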

ViralBShah commented 10 years ago

I would prefer zeroing everything rather than only in some places, or using a specific byte pattern. I guess we can start by using calloc and validate through a benchmark as suggested.

ViralBShah commented 10 years ago

Also this is something we can presumably backport to 0.3 for Ron's class, if it all works out.

tknopp commented 10 years ago

Which would be a pretty drastic semantic change within one version number...

Regarding the actual issue: I think it is a good idea to initialize with zero if the performance degradation is negligible. It would still be good to have a flag to get the malloced array if desired.

StefanKarpinski commented 10 years ago

> Which would be a pretty drastic semantic change within one version number...

It's a safe change though since this is not a behavior anyone could reasonably rely on.

stevengj commented 10 years ago

(@tknopp, you can always call pointer_to_array(convert(Ptr{T}, c_malloc(sizeof(T) * n)), n, true) or similar to get a malloced array, so I don't think we necessarily need a flag. Assuming the overhead of calloc is normally negligible, anyone needing an uninitialized array will be working at such a low level that calling c_malloc won't be unreasonable.)

I tend to agree that people shouldn't rely on this behavior and it probably shouldn't even be documented; they should use zeros(...) if they want guaranteed zero initialization. (Of course, the implementation of zeros in Base can take advantage of it.)

tknopp commented 10 years ago

@StefanKarpinski: Indeed. Still, I am not sure if backporting features or semantic language changes is a good idea. It's hard to keep track of which version a feature went into. Or one might even have to distinguish minor version numbers (e.g. 0.33 and 0.34) when a new feature gets in in 0.34. This then has an impact on all packages...

@stevengj: While I use ones and zeros myself when initializing an array, I think the Array constructor should be a valid way to initialize an array. Currently I am not using it because I want zero initialization. If the constructor initialized with zero, it would IMHO be the more logical way to create an array. For every other data structure I also use the constructor.

stevengj commented 10 years ago

@tknopp, I'm not saying you shouldn't use the constructor. I'm saying that if a calloc version is fast enough then we need not provide a high-level uninitialized-array constructor (nor "a flag" for this).

stevengj commented 10 years ago

I made an experimental branch that uses calloc instead of malloc, and so far I haven't been able to detect any performance differences (all the differences are swamped by the measurement noise) on MacOS.

StefanKarpinski commented 10 years ago

Interesting and tangentially related: http://research.swtch.com/sparse

RauliRuohonen commented 9 years ago

Do you want users to rely on zero initialization? If yes, best implement and document it so everyone's on the same page. If no, use some nonzero filler like 0b10101010 or just leave it uninitialized like it is today. Facts of life: if you implement zero initialization, users will rely on it, documented or not, whether you want them to or not. Either way, there should be some easy way to get uninitialized memory, like e.g. NumPy has empty() in addition to zeros() and ones() which you can use when you want performance.

sbromberger commented 9 years ago

@RauliRuohonen in the absence of explicit documentation to the contrary (and even then, not guaranteed), users will default to assuming zero initialization. This is the case in Graphs.jl, where dijkstra_shortest_paths can return uninitialized memory (see https://github.com/JuliaLang/Graphs.jl/issues/140 for an example).

This newbie's vote is for zero-by-default, and the sooner it's implemented, the better.

ViralBShah commented 9 years ago

I personally would prefer a byte pattern if we were to do this.

Also it is quite safe to do this by default, and in the few performance sensitive cases, have a way to get uninitialized memory.

sbromberger commented 9 years ago

> I personally would prefer a byte pattern if we were to do this.

I'm genuinely curious - why would a byte pattern be preferable to zeros, especially when new pages are supposedly zeroed by the OS by default?

JeffBezanson commented 9 years ago

A byte pattern makes it easier to find uses of uninitialized values. The implication is that people must make sure to manually initialize everything, or else they will get some big useless value which at least makes it easier to find the bug.

However, this strikes me as going out of our way to slap people on the wrist. If we are going to put in the effort to guarantee initialization, I'd rather do people a favor and initialize with a likely-useful value (zero). You'd never need to write Foo(x,y) = new(0,0). And given calloc, there might be a performance advantage.
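
To make the trade-off concrete, here is a small C sketch of what a reader of uninitialized memory would see under each policy. The helper names are hypothetical, and 0xAA is simply the 0b10101010 filler mentioned earlier in the thread, not a pattern Julia uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fill a buffer with one repeated byte, as a debugging allocator might. */
static void fill_pattern(void *buf, size_t nbytes, unsigned char pattern)
{
    assert(buf != NULL || nbytes == 0);
    memset(buf, pattern, nbytes);
}

/* Read the first 8 bytes of a buffer as a signed 64-bit integer. */
static int64_t first_int64(const void *buf)
{
    int64_t v;
    memcpy(&v, buf, sizeof v);
    return v;
}
```

With a 0xAA fill, `first_int64` returns -6148914691236517206, which is hard to mistake for real data; with a zero fill it returns 0, which is often exactly the value the caller wanted.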

sbromberger commented 9 years ago

> they will get some big useless value which at least makes it easier to find the bug

Or, in a worst case, they will get a big useless value that is close enough to an expected value that it slips through, and causes some catastrophic failure down the line?

Unless Julia's going to explicitly test and warn on uninitialized values using this byte pattern (thereby voiding any legitimate uses of that particular pattern), I don't see the advantage, and I see two disadvantages: 1) as you said, calloc() provides an optimized zero, and writing a specific byte pattern might result in poorer performance; and 2) the principle of "do[ing] what is expected" seems to favor zeros.

StefanKarpinski commented 9 years ago

I think that either doing what we do now or initializing with zeros and having that be a specified, reliable behavior are the two best options.

johnmyleswhite commented 9 years ago

I think initializing to zeros is really the way to go unless there's a serious performance cost. It simplifies the mental model of how memory allocation works and provides a lot more security.

sbromberger commented 9 years ago

Proposal: zero-fill by default; provide a named parameter for an option to use "raw" malloc for when performance is über-critical.

The security issue is nothing to sneeze at, especially, for example, when building out web services with authenticated sessions. Also, it would make auditing things like Crypto.jl that much more complex.

vtjnash commented 9 years ago

fwiw, we appear to have some bugs in pcre.jl related to the unintentional use of undefined values from an Array(Ptr{T}, x), but zeros(Ptr{T}, x) doesn't work anymore (it's deprecated)

ivarne commented 9 years ago

Hmm.. That seems like a reasonable usage of zeros(Ptr{T}, x). Maybe we should have changed the documentation instead of deprecating the method?

vtjnash commented 9 years ago

and there's another one in socket.jl (I changed some local behavior of ccall that is causing these to become more visible, as segfaults)

ivarne commented 9 years ago

Thinking more about this, I have come to believe that not zeroing the memory in Array(Ptr{T}, x) is a mistake, and that it should be fixed in the array constructor rather than in a separate zeros method.

nalimilan commented 9 years ago

I think zero(::Ptr) and thus zeros(::Array{Ptr}) were not considered correct because C_NULL is not the additive identity for pointers.

andreasnoack commented 9 years ago

What about using fill(Ptr{T}(0), n) here?

eschnett commented 9 years ago

The consensus that seems to be forming is that initializing newly allocated memory should be the default, for security reasons. (There should also be a sufficiently obscure escape hatch for well-tested low-level library functions.) There's nothing wrong with zeros or fill, but rather with the Array constructor: it should choose safety by default.

johnmyleswhite commented 9 years ago

+1

I even agree with @stevengj's original point: those who need malloc dirty memory can just use ccall.

andreasnoack commented 9 years ago

@eschnett My point was not about the behavior of Array, but about zeros. I don't think zeros should be used when constructing arrays of null pointers. Others are better placed to decide on the default behavior of Array.

eschnett commented 9 years ago

@andreasnoack I agree.

JeffBezanson commented 9 years ago

fill(C_NULL, n) would be my favorite way to get an array of null pointers. But yes, Array should zero-fill as well.

sbromberger commented 9 years ago

Just checking: has this been implemented recently? I notice that newly-allocated arrays (from yesterday's master) are all getting zeros:

julia> a = Array(Float64,(6,6))
6x6 Array{Float64,2}:
 0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0

or very close to it:

julia> a = Array(Float64,(6,6))
6x6 Array{Float64,2}:
 0.0           9.88131e-324  1.4822e-323   1.97626e-323  2.96439e-323  3.45846e-323
 4.94066e-324  1.4822e-323   1.4822e-323   1.97626e-323  2.96439e-323  3.45846e-323
 9.88131e-324  9.88131e-324  1.4822e-323   2.47033e-323  2.96439e-323  3.95253e-323
 1.4822e-323   9.88131e-324  1.4822e-323   2.47033e-323  2.96439e-323  3.45846e-323
 4.94066e-324  9.88131e-324  1.97626e-323  2.47033e-323  3.45846e-323  3.95253e-323
 4.94066e-324  1.4822e-323   1.97626e-323  2.47033e-323  3.45846e-323  3.95253e-323

JeffBezanson commented 9 years ago

No, not implemented yet. Close doesn't count! Very often you'll get zeros purely by accident since new pages from the OS are already zeroed.
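
The near-zero values in the output above are what small leftover integers look like when their bits are read as Float64. A minimal C sketch of that reinterpretation (the helper name `bits_to_double` is hypothetical, and IEEE 754 doubles are assumed):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 64-bit pattern as a double without changing any bits. */
static double bits_to_double(uint64_t bits)
{
    double d;
    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Under that assumption `bits_to_double(1)` is the smallest subnormal, about 4.94066e-324, and the patterns 2 and 3 give about 9.88131e-324 and 1.4822e-323, matching entries in the second array. In other words, that memory held small integers rather than zeros.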

sbromberger commented 9 years ago

Ah, ok. Thanks for the update :) It was weird that I was getting values < eps(Float64).

sbromberger commented 9 years ago

Hi all,

Given the proposed feature freeze (https://groups.google.com/d/msg/julia-dev/s2-Zj3acL_g/Nw7MV8MT3QwJ), could I suggest that we get this in prior to 0.4? Thanks.

vtjnash commented 9 years ago

i've put a milestone target on this so it doesn't get lost. if there's a PR sooner, then I don't see why it couldn't be added to v0.4 (or perhaps even v0.4.x)

sbromberger commented 9 years ago

@vtjnash @stevengj

https://github.com/JuliaLang/julia/issues/9147#issuecomment-64924076

> I made an experimental branch that uses calloc instead of malloc, and so far I haven't been able to detect any performance differences (all the differences are swamped by the measurement noise) on MacOS.

Is this branch still available? If so, could it form the basis of the PR?

vtjnash commented 9 years ago

i had looked that over briefly. i think it was based on the old GC and was more a proof of concept than a full analysis and implementation.

JeffBezanson commented 9 years ago

I'm willing to put this in 0.4 if a PR arises.

ViralBShah commented 9 years ago

I probably don't understand all the cases well enough, but is using calloc sufficient? I thought that memory that gets GC'ed will also need reinitialization. Stating the obvious, but it seems like if we do this, we should do it across the board.

JeffBezanson commented 9 years ago

The change shouldn't be too bad:

- use calloc instead of malloc where necessary
- add extra zero stores to emit_new_struct
- add zero stores and memsets in a couple places in alloc.c and array.c

ScottPJones commented 9 years ago

:+1: to doing this for 0.4, but maybe just for scalar values... I'd be concerned about large arrays, esp. when they get totally filled up immediately after getting allocated (as in the string conversion functions).

ScottPJones commented 9 years ago

Ah, spoke too soon :sad: @sbromberger's idea is exactly what I'd want... some way of telling Julia that the Array{Uint8,1}(100000000) I just allocated is "raw".

stevengj commented 9 years ago

@ScottPJones, for large arrays, I couldn't measure any performance penalty to calloc (any difference from malloc was in the noise). As I understand it, modern calloc implementations don't actually memset the memory to zero, they just generate copy-on-write references to a special pre-allocated page of zeros. Do you have any data to indicate there is ever a significant penalty on modern systems?
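
Anyone wanting to repeat this comparison on their own system could start from something like the following C sketch. It is not the benchmark used in the thread; `time_zeroed_alloc` is a hypothetical helper, and `clock()` only gives coarse CPU time.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Allocate n zeroed bytes either via calloc or via malloc + memset, then
 * free them. Returns elapsed CPU seconds, or -1.0 if allocation failed. */
static double time_zeroed_alloc(size_t n, int use_calloc)
{
    assert(n > 0);
    clock_t t0 = clock();
    unsigned char *p;
    if (use_calloc) {
        p = calloc(n, 1);
    } else {
        p = malloc(n);
        if (p != NULL)
            memset(p, 0, n);
    }
    if (p == NULL)
        return -1.0;
    /* Read one byte through a volatile so the allocation is actually used. */
    volatile unsigned char sink = p[n / 2];
    (void)sink;
    free(p);
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Results will depend on the allocator, the OS, and the allocation size. Some optimizing compilers may also turn `malloc` followed by `memset` into `calloc`, so the two paths may need to be compiled without optimization to stay distinct.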

ScottPJones commented 9 years ago

@stevengj My data was probably seriously out of date :wink: I'd benchmarked exactly this issue over 20 years ago... had serious slowdowns when doing large allocations... Keyword in what you said is "modern"... If that's going to be the case even for smaller array allocations (say for a 512 byte-64K string), I'm happy. [too much of my experience goes back to the dawn of time... it's always useful to retest your performance assumptions every few years...]

garrison commented 9 years ago

Related thought: What about calling resize! on a Vector? Currently this appends uninitialized values if needed, but if the current issue is fixed as planned I think it might make more sense for it to append zeros (on types where that is possible).
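
At the C level, zero-filling on growth amounts to clearing only the newly added tail. A minimal sketch, assuming a plain byte buffer; `resize_zeroed` is a hypothetical name and this is not how Julia's resize! is implemented:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Grow or shrink `buf` from old_n to new_n bytes; any bytes beyond old_n are
 * zeroed. Returns NULL on failure, in which case `buf` is still valid. */
static void *resize_zeroed(void *buf, size_t old_n, size_t new_n)
{
    assert(new_n > 0); /* realloc with size 0 is not portable */
    unsigned char *p = realloc(buf, new_n);
    if (p == NULL)
        return NULL;
    if (new_n > old_n)
        memset(p + old_n, 0, new_n - old_n);
    return p;
}
```

The existing elements are left untouched, so the extra cost is proportional to the number of appended bytes rather than to the whole buffer.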
