Closed StefanKarpinski closed 11 years ago
I was unaware of the importance of this until Stefan pointed this out to me:
julia> a = Complex(1,1)
1 + 1im
julia> b = a
1 + 1im
julia> a.re = 2
2 + 1im
julia> a
2 + 1im
julia> b
2 + 1im
Note: I proposed that Immutable could be an abstract type in this email thread. Jeff seems hesitant about this idea and no conclusion was reached.
I have a plan. Complex numbers are important for 1.0; we especially need them for FFTs and the LAPACK interface. And we need to at least match the capabilities of matlab, octave, fortran, etc. Therefore I'm going to try to define Complex64 and Complex128 as bits types, which will give us the packed arrays, immutability, and performance we need. Complex{Foo} can wait for 2.0.
Ok, that's definitely a plan. I think the packed arrays are much more important than the immutability. For v1.0 we can just say "don't mutate complex numbers" and leave it at that. Any chance we can do it that way?
Yeah, this way complex will work like other numeric types, in julia as well as other languages. You won't be able to mutate it or access fields with dot syntax. And we can already stack allocate bits types, so we could get some great performance. Only downside is you can't have complex with arbitrary component types, but that doesn't have many use cases. But, it's easy to define new complex types in julia.
Can we can keep Complex{T} and have Complex64 and Complex128 as special types that implement the same basic functionality but more efficiently? Also, should those be called Complex32 and Complex64 instead? I know they have twice that many bits, but you could argue that the naming is after the size of each part, not the total.
Yes, we can keep it (maybe at the cost of some confusion), and the complex math routines can be made generic enough. I think the names should refer to the total size of the type, like complex*16 in fortran.
I feel there is a need for this, if only to restrict hashing to immutable values in the same way that Python does.
Take this example:
julia> a = [1]; b = [2]; c = {a=>"one", b=>"two"}; a[1] = 2; b[1] = 1; c
{[1]=>"two",[2]=>"one",}
julia> c[[1]]
key not found: [1]
in ref, /home/lachlan/julia/j/table.j:19
julia> c[a]
key not found: [2]
in ref, /home/lachlan/julia/j/table.j:19
julia> c[[1]] = "ONE"; c[[2]] = "TWO"; c
{[1]=>"two",[2]=>"TWO",[2]=>"one",[1]=>"ONE",}
If the hash function is only defined for immutable types by default, then potentially confusing/error-prone situations like this could be avoided. I would be forced to use tuples instead of arrays for my hashtable keys.
Apologies in advance if I am joining an already decided conversation. I can't see the linked discussion threads in the julia-math Google Group as I am not a member.
Yes, technically you are right on this. I'm familiar with http://home.pipeline.com/~hbaker1/ObjectIdentity.html.
Computing hash values of arrays, and comparing arrays both make sense, but not using them as table keys. In fact the only use that comes to mind is memoizing certain kinds of functions. But if a memoizer copied arrays before using them as keys, it would be safe in practice. Disallowing that seems too strict.
There's also issue #359, which proposes using const
fields to express fine-grained immutability. I fully immutable object would be one in which all fields are const
. I think that copying non-const objects upon insertion into a hash is a pretty reasonable way to go. Of course, you can still iterate the keys and then modify the copied key, which might be even more confusing. In order to really solve this, we would need immutable versions of all types :-/
I believe that immutable data structures are very important for writing robust code and become even more important when parallel programming is taken into account. The functional programming community has made this point quite clearly.
What I think a modern programming language should have is the possibility to define fully immutable data structures, including immutable container types, with the possibility to test at compile time (through the type system) and at runtime if a given piece of data is immutable. Immutable container types can be implemented efficiently using persistent data structures (check out the Clojure language (http://clojure.org/) for a real-life implementation). A fully immutable data structure is an immutable whose fields (for structure-like types) and/or elements (for container types) are immutable as well.
The advantage of immutable data structures is that they can be safely passed around between functions and concurrent tasks, stored in files or sent over the network, without anyone being able to mess them up, not even accidentally. For an extended argumentation, see http://doi.ieeecomputersociety.org/10.1109/MCSE.2012.11 (ask me for a copy if you can't access it).
Hey there, stumbled across this exact issue...... Perhaps I don't quite understand the bitstype hierarchy of complex numbers, but I can still do this:
julia> a = 1 + 1im
1 + 1im
julia> b = a
1 + 1im
julia> a.re = 2
2
julia> b
2 + 1im
The Complex64/128
bitstypes don't have an re
field, so this problem is dodged. Why does 1 + 1im
yield a ComplexPair
instead of a Complex64/128
?
@staticfloat Perhaps this will help:
julia> super(Complex64)
Complex{Float32}
julia> typeof(1+1im)
ComplexPair{Int64}
julia> typeof(1.+1im) #note the extra dot that promotes this to float arithmetic
Complex128
In summary, Complex64/Complex128 are Float types. There is no currently available analog for Complex integers, except for the fallback ComplexPair type.
Ahhh, I get it. Complex64
is really a Floating-point type... And ComplexPair
is the fallback where you just have two arbitrary datatypes (such as integer) in real-imaginary pair, and this issue is tracking making compound types like ComplexPair
immutable. Thanks, @vtjnash.
I just wanted to document the current thinking that Jeff and I cooked up when discussing this issue recently. The type
keyword will introduce a new immutable composite type, and reftype
will be the keyword for introducing a new mutable composite type. The semantics are that reftypes have a well-defined location and what you're passing around is really just a reference to that single location, hence the name. Normal, immutable types, on the other hand, won't have a well-defined notion of location – they are just values, and the compiler can move or copy them freely. The naming is intended to make people think of values as the normal kind of type, as they should be for most things. There will be no way to make just some of the fields of a type mutable. Instead, you can just have an immutable value that has a field that's of a mutable type. This reduces the number of features and will probably force better design.
Sounds good, but i'd rather we leave type
alone, and call immutable types value
. Most users are going to expect types to work like structs.
We kind of want to force people to use immutable value types as much as possible. Basically unless you actually have a need for something that has a well-defined location, you should be using an immutable value type instead. This is the sequence I'm hoping for:
If type
was mutable, people would just go ahead using those and never be enlightened since none of this would ever happen and we'd have to keep prosthelytizing futilely for immutability for the rest of our lives.
That's crazy. If immutable types are useful, people will use them. This is the sequence i fear:
type
I'm not overly concerned about that. We've been advertising that we're going to do this forever :-)
Can I ask for some clarification? Let's consider a Range{Int}
, which I'm guessing will be a candidate for being an immutable type. Let's also suppose A
is a matrix (two-dimensional). I can say r = 1:stride(A,2):length(A)
and then use r
to iterate along the first row. Once I've done the first row, currently I can move on to the second row with r.start += 1
, and I don't have to copy the step
or len
entries. IIUC, making Range{Int}
immutable will break this behavior, right? That is, having the type be immutable is a promise to the compiler that none of the fields will change their values.
Or, is it less stringent than that? Is "immutable" basically the promise that a CompositeKind can be effectively represented by a single pointer, and (as in C) p->a
is always at a defined (compiler-known) offset relative to *p
, so that you can avoid the need for a lookup table of field pointers for each instance of an object? (And sizeof
will work for it.) That is, for these two types,
type Type1
a::Int64
b::Float64
end
type Type2
a
b::Float64
end
the first could be an immutable type with no change in behavior (i.e., I'd still be allowed to say t1.a = 5
), but the second would not be "immutizable" because you don't know anything about the type of a
and it indeed might change on you.
Or none of the above?
I'm just trying to understand; ATM I'm not saying that the first "more stringent" is horrible, because (taking the Range
example) I can imagine there might be compiler optimizations possible in an inner loop that knows that the step
is a constant. So I'd get an overall performance boost even if it feels wasteful to create a whole new Range
value for each row. But it certainly would break more code, and therefore might lead to some distress.
Yes, it's the strict version. If Range is an immutable type then doing r.start += 1
is an error. Which I think it should be because that's a pretty weird thing to do. (We aren't doing that in our code somewhere, are we?)
Changing type
at this point feels gratuitous to me too. Surely educating/convincing people about immutability does not simply hinge on the choice of keyword.
@timholy it's the most stringent version. But what happens is that we can then turn "creating a whole new Range" into not actually allocating a new object. Ideally code that used to do r.start+=1
won't get any slower (when rewritten not to use mutation), plus code that used to repeatedly allocate will get faster.
I should point out that LLVM (like other compilers) is really good at optimizing away copies of immutable values. When you do something which, in principle, copies a Range object with one field modified, it will very often actually just be modified in place. Even more importantly, it is quite possible that the Range object doesn't actually live anywhere in memory – there is no contiguous Range object allocated, but rather pieces of it can just be stored in registers and never saved to memory at all. You can't do this (easily) with mutable values since they need to live somewhere in memory in order to support mutable semantics.
So valtype
? immutable
? Using value
as a keyword seems like a bad idea.
Thanks for the explanations. If type
doesn't change it's meaning, then I think there's nothing of concern to discuss. Carry on.
@StefanKarpinski, I don't know of a place where it is happening in Julia's main repository now. It's more of a C-like thing to do: I've got one example like that in my Grid module (to be released once I get to the "finish line" with HDF5, aaargh), where a Range
is basically being used as a cleaner interface to Blas.axpy!
. But it's not a necessary interface, and there might even be some small advantages in using axpy
's raw (but not pretty) interface. So it's not a concern. Now I know not to bother with such constructs.
I did just add a valtype
function to dict.jl
. Feel free to change if you want to co-opt it as a keyword.
I was looking at the range code recently, and it is mentioned in the comments that ranges are meant to be immutable. Of course, the comment doesn't actually prevent the type from being mutated. ;-)
Question: are there plans (tentative or otherwise) for immutable container types as well?
I, for one, welcome our new non-mutating type overlords... err, I mean, immutable types look interesting.
immutable
sounds like the best name to me. Like @timholy, I don't have much intuition for how this will actually work. So I'll be one of the people you'll be educating about the virtues of immutability.
Another thought: I'm guessing this means that immutable types are not, as I once thought, part of the plan for a built-in, zero-copy version of strpack. Assuming this is something Julia will eventually have, instead you're planning to implement this functionality by enhancing mutable types. However, it will be an error if you try this on a type for which the memory layout is not uniquely specified by the type declaration. Correct?
The longer something is to type, the less people will want to use it frequently. type
is nice. valtype
is okay. immutable
might be a bit long.
I'll actually vote for valtype
, since it shows a direct relationship to type
. The struggle here is getting users to think of values as immutable.
As an aside, I really like the idea that julia is borrowing so many good ideas from other languages. The idea of mixing immutable and mutable types in a language (and strongly encouraging immutable use) was one of the things I liked about scala early on, before it got (or before I understood it to be) all convoluted(*). After this, all I'd like to see is better pattern matching.
() scala wasn't the first to do this (), of course. () by "this", I mean, "mix immutable and mutable types", not "get all convoluted". Many languages (cough *R cough) have done that.
Ok, done :)
PARTY TIME!!1!¡!
I am really looking forward to this. So much so that I wonder if I'll have to talk myself into supporting 0.1 in Images.jl...
Any noticeable bumps in performance on benchmarks? Or is it too early to see many of the benefits?
Nice job, Jeff!
This makes me feel like coding :)
The conclusion of our last discussion on immutability versus mutability was that it is largely a psychological distinction:
To declare that a new concrete type is immutable, prefix its declaration with immutable:
Alternately, only the immutable keyword could be used: