dotnet / csharplang

Proposal: Immutable Types #2543

Open stephentoub opened 9 years ago

stephentoub commented 9 years ago

Problem

One of the uses of 'readonly' fields is in defining immutable types, types that once constructed cannot be visibly changed in any way. Such types often require significant diligence to implement, both from the developer and from code reviewers, because beyond 'readonly' there’s little assistance provided by the compiler in ensuring that a type is actually immutable. Additionally, there’s no way in the language (other than in type naming) for a developer to convey that a type was meant to be immutable, which can significantly impact how it’s consumed, e.g. whether a developer can freely share an instance of the type between multiple threads without concern for race conditions.

Consider this type:

public class Person
{
    public Person(string firstName, string lastName, DateTime birthDay)
    {
        FirstName = firstName;
        LastName = lastName;
        BirthDay = birthDay;
    }

    public string FirstName { get; }
    public string LastName { get; }
    public DateTime BirthDay { get; }

    public string FullName => $"{FirstName} {LastName}";
    public TimeSpan Age => DateTime.UtcNow - BirthDay;
}

Writing this type requires relatively minimal boilerplate. It also happens to be immutable: there are no exposed fields, there are no setters on any properties (the get-only auto-props will be backed by 'readonly' fields), all of the fields are of immutable types, etc. However, there is no way for the developer to actually express the intent to the compiler that an immutable type was desired here and thus get compiler checking to enforce this. At some point in the future, a developer could add a setter not realizing this type was meant to be immutable, and all of a sudden consumers of this type that were expecting full immutability (e.g. they'd avoided making defensive copies) will now be very surprised:

public class Person
{
    public Person(string firstName, string lastName, DateTime birthDay)
    {
        FirstName = firstName;
        LastName = lastName;
        BirthDay = birthDay;
    }

    public string FirstName { get; }
    public string LastName { get; }
    public DateTime BirthDay { get; set; } // Oops!

    public string FullName => $"{FirstName} {LastName}";
    public TimeSpan Age => DateTime.UtcNow - BirthDay;
}

Similarly, the class could be augmented with an additional 'readonly' property but of a non-immutable type:

public class Person
{
    public Person(string firstName, string lastName, DateTime birthDay, Person[] ancestors)
    {
        FirstName = firstName;
        LastName = lastName;
        BirthDay = birthDay;
        Ancestors = ancestors;
    }

    public string FirstName { get; }
    public string LastName { get; }
    public DateTime BirthDay { get; }
    public Person[] Ancestors { get; } // Oops!

    public string FullName => $"{FirstName} {LastName}";
    public TimeSpan Age => DateTime.UtcNow - BirthDay;
}

And so on. The developer has tried to design an immutable type, but without a way to declare that fact, and without compiler verification of that declaration, it is easy for bugs to slip in.
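To make the array case concrete, here is a minimal sketch in today's C# (no 'immutable' keyword). It uses a trimmed-down Person with only the members needed for the illustration, and the names in Main are made up for the example. It shows the observable state of an object changing even though no setter exists:

```csharp
using System;

public sealed class Person
{
    public Person(string firstName, Person[] ancestors)
    {
        FirstName = firstName;
        Ancestors = ancestors; // stores the caller's array, no defensive copy
    }

    public string FirstName { get; }
    public Person[] Ancestors { get; } // get-only, but the array itself is mutable
}

public static class Program
{
    public static void Main()
    {
        var grandparent = new Person("Ada", new Person[0]);
        var stranger = new Person("Bob", new Person[0]);
        var child = new Person("Cy", new[] { grandparent });

        // No setter is involved, yet what 'child' exposes has changed.
        child.Ancestors[0] = stranger;
        Console.WriteLine(child.Ancestors[0].FirstName); // prints "Bob"
    }
}
```

The get-only property only protects the reference to the array, not the elements stored in it.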

Solution: Immutable Types

We can introduce the notion of immutable types to C#. A type, either a class or a struct, can be annotated as "immutable":

public immutable class Person
{
    public Person(string firstName, string lastName, DateTime birthDay)
    {
        FirstName = firstName;
        LastName = lastName;
        BirthDay = birthDay; 
    }

    public string FirstName { get; }
    public string LastName { get; }
    public DateTime BirthDay { get; }

    public string FullName => $"{FirstName} {LastName}";
    public TimeSpan Age => DateTime.UtcNow - BirthDay;
}

When such an annotation is applied, the compiler validates that the type is indeed immutable. All fields are made implicitly readonly (though it’s ok for a developer to explicitly state the 'readonly' keyword if desired) and must be of immutable types (all of the core types like Int32, Double, TimeSpan, String, and so on in the .NET Framework would be annotated as immutable). Additionally, the constructor of the type would be restricted in what it can do with the 'this' reference, limited only to directly reading and writing fields on the instance, e.g. it can’t call methods on 'this' (which could read the state of the immutable object before it was fully constructed and thus later perceive the immutable type as having changed), and it can’t pass 'this' out to other code (which could similarly perceive the object changing). This includes being prohibited from capturing 'this' into an anonymous method in the ctor. A type being 'immutable' doesn't mean that its operations are pure, just that the state within the object can't observably change; an immutable type would still be able to access statics, could still mutate mutable objects passed into its methods, etc.
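The restriction on 'this' in constructors can be motivated with a small sketch in today's C#. The Shape and Labeled types below are hypothetical and exist only for this illustration. A constructor calls a virtual member on 'this', so code runs against a readonly field before it has been assigned, and the same field is later seen with a different value:

```csharp
using System;

public class Shape
{
    public Shape()
    {
        // Calling a virtual member on 'this' runs derived code before the
        // derived constructor body has assigned its fields.
        DescriptionSeenDuringConstruction = Describe();
    }

    public string DescriptionSeenDuringConstruction { get; }

    public virtual string Describe() => "shape";
}

public sealed class Labeled : Shape
{
    private readonly string _label;

    public Labeled(string label) { _label = label; }

    public override string Describe() => _label ?? "(unset)";
}

public static class Program
{
    public static void Main()
    {
        var item = new Labeled("circle");
        Console.WriteLine(item.DescriptionSeenDuringConstruction); // prints "(unset)"
        Console.WriteLine(item.Describe());                        // prints "circle"
    }
}
```

Both values come from the same readonly field, which is the kind of observable change the proposed rule is meant to rule out.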

The 'immutable' keyword would also work as an annotation on generic types.

public immutable struct Tuple<T1, T2>
{
    public Tuple(T1 item1, T2 item2) { Item1 = item1; Item2 = item2; }
    public T1 Item1; // Implicitly readonly
    public T2 Item2; // Implicitly readonly
}

Applying 'immutable' to a type with generic parameters would enforce all of the aforementioned rules, except that the generic type parameters wouldn't be enforced to be immutable: after all, without constraints on the generic type parameters, there’d be no way for the implementation of the open generic to validate that the type parameters are immutable. As such, a generic type annotated as 'immutable' can be used to create both mutable and immutable instances: a generic instantiation is only considered to be immutable if it’s constructed with known immutable types:

void Usage<U>()
{
    Tuple<string, string>         local1; // considered immutable
    Tuple<int, string>            local2; // considered immutable
    Tuple<int, IC>                local3; // considered immutable
    Tuple<Tuple<int, string>, int> local4; // considered immutable
    Tuple<string, U>              local5; // considered mutable
    Tuple<C, C>                   local6; // considered mutable
    Tuple<Tuple<int, C>, int>     local7; // considered mutable
}
immutable class IC { }
class C { }

Such concrete instantiations could be used as fields of other immutable types iff they're immutable. But whether a generic instantiation is considered to be immutable or not has other effects on consumers of the type, for example being able to know that an instance is immutable and thus can be shared between threads freely without concern for race conditions. As such, the IDE should do the leg work for the developer and highlight whether a given generic instantiation is considered to be immutable or mutable (or unknown, in the case of open generics).

However, the immutability question also affects other places where the compiler needs to confirm that a type is in fact immutable. One such place would be with a new immutable generic constraint added to the language (there are conceivably additional places in the future that the language could depend on the immutability of a type). Consider this variation on the tuple type previously shown:

public immutable struct ImmutableTuple<T1, T2>
    where T1 : immutable
    where T2 : immutable
{
    public ImmutableTuple(T1 item1, T2 item2) { Item1 = item1; Item2 = item2; }
    public T1 Item1;
    public T2 Item2;
}

The only difference from the previous version (other than a name change for clarity) is that we’ve constrained both generic type parameters to be 'immutable'. With that, the compiler would enforce that all types used in generic instantiations of this type are 'immutable' and satisfy all of the aforementioned constraints.

void Usage<U>()
{
    ImmutableTuple<string, string>         local1; // Ok
    ImmutableTuple<int, string>            local2; // Ok
    ImmutableTuple<int, IC>                local3; // Ok
    ImmutableTuple<Tuple<int, string>, int> local4; // Ok
    ImmutableTuple<string, U>              local5; // Error: ‘U’ is not immutable
    ImmutableTuple<C, C>                   local6; // Error: ‘C’ is not immutable
    ImmutableTuple<Tuple<int, C>, int>     local7; // Error: ‘Tuple<int,C>’ is not immutable
}
immutable class IC { }
class C { }

With such constraints, it’s possible to create deeply immutable types, both non-generic and generic, and to have the compiler help fully validate the immutability.

However, there are times when you may want to cheat, where you want to be able to use the type to satisfy immutable constraints, and potentially have some of the type’s implementation checked for the rules of immutability, but where you need to break the rules in the implementation in a way that’s still observably immutable but not physically so. For example, consider building an ImmutableArray type that wraps an underlying array. As arrays are themselves mutable (code can freely write to an array’s elements), it’s not normally possible to store an array as a field of an immutable type:

public immutable class ImmutableArray<T>
{
    readonly T[] m_array; // Error: The types of fields in immutable types must be immutable
    …
}

To work around this, we can resort to unsafe code. Marking an immutable type as 'unsafe' would disable the rule checking for immutability in the entire type and put the onus back on the developer to ensure that the type really is observably immutable, while still allowing the type to be used in places that require immutable types, namely generic immutable constraints. Marking a field as unsafe would disable the rule checking only related to that field, and marking a method as unsafe would disable the rule checking only related to that method. A type that uses unsafe needs to ensure not only that it still puts forth an immutable facade, but that its internal implementation is safe to be used concurrently.

public immutable unsafe struct ImmutableArray<T>
{
    readonly T[] m_array; // Ok, but we’re now responsible again for ensuring immutability

    private ImmutableArray(T[] array) { m_array = array; }

    private ImmutableArray(T[] array, T nextItem) : this(new T[array.Length + 1])
    {
        Array.Copy(array, m_array, array.Length);
        m_array[array.Length] = nextItem;
    }

    public ImmutableArray<T> Add(T item) => new ImmutableArray<T>(m_array, item);

    public T this[int index] => m_array[index];
    public int Length => m_array.Length;
    ...
}

Delegates could also be marked as immutable, and a set of ImmutableAction and ImmutableFunc types would be included in the framework. As with other immutable types, all of the objects reachable from an immutable delegate instance would need to be immutable, which means that an immutable delegate could only bind to methods on immutable types. That in turn means that, when an anonymous method binds to an immutable delegate type, that anonymous method may only capture immutable state. Further, any locals captured into the lambda must either be from a 'readonly' value (#115) or must be captured by value (#117). This ensures that the fields of the display class can be 'readonly' and that the method which created the lambda can’t reassign the captured values after creating the lambda.

public void Run()
{
    readonly int local1 = …;
    int local2 = …;
    C local3 = …;

    ImmutableAction action1 = () => {
        Console.WriteLine(local1.ToString()); // Ok, captured readonly immutable
        Console.WriteLine(local2.ToString()); // Error: ‘local2’ must be captured by value
        Console.WriteLine(local3.ToString()); // Error: ‘local3’ is mutable
    };

    ImmutableAction action2 = [val local2]() => {
        Console.WriteLine(local2.ToString()); // Ok, captured non-readonly immutable by value
        local2 = 0;                           // Error: ‘local2’ is readonly
    };
}

Alternatives

The 'immutable' attribution would be deep, meaning that an instance of an immutable type and all of the types it recursively references in its state would be immutable. In contrast, we could consider a shallow version, with a 'readonly' attribute that could be applied to types. As with 'immutable', this would enforce that all fields were readonly. Unlike 'immutable', it would place no constraints on the types of those fields also being immutable.

scalablecory commented 9 years ago

General idea is a good one. I believe it will require CLR support to enforce across languages.

Can we rework how "unsafe" works? This keyword has specific connotations around memory safety that I'm not sure I like diluting. I also think it may be safer and better self-documenting if it is specified on specific fields. Something like:

immutable struct ImmutableArray<T>
{
    readonly mutable T[] array;
}

Though a "readonly mutable" sounds a little funny.

sharwell commented 9 years ago

All fields are made implicitly readonly

I would prefer a requirement that fields be marked as readonly.

Additionally, the constructor of the type would be restricted in what it can do with the 'this' reference, limited only to directly reading and writing fields on the instance

I would prefer this be a warning. While unlikely and generally not recommended, it's hard to state deterministically that no one will need to be able to write code like this.

sharwell commented 9 years ago

A Roslyn-based analyzer and a new [Immutable] attribute would support compile-time checking of this feature, provided certain types in core .NET assemblies were whitelisted in the implementation.

The biggest concern I would like to see resolved prior to implementing this is how we handle the builder pattern as cleanly (in my opinion) as the WithHeaders method I added here:

https://github.com/sharwell/openstack.net/commit/a09fc524ac19cf946c8cb16af030b730f439cf1d

For this implementation (specifically for StorageMetadata.WithHeadersImpl), I had to make two fields mutable even though the type remained immutable to the outside world.

svick commented 9 years ago

Additionally, the constructor of the type would be restricted in what it can do with the 'this' reference, limited only to directly reading and writing fields on the instance

I would prefer this be a warning. While unlikely and generally not recommended, it's hard to state deterministically that no one will need to be able to write code like this.

Maybe if you marked the constructor as unsafe, this kind of code could be allowed? (I also dislike using unsafe this way, but mutable on constructor would make even less sense.)

svick commented 9 years ago
public class Person
{
    public Person(string firstName, string lastName, DateTimeOffset birthDay)
    {
        FirstName = firstName;
        LastName = lastName;
        BirthDay = birthDay;
    }

    public string FirstName { get; } = firstName;
    public string LastName { get; } = lastName;
    public DateTime BirthDay { get; set; } = birthDay; // Oops!

    public string FullName => "\{FirstName} \{LastName}";
    public TimeSpan Age => DateTime.UtcNow – BirthDay;
}

In this and the following examples, you're using property initializers. I think they shouldn't be there.

stephentoub commented 9 years ago

Oops, thanks, @svick. Fixed.

stephentoub commented 9 years ago

@sharwell, that's true, but I'd previously written these examples using primary constructors, and without primary constructors, what I'd written didn't make sense. Instead I'm initializing those fields in the regular constructor.

stephentoub commented 9 years ago

@sharwell, strange, I could have sworn I was responding to a comment you'd written... it's almost as if it was there and then someone deleted it ;) Oh well.

sharwell commented 9 years ago

@stephentoub I'd love to get your feedback regarding the commit I mentioned above. In particular, the changes to StorageMetadata.cs and the new file StorageMetadataExtensions.cs.

The intent is for a user to be able to write var newItem = item.WithProperty(value), and have the static type of newItem match the static type of item, even if WithProperty is defining a property on one of that type's base classes.

sharwell commented 9 years ago

@stephentoub I currently believe all of your proposed functionality can be provided by a combination of the following items:

  1. A new ImmutableAttribute type which can be applied to classes and structs.
  2. A diagnostic analyzer which ensures the conditions above are met.

Since the attribute is trivial, I'll explain my point regarding the analyzer.

The fun part is generic type constraints. As it turns out, you probably don't actually need them.

jaredpar commented 9 years ago

@sharwell

Sure, an analyzer could absolutely be used here to enforce these rules. In fact it can even be done as a unit test with reflection. I've actually written such code in past projects.

I don't think an analyzer is the right solution here though. Analyzers are great at enforcing a set of rules, or even to a degree a dialect of C#, within a single C# project. I control the compilation, so I can pick what analyzers I want to use.

Analyzers are less effective when there is a need to enforce rules across projects. In particular when those projects are owned by different people. There is no mechanism for enforcing that a given project reference was scanned by a particular analyzer. The only enforcement that exists is a hand shake agreement.

Immutable types, though, are a feature that requires cross-project communication. The immutability of my type is predicated on the immutability of the type you control that I am embedding. If you break immutability in the next version you have broken my code. In my opinion that kind of dependency is best done directly in the language.
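For readers wondering what the reflection-based check mentioned above might look like, here is a rough sketch. It is not the code from any of the projects referred to in this thread. ImmutabilityCheck, LooksImmutable and the hand-maintained KnownImmutable list are all assumptions made for the illustration, and the check is deliberately conservative and shallow in places (for example, it does not consider that a non-sealed class can be subclassed with mutable state):

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class ImmutabilityCheck
{
    // Assumption: a hand-maintained list of types we choose to trust.
    private static readonly HashSet<Type> KnownImmutable = new HashSet<Type>
    {
        typeof(string), typeof(decimal), typeof(DateTime), typeof(TimeSpan)
    };

    public static bool LooksImmutable(Type type) =>
        LooksImmutable(type, new HashSet<Type>());

    private static bool LooksImmutable(Type type, HashSet<Type> visiting)
    {
        if (type.IsPrimitive || type.IsEnum || KnownImmutable.Contains(type)) return true;

        // Arrays are mutable; interfaces and type parameters are unknown,
        // so they are conservatively treated as mutable.
        if (type.IsArray || type.IsPointer || type.IsInterface || type.IsGenericParameter) return false;

        // A type already being examined (self-reference) is not re-entered.
        if (!visiting.Add(type)) return true;

        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public |
                                   BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

        // Walk the inheritance chain so private fields of base classes are included.
        for (Type t = type; t != null && t != typeof(object); t = t.BaseType)
        {
            foreach (FieldInfo field in t.GetFields(flags))
            {
                // Every field must be readonly and of a type that itself looks immutable.
                if (!field.IsInitOnly || !LooksImmutable(field.FieldType, visiting)) return false;
            }
        }
        return true;
    }
}

public sealed class Point { public Point(int x, string name) { X = x; Name = name; } public int X { get; } public string Name { get; } }
public sealed class Counter { public int Value { get; set; } }
public sealed class Holder { public Holder(int[] items) { Items = items; } public int[] Items { get; } }

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(ImmutabilityCheck.LooksImmutable(typeof(Point)));   // True
        Console.WriteLine(ImmutabilityCheck.LooksImmutable(typeof(Counter))); // False: settable property
        Console.WriteLine(ImmutabilityCheck.LooksImmutable(typeof(Holder)));  // False: array field
    }
}
```

A check like this only covers assemblies it is actually run against, which is the cross-project limitation described above.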

Clockwork-Muse commented 9 years ago

Currently, when you access a readonly field in a method, it (in IL, mostly transparent to the developer) copies it to a local variable before doing anything with it. This constitutes a performance penalty. There's also this gem, buried in System.Collections.Immutable's array:

/// This type should be thread-safe. As a struct, it cannot protect its own fields
/// from being changed from one thread while its members are executing on other threads
/// because structs can change *in place* simply by reassigning the field containing
/// this struct. Therefore it is extremely important that
/// ** Every member should only dereference <c>this</c> ONCE. **
/// If a member needs to reference the array field, that counts as a dereference of <c>this</c>.
/// Calling other instance members (properties or methods) also counts as dereferencing <c>this</c>.
/// Any member that needs to use <c>this</c> more than once must instead
/// assign <c>this</c> to a local variable and use that for the rest of the code instead.
/// This effectively copies the one field in the struct to a local variable so that
/// it is insulated from other threads.

...which means currently implementing an "immutable" type is more subtle and fraught with danger than people may be aware (heisenbugs are likely to arise in client applications if they aren't aware of this issue).

I don't care too much about the particular end syntax used (whether we get a new keyword, or redefine the semantics of readonly, or whatever), but is the end result likely to fix these two issues?

  1. immutable fields shouldn't need to be copied before use, if the intent is that they can't be changed (transparently or otherwise).
  2. structs should receive equivalent protection to classes, so that this (and internal data!) can be safely referenced without issue...
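The "dereference this only once" guidance quoted above can be sketched as follows. FirstItemView is a hypothetical type written for this illustration, not the actual System.Collections.Immutable code. The member reads the field into a local exactly once and uses only the local afterwards, so a concurrent reassignment of the variable holding the struct cannot change what the member sees halfway through:

```csharp
using System;

public struct FirstItemView<T>
{
    private readonly T[] _items;

    public FirstItemView(T[] items) { _items = items; }

    public T FirstOrDefault()
    {
        // Single read of the field; every later use goes through the local.
        var items = _items;
        return items != null && items.Length > 0 ? items[0] : default(T);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(new FirstItemView<int>(new[] { 7, 8 }).FirstOrDefault()); // prints 7
        Console.WriteLine(default(FirstItemView<int>).FirstOrDefault());            // prints 0
    }
}
```

Reading _items twice (once for the length check and once for the indexer) would leave a window in which another thread could overwrite the struct between the two reads.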

MgSam commented 9 years ago

@stephentoub Have you seen Neal's proposal for pattern matching and records in C#? I think it addresses a lot of your concerns here.

stephentoub commented 9 years ago

@MgSam, thanks, yes, I have seen it.

sharwell commented 9 years ago

@jaredpar Overall I do agree with your comments. However, I also believe that some burden is placed on development teams to incorporate best practices described by libraries they depend on. There are all sorts of ways they can violate preconditions of libraries. One obvious example is Dictionary<TKey, TValue>, where post-conditions of methods and invariants for instances can be violated by the implementation if the user does one of the following:

  1. Uses an IEqualityComparer<TKey> (including the default if not specified) which produces different hash codes for the same object.
  2. Performs operations from multiple threads, where at least one of those operations is a mutating operation on the data structure.

While the first item would be challenging to prevent, it would be easy to address the concerns for the second point by synchronizing concurrent access to the object.

If immutable types were provided via optional static analysis, it is conceivable that many users would be able to consume them even without an analyzer because they are only prone to failure in very specific ways:

  1. Extending an immutable type and adding mutable properties.
  2. Supplying a mutable type for the generic argument to an immutable type, where that type argument becomes a property of the immutable object.

There are other ways to improve overall reliability, such as declaring a dependency on the analyzer package when creating a NuGet package for a library which defines immutable types. It would also be possible to package the ImmutableAttribute itself in the same NuGet package that provides the static analyzer.

For those wondering why I would push for an analyzer instead of a language change:

Concurrency remains a major challenge for modern application development. Synchronization constructs such as lock (or synchronized in Java) provide only limited solutions to these problems, especially when it comes to scale. One alternative approach is leveraging lock-free concurrent structures like those in System.Collections.Concurrent. Another approach is providing immutable data representations which are backed by data structures that support efficient transformation, such as those in System.Collections.Immutable.

For better or worse, these libraries do not provide out-of-the-box support for every scenario an application developer might encounter. Improving the ability of developers to extend these concepts will go a long way towards improving overall developer efficiency when creating reliable, scalable applications intended for concurrent environments.

In my opinion, the best approach would be to first implement this as an analyzer so people can start using it, and later consider incorporating it into the language and/or runtime. When you consider that several parts of the C# syntax, such as the out or params keywords applied to parameters, compile down to nothing more than applying an attribute (OutAttribute and ParamArrayAttribute in these cases), it's reasonable to think that a new immutable keyword for a class declaration could compile down to applying the ImmutableAttribute automatically. The only major change would be incorporating the analyzer into the standard compiler instead of distributing and enabling it separately.

jaredpar commented 9 years ago

The only major change would be incorporating the analyzer into the standard compiler instead of distributing and enabling it separately.

This is the crux of the issue for me. Immutable should mean immutable. I shouldn't have to think about it, or be watching your commit history to make sure that you haven't changed things. I should type immutable and get the expected behavior. Analyzers just don't provide that for me.

jods4 commented 9 years ago

The main issue with immutable types is how you create them. It's clumsy to efficiently create copies with multiple different fields, to create cycles in immutable object graphs and so on...

I propose the idea of mutation contexts:

  1. Immutable classes may have non-readonly fields. Setting a field outside of a mutation context is forbidden.
  2. Immutable classes may have property setters. A setter is a mutation context. Calling a setter from outside of a mutation context is forbidden.
  3. Immutable classes may have methods that are a mutation context. Calling such a method from outside a mutation context is forbidden. The syntax to designate such a method must be decided, but it could simply be an attribute.

    public immutable class C
    {
     public int X;
    
     [Mutates]
     public void Increment()
     {
       X += 1;  // Setting a field is OK because this method is a mutation context.
     }
    }
    
    new C().Increment(); // Error: Increment() can't be called from outside a mutation context.

  4. Functions can designate that they may mutate the state of one immutable parameter (syntax to be defined, maybe with an attribute). Such functions are a mutation context for the specified parameter and cannot be called from outside a mutation context.

    static class Utils
    {
     public static void Increment([Mutates] C c)
     {
       c.X += 1;  // OK because this method is a mutation context for c
     }
    }
    Utils.Increment(new C());  // Error: Utils.Increment can't be called from outside a mutation context.
  5. A new keyword mutate that has similar syntax as using creates a mutation context for one variable. This is the only way to establish a new mutation context and none of the other constructs 1-4 can be called outside of it. This is the really unsafe part of the code but it can be very useful. Typically, mutating a variable before you have "published" it for external readers is totally safe. Because an important use-case for immutability is concurrent programming, I suggest that a memory fence is added at the end of a mutate block. This will guarantee that once readers get the reference to the immutable instance, all its mutations are committed to memory and visible to all cores.

immutable class Person()
{
  public Person dad, mom;
  public Person[] children;  // This is an error, it should be ImmutableArray, I simplified a little bit.
}

static Person CreateSomeone()
{
  // This is dangerous but totally safe on non-shared new variables  
  mutate(var child = new Person())
  mutate(var dad = new Person())
  {
    // Could be inside the mutate as above, but to illustrate different syntax
    var mom = new Person();
    mutate(mom)
    {
      mom.children = dad.children = new[] { child };  // Should be ImmutableArray
      child.dad = dad;
      child.mom = mom;
      return child;
    }
  }
}

// It would be nice if we could also use Initializer syntax, either like this:
mutate(var x = new Person { dad = new Person(), mom = new Person() })
mutate(var d = x.dad, m = x.mom)  // Multiple values? probably not because inconsistent with using...
{
  d.children = m.children = new[] { x };
  return x;
}

// Or maybe like this, because initializer inside mutate() can be a very long expression
mutate(Person x)
{
  x = new Person 
  {
    dad = new Person(),
    mom = new Person()
  };
  mutate(var d = x.dad, m = x.mom)
    d.children = m.children = new[] { x };
  return x;
}

The code above shows how easy it would be to perform any operation on an immutable class that has not been shared yet. But once we are outside the mutate block it's safe. The only "risk" is to misuse a mutate block after an immutable class has been "published". I think that's acceptable (devs will always find their way to abuse, if only by reflection or unsafe code).

Note that to provide really strong guarantees any mutation context (either mutate block, method or setter) must be forbidden to store a reference to the immutable class or one of its Reference members into a static field or a capture context. I don't know if this should be enforced by compiler or is something that the dev must be responsible for.

I thought about using that as a replacement for the unsafe proposition in the issue, but it doesn't really work. The problem is that once you have (private) mutable members as an implementation detail, you pretty much have to wrap all your code with mutate(this). Even a getter may mutate a non-immutable class :(

jaredpar commented 9 years ago

@jods4

The only "risk" is to misuse a mutate block after an immutable class has been "published". I think that's acceptable (devs will always find their way to abuse, if only by reflection or unsafe code).

I disagree, that is precisely the problem that immutable types attempt to solve. They can be used without any context on how the type was created or care about who else has a reference to them. Once the possibility of mutation is introduced, even in a specified context, that guarantee goes away and they are just another mutating value.

The pattern you are describing here is valid but it more closely describes read only semantics vs. immutable.

sharwell commented 9 years ago

@jaredpar How would you have handled the StorageMetadata issue I described above? Edit: Not saying I disagree with you. This is simply an unsolved problem for me in the area of easy-to-use immutable types.

jods4 commented 9 years ago

@jaredpar C# provides you memory safety, but you can shoot yourself in the foot inside an unsafe block. To me, the mutate block is the same. Immutable are safe as long as you don't introduce a mutate block. Bonus: it makes it super-easy to construct your immutable objects, alleviating the need for lots of unwieldy APIs -- this is the usage mutate would be intended for and it is safe.

I also would like to point out that immutables will never be 100% safe in C#:

If you want immutable to be successful I think that you need to come up with a good solution for the construction problem (hint: the currently available T4 apis don't even come close).

The pattern you are describing here is valid but it more closely describes read only semantics vs. immutable.

I honestly don't understand why you say that.

jaredpar commented 9 years ago

@sharwell

Essentially the problem of having easy ways to create new instances of immutable values with different values for the fields?

This is a difficult nut to crack because of user defined constructors. They provide the guarantee of code that will execute for every single instance of a type. It's extremely useful for establishing invariants (the ImmutableArray<string> values will be non-empty). But this makes it really hard to provide a mechanism where the compiler generates helpers to change field values.

I think it would be more feasible with features like records though or possibly primary constructors. It's not a slam dunk but those features could be made to work with generated helpers.

jaredpar commented 9 years ago

@jods4

Immutable are safe as long as you don't introduce a mutate block.

And this is a design I simply don't agree with. The definition of immutable should need no qualifier, it is implicitly safe and requires no external verification or additional thought. I have a value and it won't change. Period.

C# provides you memory safety, but you can shoot yourself in the foot inside an unsafe block.

The unsafe keyword can do pretty much anything in C#. It can violate memory safety, type safety, mutate a string's contents, etc. It exists to facilitate low-level code and places with high levels of interop. It should be treated as the dangerous item that it is, not as an excuse to reduce the safety in other features.

If you want immutable to be successful I think that you need to come up with a good solution for the construction problem

I agree that construction is a problem but I think it can be solved with existing patterns that don't reduce the deep guarantee provided by the current model. Essentially have simple constructors which mirror the field layout of the type. I've seen this approach successfully used on a very large code base with a high number of immutable types.
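As an illustration of the "constructor mirrors the field layout" pattern, here is a minimal sketch in today's C#. The With* helpers are an assumption added for the example (they are a common companion to this pattern, not something specified in this thread), and the Person type is the one from the original proposal with DateTime used throughout:

```csharp
using System;

public sealed class Person
{
    // The constructor takes one parameter per field, in declaration order.
    public Person(string firstName, string lastName, DateTime birthDay)
    {
        FirstName = firstName;
        LastName = lastName;
        BirthDay = birthDay;
    }

    public string FirstName { get; }
    public string LastName { get; }
    public DateTime BirthDay { get; }

    // Each helper returns a new instance; the original is never modified.
    public Person WithFirstName(string firstName) => new Person(firstName, LastName, BirthDay);
    public Person WithLastName(string lastName) => new Person(FirstName, lastName, BirthDay);
}

public static class Program
{
    public static void Main()
    {
        var before = new Person("Grace", "Murray", new DateTime(1906, 12, 9));
        var after = before.WithLastName("Hopper");

        Console.WriteLine(before.LastName); // prints "Murray"
        Console.WriteLine(after.LastName);  // prints "Hopper"
    }
}
```

Because every change goes through the one constructor, any invariants it establishes still hold for every instance, at the cost of one helper per field.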

I honestly don't understand why you say that.

The design you are proposing essentially partitions object holders into two categories:

This pattern is much closer to how read only is used in the language and the BCL. For example:

jods4 commented 9 years ago

Essentially have simple constructors which mirror the field layout of the type. I've seen this approach successfully used on a very large code base with a high number of immutable types.

I can come up with tons of examples where I've used 'immutable-like' types in the past that won't fit in:

Not saying my solution was ideal and we can try to come up with something different. But I think we need something better than a ctor with all properties as parameters. If this goes into the language it needs to be a good solution; I'd hate to keep carefully using my own "immutable" classes because the safer built-in immutables are not easy enough to use.

Here's another idea I had... it's more complicated to implement and has one case that it doesn't handle: you proposed to introduce 'move' semantics into the language. Let's imagine that you can create a new mutable 'immutable' class (more or less as I described above), but that this mutable variable has a single ownership. Any copy has to be with move semantics. Calls to other methods would 'lend' the variable (à la Rust) but they wouldn't be able to store or capture it anywhere. On top of that, all its Reference fields have the same 'single ownership + move' semantics (transitively). Once you are satisfied with your object, you do a special last 'move' that makes it immutable. Now there is no more single ownership, but the new variable is of immutable type. As I said, much more complicated, but 100% safe and allows any mutation until 'publication' (that looks like a compiler-enforced Freezable). Only limitation that I see: you can only reference an immutable instance once in an immutable graph (because of the single ownership rule).

The design you are proposing essentially partitions object holders into two categories:

OK now I understand what you meant with readonly. My vision was that mutate would only be used for complex construction so I didn't see it in that "dual" way. Of course (like unsafe), devs could indeed do stupid things and use it where they shouldn't, which would lose all the safety they might otherwise have. :(

As a closing thought, I would like to point out that the unsafe immutable class proposal will allow just the same dual world. Imagine that I want a very efficient ImmutableMatrix and that I'm thinking of simply taking an existing array in my ctor to avoid any copy cost. As I understand it the unsafe immutable class will allow me to do that. And I will then have those parts of the code that may still mutate the original array and those that have my ImmutableMatrix.

MgSam commented 9 years ago

I see a few problems:

public class Foo //This guy can never use the new immutable keyword
{
    private Data _data;
    private Foo() { }
    public static async Task<Foo> Create(String bar) 
    {
         var foo = new Foo();
         foo._data = await someLongRunningOperation(bar);
         return foo;
    }
}

Clockwork-Muse commented 9 years ago

@MgSam - Is there something I'm unaware of about async, etc, or would passing the result into a single-arg constructor work? Would the compiler do dangerous things if I tried doing so?
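A minimal sketch of that suggestion, assuming the awaited value can simply be handed to a private constructor: the `await` happens in the static factory, so the field can stay `readonly`. `Data`, `Payload`, `CreateAsync` and `LoadDataAsync` are hypothetical stand-ins for the types and the long-running operation in the example above.

```csharp
using System.Threading.Tasks;

// Hypothetical payload type standing in for "Data" in the example above.
public sealed class Data
{
    public Data(string value) { Value = value; }
    public string Value { get; }
}

public sealed class Foo
{
    private readonly Data _data;

    // Single-arg constructor: the only place the field is ever assigned.
    private Foo(Data data) { _data = data; }

    public Data Payload => _data;

    public static async Task<Foo> CreateAsync(string bar)
    {
        // Do the asynchronous work first, then construct the finished object.
        Data data = await LoadDataAsync(bar);
        return new Foo(data);
    }

    // Stand-in for the long-running operation (an assumption for this sketch).
    private static async Task<Data> LoadDataAsync(string bar)
    {
        await Task.Yield();
        return new Data(bar.ToUpperInvariant());
    }
}
```

With more fields the constructor grows accordingly, which is the scaling concern raised in the replies.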

jods4 commented 9 years ago

@MgSam

I find the "cannot be mutated by anything external to the class" definition of immutability to be much more useful in the code I write everyday.

One important use case for immutability is writing concurrent code. If the internal state of an "immutable" class changes then all thread-safety guarantees are lost, even if the changes are not observable from outside the class (e.g. self-optimizing search trees).

@Clockwork-Muse A single arg that takes _data? That's OK but you'll soon have tons of args because real-world classes have a lot more than a single field. :(

ashmind commented 9 years ago

@sharwell

The fun part is generic type constraints.

Why can't you just apply the attribute to the generic parameter itself? That's what I did with [ReadOnly] in https://github.com/ashmind/AgentHeisenbug.

khellang commented 9 years ago

The string interpolation in the examples isn't valid syntax anymore.

public string FullName => "\{FirstName} \{LastName}";

should be changed to

public string FullName => $"{FirstName} {LastName}";

:smile:

sharwell commented 9 years ago

@ashmind In other words, a generic type Foo<T> marked [Immutable] could only include a field with type T if it were declared like this:

[Immutable]
class Foo<[Immutable] T1, T2>
{
    // allowed:
    private readonly T1 _value1;

    // compile-time error (field of immutable type is not readonly):
    private T1 _value2;

    // compile-time error (generic type parameter T2 is not immutable):
    private T2 _value3;
}

sharwell commented 9 years ago

Also, I would like to relax one of my previous rules:

A private field of an immutable type does not have to be marked readonly. However, the locations where the field can be assigned are restricted by the compile-time analysis of immutable types. In particular:

  1. A field of an immutable type can include an initializer.
  2. A field of an immutable type can be assigned in the constructor.
  3. A field of an immutable type can be assigned prior to the point where the instance is "exposed". In the initial implementation this would likely have the following form:

    TypeName value = new TypeName(...); // or (TypeName)MemberwiseClone()
    value.field = ...; // allowed
    value.field2 = ...; // allowed
    value.Method();
    value.field3 = ...; // not allowed (value could have been exposed)
    return value;
    • Calling an instance method on the newly-constructed instance is considered "exposing" the instance, even if that method is marked [Pure].
    • With the exception of passing a value type by value, using the instance as an argument in a call is considered "exposing" the instance.
    • Returning the instance is considered exposing it.
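As a sketch of what the relaxed rule would permit, here is the "assign before exposure" pattern written in current C#, where nothing enforces it: the compiler today does not check that the copy is untouched after it escapes, so this relies on discipline. `Point` and `WithX` are hypothetical names.

```csharp
public sealed class Point
{
    // Not readonly, so a fresh copy can be adjusted before it is exposed.
    private int _x;
    private int _y;

    public Point(int x, int y)
    {
        _x = x;
        _y = y;
    }

    public int X => _x;
    public int Y => _y;

    public Point WithX(int x)
    {
        // The clone is known only to this method at this point.
        var copy = (Point)MemberwiseClone();
        copy._x = x;  // assignment happens before the instance is exposed
        return copy;  // returning exposes it; no assignments follow
    }
}
```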
jods4 commented 9 years ago

Maybe I'll state the obvious here, but I think that the problem of initializing an immutable graph has some overlap with the problem of initializing non-nullable references, which is another proposal under consideration. For instance, how could I create a graph of non-nullable references? Figuring out a unified solution to both situations would be interesting.

biqas commented 9 years ago

@pharring as stated in the #1125 issue, an immutable modifier is required. Readonly fields do not solve it.

Example:

public class B
{
    public int Value;
}

public class A
{
    private readonly B b;

    public B B { get { return this.b; } }

    public A()
    {
        this.b = new B();
    }
}

you can do the following:

var a = new A();
a.B.Value = 4;

But I was proposing that this should not be allowed. To detect such modifications, an extra immutable modifier is required.

public class A
{
    private immutable B b;

    public B B { get { return this.b; } }

    public A()
    {
        this.b = new B();
    }
}

var a = new A();
a.B.Value = 4; // compile error!!!

So it is not so important whether a type itself is immutable; what matters more is the context in which it is used.

cdauphinee commented 9 years ago

I'm just curious, is there a reason you're proposing immutable classes, rather than a more general solution to C#'s entire category of problems involving immutability? For example, something akin to the C++ const qualifier.

vladd commented 9 years ago

@cdauphinee C++'s const qualifier is indeed rather weak. It sort of guarantees a read-only view of an object (if everyone agrees not to use const_cast), but doesn't guarantee immutability (anyone may hold a mutable view as well, and may pull the rug from under your feet and change the object while you think it's still the same). Immutable, on the other hand, implies reliable, true, genuine, compiler-guaranteed unchangeable objects.
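The same distinction can be seen in existing .NET collections. The sketch below (the `ViewVersusImmutable` class is a hypothetical name) contrasts a read-only view, `IReadOnlyList<int>` over a `List<int>`, with an immutable snapshot from `System.Collections.Immutable`: the view observes later changes to the underlying list, the snapshot does not.

```csharp
using System.Collections.Generic;
using System.Collections.Immutable;

public static class ViewVersusImmutable
{
    public static (int viewCount, int snapshotCount) Demo()
    {
        var list = new List<int> { 1, 2, 3 };

        // Read-only view: no mutating members, but it is still the same list.
        IReadOnlyList<int> view = list;

        // Immutable snapshot: an independent collection that can never change.
        ImmutableList<int> snapshot = list.ToImmutableList();

        // Someone holding the mutable reference changes the list.
        list.Add(4);

        return (view.Count, snapshot.Count);
    }
}
```

After the `Add`, the view reports four elements while the snapshot still reports three.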

biqas commented 9 years ago

Here is what I had in mind when I thought about this feature.

  1. Language
  2. Threading
  3. Garbage Collection (GC)
  4. Security
  5. Predictions

  1. Language: This would require at least one additional keyword (immutable, immute). Some analysis is needed to verify the correct use of immutability; the "Chess" project http://research.microsoft.com/en-us/projects/chess/ could be a good starting point. New constraints could also be introduced. I think there is an ongoing discussion about constraints here, which would need a CLR extension, but that is not needed for a first version. If we talk about constraints, though, talking about modifier(s) may not be applicable. Then there is the inheritance model for immutability; I have not yet thought much about how it could work. Closures over immutable type references could be optimized.
  2. Threading: Guaranteeing thread safety when using immutable types would depend on an analysis that can then establish correctness. Parallel execution could be optimized, for example for local variables. Locks could be optimized if all type references used in the lock scope can be converted to immutable type references.
  3. GC: Because memory consumption is predictable, memory allocation and reuse could be optimized.
  4. Security: I have not thought about whether there could be security issues if the feature is deeply entrenched in the system.
  5. Predictions: Because immutable objects cannot change, behaviour could also be made repeatable, in which case results could be cached for repeated access or invocation.

With that in mind, I would like to propose the following immutability concept:

  1. By definition (partially/complete) | "immutable"
  2. By scope (explicit/implicit) | "immute" or "immutable"
  3. By definition: This means placing a modifier (immutable) on various nodes (class, struct, interface, field, property, event, indexer, method, namespace?) at design time. If a node carries such a modifier, its dependent nodes should perhaps inherit it as well.
  4. By scope: This means being able to convert a non-immutable type reference to an immutable one (immute). To achieve this, the visibility and type-conversion aspects need to be examined in more depth.

These are rough ideas. What is currently missing is a more detailed look at how data would flow once immutability is introduced; some parts may be redundant and some may be missing.

#01
// Class with immutable modifier.
// To have a modifier here has a lot of implications.
public immutable class A
{
    #02
    // Because type Result is not immutable but should be used as one here,
    // some methods and properties must be inaccessible to guarantee immutability.
    // Maybe immutable modifier is redundant.
    // Maybe private set accessor is also redundant.
    public virtual immutable Result Result { get; private set; }

    #03
    public A(int @value = 0)
    {
        #04
        // Assignment like read-only fields.
        this.Result = new Result(@value);
    }

    #05
    // This method would not make a lot of sense,
    // because it tries to write to the instance's backing store, but type A is immutable!
    public void Calculate1()
    {
        #06
        // Not valid, because immutable.
        this.Result = new Result(1);

        #07
        // Not valid, because entry reference is marked as immutable.
        this.Result.Value = 2;
    }

    #08
    // The immutable modifier may not be needed here.
    public A Calculate2()
    {
        #09
        return new A(42);
    }

    #10
    // This raises a lot of questions: what should happen if you immute type B?!
    // Maybe this should be restricted.
    public B Calculate3()
    {
        #11
        return new B(42);
    }

    #12
    public immutable Result GetResult1()
    {
        #13
        return this.Result;
    }

    #14
    // Because the return type is not marked as immutable,
    // the new reference loses its immutability.
    public Result GetResult2()
    {
        #15
        return this.Result;
    }

    #16
    public immutable Result GetResult3()
    {
        #17
        // Not sure if implicitly an immutable reference
        // is created or should lead to an error.
        return new Result(1);

        // immute return new Result(1);
    }
}

#01
public class B : A
{
    #02
    // Not thought a lot about that.
    public override immutable Result Result { get; set; }

    #02
    public B(int @value = 0) : base(@value) {}

    #03
    public void Reset()
    {
        #04
        // Not thought a lot about that.
        this.Result = new Result(0);
    }
}

#01
public class Result
{
    #02
    private int _value;

    #03
    public int Value
    {
        #04
        get { return this._value; }
        #05
        set { this._value = value; }
    }

    #06
    public Result(int @value)
    {
        #07
        this.Value = @value;
    }

    #08
    public void Reset()
    {
        #09
        this.Value = 0;
    }
}

#01
// Not thought a lot about that.
public immutable class Result2
    #02
    : Result
{
    #03
    public Result2(int @value)
        #04
        : base(@value)
    {
        #05
        this.Value = value;
    }
}

// Not thought a lot about that.
public class Generic<T> where T: immutable {}

#01
var a = new A();

#02
// Not thought a lot about that.
// Give immutability to reference if possible.
var b = immute a.Calculate3();

GSPP commented 9 years ago

If immutable types are defined to have no identity (meaning that they cannot be reference compared and hashing is based on value) then the CLR is free to allocate them like structs or like classes, however it sees fit. The difference would not be detectable. It can even use a mixed model. This would be subject to an optimization policy.

alrz commented 8 years ago

How about an immutable list just like FSharpList<>? Then it can support the head::tail pattern, head::tail (cons operator) and list@list (list concatenation operator).

Clockwork-Muse commented 8 years ago

@alrz - most of this discussion has been about the underlying mechanisms. You could write a readonly-based implementation today that would need minimal conversion work later (I would be completely unsurprised if one already exists, though).

jviau commented 8 years ago

If C# implements immutable types, I would like to see language-level support for building and transforming immutables (returning a new object with the desired changes). Currently this is achieved with either the builder pattern or the WithXXX pattern. Each has its pros and cons.

Builder

Pros

  1. Can add new properties and methods without breaking code compiled against older binaries.
  2. Allows you to perform batch changes without creating several immutable copies.

Cons

  1. Still creates one (or more) extra objects, which can have a noticeable impact on the GC if you are modifying a large structure, like an immutable tree.
  2. Potentially verbose depending on the builder implementation (mutate via chainable methods vs setting properties)
    • Create builder, change each property one by one, convert to immutable.
  3. Requires maintaining another class.
    • For best builder experience you also want to ensure nested objects have builders as well.
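For reference, the builder pattern already ships in `System.Collections.Immutable`. The sketch below uses `ImmutableList<T>.Builder` to make a batch of changes and then freeze the result; `BuilderDemo` and `BuildSquares` are hypothetical names for this example.

```csharp
using System.Collections.Immutable;

public static class BuilderDemo
{
    public static ImmutableList<int> BuildSquares(int count)
    {
        // The builder is mutable, so the batch of Adds creates no intermediate lists.
        ImmutableList<int>.Builder builder = ImmutableList.CreateBuilder<int>();
        for (int i = 1; i <= count; i++)
        {
            builder.Add(i * i);
        }

        // Convert to the immutable form once all changes are made.
        return builder.ToImmutable();
    }
}
```

This shows both sides of the list above: batch changes are cheap, but the builder is an extra object and an extra type to maintain.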

WithXXX

Pros

  1. Allows changing of a property in one call (which returns a new object).
  2. Default properties can be used to create a single With method that allows changing of all properties in a single efficient call.

Cons

  1. A lot of methods to maintain
  2. No batch property changes when using WithXXX
    • Not the case with default properties
  3. Adding properties and then changing the signature of the With method will break code compiled against previous versions
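A minimal sketch of the single With method, assuming "default properties" here means optional parameters with default values. The `Foo` type mirrors the example used later in this comment. One limitation of this particular encoding is that `null` means "keep the current value", so a property cannot be set to `null` through `With`.

```csharp
public sealed class Foo
{
    public Foo(string bar, int bazz)
    {
        Bar = bar;
        Bazz = bazz;
    }

    public string Bar { get; }
    public int Bazz { get; }

    // Omitted arguments fall back to the current values,
    // so any subset of properties can be changed in one call.
    public Foo With(string bar = null, int? bazz = null)
        => new Foo(bar ?? Bar, bazz ?? Bazz);
}
```

Usage would look like `foo.With(bar: "changed")` or `foo.With(bar: "changed", bazz: 11)`. Adding a property later changes the signature of `With`, which is the binary-compatibility con listed above.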

Language Support

As you can see, both of these methods work, but both come with drawbacks. My suggestion is to solve this with something similar to C# object initializers.

immutable class Foo
{
    string Bar;
    int Bazz;
}

// Initialize Foo
var foo = new Foo { Bar = "bar", Bazz = 10 }; // properties of Foo cannot be changed after this point

// Transform Foo, instantiating a new version of Foo with whatever fields 
// specified and then the rest of the values taken from foo
var foo2 = foo { Bar = "changed" };

foo.Bar == "bar"; // true 
foo2.Bar == "changed"; // true
foo.Bazz == 10; //true
foo2.Bazz == 10; // true

This provides us with the best of Builder and WithXXX.
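For comparison, later versions of the language (C# 9 and up) offer something close to this through records and `with` expressions. The sketch below is not the syntax proposed above; it uses a positional record, and `RecordDemo` is a hypothetical wrapper so the example is self-contained.

```csharp
// Positional record: Bar and Bazz are init-only properties set by the constructor.
public record Foo(string Bar, int Bazz);

public static class RecordDemo
{
    public static (Foo original, Foo changed) Demo()
    {
        var foo = new Foo("bar", 10);

        // Copies foo and replaces only the listed properties.
        var foo2 = foo with { Bar = "changed" };

        return (foo, foo2);
    }
}
```

Here `foo` keeps `Bar == "bar"`, while `foo2` has `Bar == "changed"` and the same `Bazz` as `foo`.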

gafter commented 8 years ago

@jviau See #5172.

McZosch commented 8 years ago

The problem with writing immutable types is having a redundant private field list and constructor signature, both required to be synced with the immutable property list of a class. This is a pain in the ass.

What do we really need to overcome this?

The behavior we require is to have certain properties and members be set only once, at object creation. For such cases, there is currently no keyword / attribute. My proposal is to have a keyword or attribute "initial", which declares code that can only be consumed during object creation.

public class Person
{    
    public string FirstName { get; initial set; }
}

The compiler would allow the following

var person = new Person { .FirstName = "James" };

but disallow any following access like

person.FirstName = "Jim";

This would also work for mixed scenarios and would allow methods or getters to be available only at object creation time (initializers!). Finally, it would make designing immutable types only a minor variation of the mutable variant.

public class Person
{    
    public string FirstName { get; set; }
}

What would additionally be really nice is to have better modification support.

var person1 = new Person { .FirstName = "James" };
var person2 = modify person1 { .FirstName = "James" };

This would kill the necessity to have the same immutable property pain at the consumer side. Maybe this could be a more concise way to clone ordinary objects, too!?
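Later C# versions (9 and up) provide a similar capability through init-only setters, spelled `init` rather than "initial". A minimal sketch follows; the `LastName` property is added here only for illustration.

```csharp
public sealed class Person
{
    // "init" accessors can be called only during object creation
    // (object initializers or constructors).
    public string FirstName { get; init; }
    public string LastName { get; init; }
}

public static class PersonDemo
{
    public static Person Create()
    {
        var person = new Person { FirstName = "James", LastName = "Smith" };

        // person.FirstName = "Jim";  // would not compile: init-only property

        return person;
    }
}
```

The object initializer uses the existing `Name = value` form rather than the `.Name = value` form sketched above.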

leppie commented 8 years ago

I would like to see this extended to generics too, for example:

void Foo<T>(T t) where T : immutable { ... }

Alternatively, go the C++ const way like

void Foo<T>(immutable T t)  { ... }

PS: Sorry if this has been mentioned.

AlexRadch commented 8 years ago

I suggest using val for immutable variables.

val x = 10;
val string s = "some string";

int SomeFunction(val List<int> source) // cannot change list
{
    val c = source.Count(); // correct
    source.Add(5); // generates a compiler error
    return c;
}

class List<T>
{
   immutable int Count(); // can not change self
   void Add(T item); // can change self
}

(var int X,  val int Y) tuple1 = (10, 20); // Tuple with mutable X and immutable Y
var tuple2 = (var X:10, val Y:20); // Same tuple with mutable X and immutable Y

alrz commented 8 years ago

@AlexRadch check out #7626 for immutable variables.

jpierson commented 7 years ago

I think there is overlap with the discussion of a pure keyword. In that discussion I mentioned the idea of being able to hoist the pure keyword to the class level in order to essentially allow constraining a class implementation to only pure members. In this approach a pure class would be equivalent to an immutable class because both imply that there is no mutation that can be done by its members.

The concept of deep immutability is more important than naming but I do like the way that the word pure more intuitively applies to both the class and class member levels.

Additionally I proposed the idea of an isolated keyword which would be less strict than pure in that it would allow mutation only of state locally owned by that class (ex. fields).

shaggygi commented 7 years ago

Any updates on this proposal? I recall immutable related features coming (or at least mentioned in Roslyn demonstrations) to C#. Is this topic something being looked at to make it in C# 8? Just curious.

shaggygi commented 7 years ago

@stephentoub @MadsTorgersen I'm assuming this would be a topic that would move to the new C# Language repo. If so, what would be the steps to take? Thx

dmitriyse commented 7 years ago

An immutability annotation can also be applied to method arguments. For example, if the argument is a R/W collection, then

public void MyFunc(immutable ICollection<object> t)
{
    t.Add(new object()); //Compile error/warning.
}

The immutable keyword would inform the developer and the compiler that the R/W collection will never be changed, so it's a sort of compile-time static code analysis. It is the same kind of proposal as non-nullable reference types. See also https://github.com/dotnet/csharplang/issues/219#issuecomment-283783552

soroshsabz commented 7 years ago

ITNOA

Why not use public readonly class C instead of public immutable class C? This solution does not need to add a new keyword such as immutable to the language.

One other point: I think it is much more effective to view each class as two-sided (a mutable side and an immutable side). This can easily be achieved by adding the ability to mark methods, properties and operators as guaranteeing that the state of the class does not change (similar to the right-hand, trailing const in C++). With this feature we would not have to write two classes (one mutable and one immutable) for each entity.

However, if we want to write a class that has only an immutable side, a readonly class feature is useful to avoid writing the right const (or a similar keyword) on each method and ...
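For structs, later C# versions offer something along these lines: `readonly struct` (C# 7.2) and `readonly` on individual members (C# 8). Classes have no equivalent. The sketch below shows the per-member form on a hypothetical `Counter` struct with one non-mutating and one mutating member.

```csharp
public struct Counter
{
    private int _count;

    // "readonly" member: the compiler rejects any write to the struct's state here.
    public readonly int Current => _count;

    // Not marked readonly: allowed to mutate the struct's state.
    public void Increment() => _count++;
}
```

Marking the whole type as `readonly struct` instead would make every member non-mutating and require all fields to be readonly.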

soroshsabz commented 7 years ago

I think some of this discussion continues in #115.