dotnet / roslyn


C# Design Notes for Mar 24, 2015 #1898

Closed MadsTorgersen closed 8 years ago

MadsTorgersen commented 9 years ago

C# Design Meeting Notes for Mar 24, 2015

Quote of the Day: If we have slicing we also need dicing!

Agenda

In this meeting we went through a number of the performance and reliability features we have discussed, to get a better reading on which ones have legs. They end up falling roughly into three categories (green, yellow and red), as follows:

  1. ref returns and locals <green> (#118)
  2. readonly locals and parameters <green> (#115)
  3. Method contracts <green> (#119)
  4. Does not return <green> (#1226)
  5. Slicing <green> (#120)
  6. Lambda capture lists <yellow - maybe attributes on lambdas> (#117)
  7. Immutable types <yellow in current form, but warrants more discussion> (#159)
  8. Destructible types <yellow - fixing deterministic disposal is interesting> (#161)
  9. Move <red> (#160)
  10. Exception contracts <red>
  11. Static delegates <red>
  12. Safe fixed-size buffers in structs <red> (#126)

Some of these were discussed again (see below); for others we just reiterated our position.

1. Ref returns and locals

At the implementation level these would require a verifier relaxation, which would cause problems when down targeting in sandboxing scenarios. This may be fine.

At the language level, ref returns would have to be allowed on properties and indexers only if they do not have a setter. Setters and ref would be two alternative ways of allowing assignment through properties and indexers. For databinding scenarios we would need to check whether reflection would facilitate such assignment through a ref.

A danger of ref returns in public APIs: say you return a ref into the underlying array of e.g. a list, and the list is grown by switching out the underlying array. Now someone can get a ref, modify the collection, follow the ref and get to the wrong place. So maybe ref returns are not a good thing on public API boundary.
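The stale-ref hazard described above can be sketched with the ref return syntax that later shipped in C# 7.0. This is only an illustration under stated assumptions: `RefList`, `ItemRef` and `StaleRefDemo` are hypothetical names, not part of any real collection API.

```csharp
using System;

// Hypothetical growable list that hands out refs into its backing array.
public sealed class RefList
{
    private int[] _items = new int[2];
    private int _count;

    public void Add(int value)
    {
        // Growing swaps in a brand-new backing array; the old one is abandoned.
        if (_count == _items.Length)
            Array.Resize(ref _items, _items.Length * 2);
        _items[_count++] = value;
    }

    public int this[int index] => _items[index];

    // Ref return straight into the current backing array.
    public ref int ItemRef(int index) => ref _items[index];
}

public static class StaleRefDemo
{
    public static int Run()
    {
        var list = new RefList();
        list.Add(1);
        list.Add(2);

        ref int first = ref list.ItemRef(0); // points into the original array
        list.Add(3);                         // capacity exceeded: array replaced
        first = 42;                          // writes into the abandoned array

        return list[0];                      // the list itself never sees the 42
    }
}
```

Here `StaleRefDemo.Run()` returns 1 rather than 42, because the write through `first` lands in the array the list no longer uses.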

There's complexity around "safe to return": You should only return refs that you received as parameters, or got from the heap. This leads to complexity around allowing reassignment of ref locals: how do you track whether the ref they are pointing to is "safe to return" or not? We'd have to either

There's complexity in how refs relate to readonly. You either can't take a ref to a readonly, or you need to be able to express a readonly ref through which assignment is illegal. The latter would need explicit representation in metadata, and ideally the verifier would enforce the difference.
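For illustration, the second option (a readonly ref through which assignment is illegal) resembles what later shipped as `ref readonly` in C# 7.2. The sketch below assumes that later syntax; `ReadonlyRefDemo`, `Table` and `At` are hypothetical names.

```csharp
public static class ReadonlyRefDemo
{
    private static readonly int[] Table = { 10, 20, 30 };

    // Callers get a reference to the element, but cannot assign through it.
    public static ref readonly int At(int index) => ref Table[index];

    public static int Run()
    {
        ref readonly int second = ref At(1);
        // second = 99;   // would not compile: the ref is readonly
        return second;    // reading through the ref is fine
    }
}
```

`ReadonlyRefDemo.Run()` reads the element without copying it first, and the compiler rejects the commented-out assignment.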

This can't very well be a C# only feature, at least if it shows up in public APIs. VB and F# would need to at least know about it.

This feature would be a decent performance win for structs, but there aren't a lot of structs in the .NET ecosystem today. This is a chicken-and-egg thing: because structs need to be copied, they are often too expensive to use. So this feature could lower the cost of using structs, making them more attractive for their other benefits.

Even so, we are still a bit concerned that the scenario is somewhat narrow for the complexity the feature adds. The proof will have to be in the use cases, and we're not entirely convinced about those. It would be wonderful to hear more from the community. This is also a great candidate for a prototype implementation, to allow folks to experiment with usability and performance.

2. Readonly locals and parameters

At the core this is a nice and useful feature. The only beef we have with it is that you sort of want to use readonly to keep your code safe, and you sort of don't because you're cluttering your code. The readonly keyword simply feels a bit too long, and it would be nice to have abbreviations at least in some places.

For instance readonly var could be abbreviated to val or let. Probably val reads better than let in many places, e.g. declaration expressions. We could also allow val as an abbreviation for readonly even in non-var situations.

In Swift they use let but it reads strange in some contexts. In Swift it's optional in parameter positions, which helps, but we couldn't have that for back compat reasons.

This is promising and we want to keep looking at it.

4. Does Not Return

It would be useful to be able to indicate that a method will never return successfully. It can throw or loop.

The proposal is to do it as an attribute, but there would be more value if it was part of the type system. Essentially it replaces the return type, since nothing of that type is ever returned. We could call it never. The never type converts to pretty much anything.

This would allow us to add throw expressions in the language - their type would be never.

Having it in the type system allows e.g. returning Task<never>, so that you can indicate an async method that will only ever produce an exception, if anything.

Because of the Task example you do want to allow never in generics, but that means you could have generic types that unwittingly operate on never values, which is deeply strange. This needs to be thought about more.

If through nasty tricks you get to a point in the code that according to never types should not be reachable, the code should probably throw.
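As a rough illustration of "throw as an expression": C# 7.0 later shipped throw expressions without introducing a never type. The sketch below uses that later syntax and approximates a never-returning helper with a generic return type. `ThrowDemo`, `Fail`, `NameOrThrow` and `ParsePositive` are hypothetical names.

```csharp
using System;

public static class ThrowDemo
{
    // Never returns normally. Without a never type, the generic T stands in
    // for "converts to pretty much anything" at the call site.
    public static T Fail<T>(string message) =>
        throw new InvalidOperationException(message);

    // throw used directly as an expression on the right of ??.
    public static string NameOrThrow(string name) =>
        name ?? throw new ArgumentNullException(nameof(name));

    // The helper slots into an int-typed conditional expression.
    public static int ParsePositive(int value) =>
        value > 0 ? value : Fail<int>("value must be positive");
}
```

With a real never type, `Fail` would not need the type parameter, and the compiler could treat code after a call to it as unreachable.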

A common usage would be helper methods to throw exceptions. But throw as an expression is the most useful thing out of this.

6. Attributes on lambdas

Why? Guiding an analyzer, e.g. to prevent variable capture. Syntactically it might collide with XML literals in VB.

We could probably hack it in. The attribute would be emitted onto the generated method.

paulomorgado commented 9 years ago

Have you given any consideration to nested methods?

gafter commented 9 years ago

@paulomorgado Yes, but that hasn't been assigned a color yet.

MgSam commented 9 years ago

A question- at what point in the process do you start nailing down the designs of some of the features proposed thus far? It seems like the design meetings currently jump from one feature to the next- it seems to me that trying to design features like that runs the risk of circular discussions as people forget what you discussed the last time the feature was brought up several months ago and forward progress becomes difficult to make.

I'd really like to see you guys nail down the records, tuples, and pattern matching features (perhaps another prototype?) and then move on to the next features once those have final or near-final designs. That would also allow you to consider having more incremental releases of the language.

Some other thoughts on the design notes:

-1. Very much opposed to this feature. Adds a lot of complexity to the language for questionable benefit. If you need to pass references around like this for performance reasons, you should probably be using C++ anyway as it gives you all the ultra-fine grained control over memory you could ever want. We shouldn't try to make C# become C++ Light.

-2. Could you elaborate on where let "reads strange"? I don't like val at all, it is way too similar to var and thus easy to miss the difference when reading code. Also, let should probably start with bonus points because a) it's already used in LINQ and b) it's used in Swift and thus makes the learning curve for multi-language developers smaller.

-4. This is a good idea and I hope you guys are able to work out all the corner cases.

-6. As I said in the thread for this feature, I don't think it's particularly valuable as analyzers can already provide warnings around this. I'm fine with the attribute approach if there's really that much demand for it, but I suspect it would be very rarely used and thus probably not worth the design effort, given other useful features on the table for C# 7.0.

paulomorgado commented 9 years ago

@MgSam, regarding 4. what do you find not valuable? Methods having a return type of never or the fact that a never type exists and the throw statement will become an expression?

Joe4evr commented 9 years ago

@paulomorgado I think his 4 is actually 6, but became a victim of markdown auto-numbering. If you re-read Mads' post, the "Does not return" section doesn't talk about attributes and analyzers at all.

paulomorgado commented 9 years ago

I found it strange, but the first sentence matched it. :smile:

MgSam commented 9 years ago

@Joe4evr Crap, yes- this is the case. It renumbered my items after submitting. Fixed now.

HaloFour commented 9 years ago

@MgSam The following syntax works to prevent Markdown from renumbering lists: 123\.

  1. I'm also curious about how let reads strange. I know that grammatically it does read differently from var or val but I think that's excusable if you read it in the context of declaring a constant. I agree with MgSam that my preference for let comes from both being easily visually distinguishable from var as well as already existing in LINQ to perform a nearly identical purpose.

  2. Has there been any further discussion as to how method contracts will implement failure? The thread for dotnet/roslyn#119 got pretty extensive mostly over the concept of how the contract should fail. My opinion remains that if fail-fast is the (only) behavior and that if programmers will have to write canonical argument validation anyway that contracts will find very limited use, just like the Code Contract project it intends to replace. Ultimately I think it should be configurable by whatever is hosting the assembly.

  3. Considering the limited use case for this and how it would be an analyzer-only situation I don't see why a language construct would be that much more useful than just having a common attribute in the BCL. So you save a few keystrokes a handful of times, is that worth the expense of modifying the language?

  4. Any motion to maybe have the CLR support this? The feature seems like it would be really limited without that support since you can't slice arrays to arrays or strings to strings without allocation/copying and having to use an interface as an intermediary makes it unusable for most of the APIs that currently exist.

  5. Not sure what the capture list would have to do with attributes. Wouldn't it only affect how the compiler generates the closure class?

  6. Are "exception contracts" different from method contracts?

MikePopoloski commented 9 years ago

For 1), yes please! As a game developer, structs are my lifeblood. Any changes that make them easier and more efficient to use will be greatly appreciated. I often find myself using arrays over Lists of structs due to the overhead of copying them in and out of the indexer. Often I even allocate blocks of unmanaged memory via AllocHGlobal and carve out pieces of it for my value types. Being able to incorporate ref local / return support for that scenario would be great as well.

(This would be even nicer to work with if some of the restrictions around pointers to generic types were relaxed, but that's a separate feature to discuss.)

@MgSam, please don't tell people writing performance-sensitive code to just go back to C++. There are a million reasons why C# is a nicer language to work in, and there's no reason we can't be just as fast as C++, especially with .NET Native on the horizon.

dsaf commented 9 years ago

@MgSam

-2. Could you elaborate on where let "reads strange"? I don't like val at all, it is way too similar to var and thus easy to miss the difference when reading code. Also, let should probably start with bonus points because a) it's already used in LINQ and b) it's used in Swift and thus makes the learning curve for multi-language developers smaller.

const - a constant
var - a variable
val - a value
let - anything?

a) syntax-based LINQ is more like an integrated DSL, does it really matter for the rest of the language?

b) Swift is not relevant (yet); it's not even mainstream in Apple development, let alone on other platforms. A much better example, which proves the opposite, would be the very mainstream (granted, ill-designed) JavaScript, where let vs. var has nothing to do with mutability.

dsaf commented 9 years ago

I would go for var readonly instead of readonly var for symmetry with immutable-by-default F# that has let mutable.

MgSam commented 9 years ago

@dsaf I agree the word "let" isn't an abbreviation of the concept it represents, but so what? This is a language symmetry argument, I'm much more concerned about how the keyword will play in practice, not whether it is nice from a theory perspective.

Also, re: b), this is Apple we're talking about- once a tech falls out of favor (Objective C) there's a ticking clock until they drop support altogether. They've been pretty clear Swift will be the future for their products.

gafter commented 9 years ago

@MgSam The design group is trying to decide on a reasonable full set of things to work on before deep diving into an arbitrarily selected feature. Once we've considered the full set of things we might do, we'll narrow it down and then begin the design work in more detail. That design work might cause us to reconsider the set of features as a whole, and that's fine.

nil4 commented 9 years ago

Another vote for let over val, the latter being much too similar to var. Keeping in mind that code is read a lot more often than it's written, every such occasion is an opportunity to inadvertently miss the difference between the two (think of reading or reviewing code, diffs or pull requests outside an IDE). In addition to being used by LINQ, it is already familiar to developers using e.g. Javascript, Rust, OCaml, Racket and Clojure.

jnm2 commented 9 years ago

I think val is clever and almost too perfect as complement of var, so it is with reluctance and sadness that I would vote for let as the keyword which people both from C# and from other languages will instinctively expect for that concept. val/var may also increase errors, contrary to the pit of success you guys are shooting for.

Now if only let on a parameter didn't sound so dorky...

Przemyslaw-W commented 9 years ago

I am not a native English speaker, and let on a parameter does not sound dorky to me at all :)


HaloFour commented 9 years ago

@jnm2, @Przemyslaw-W

IIRC readonly would be used as the keyword to specify that a parameter or formally-declared parameter is read only. The let or val keyword would serve as a shorthand for implicitly-typed readonly variables, a counterpoint to var:

public int Divide(readonly int dividend, readonly int divisor) {
    let quotient = dividend / divisor;  // same as readonly int quotient
    return quotient;
}

ufcpp commented 9 years ago

I usually type only "v" and use IntelliSense to get the var keyword. Therefore, if 'val' were used for a readonly declaration, I'd find it confusing to choose between var and val depending on the situation.

vladd commented 9 years ago

As already mentioned in other discussions, var and val are too close, perhaps let is better. (And, let is already a keyword with a similar meaning.)

aluanhaddad commented 9 years ago

Regarding Item 2, I vote for val because it reads much better than let in the proposed contexts. I think that the syntax used by Swift reads very poorly especially when mutable and immutable declarations are mixed. In my experience writing Scala, I have not found the two keywords to be overly similar. In practice I think the lexical similarity is a non-issue, and the improved readability and orthogonality are desirable.

lucasmeijer commented 9 years ago

Hi @MadsTorgersen, as you mentioned it would be great to hear from the community about use cases for ref returns, here we go :)

Background

Unity (a game engine https://unity3d.com/unity) uses C# for scripting but has the majority of the engine code in C++. Our customers write almost all game code in C#. C# in games usually differs in two ways from normal C# usage:

Problem

We are currently working on making high performance computation in C# a reality. A good example is particle systems. We have a particle system with lots of features. But many of our customers want to extend it and have their own custom particle simulation code to give special effects unique to their game, like making a particle system attractor or force fields etc. The current API & usage goes like this and uses arrays:

struct Particle
{
    Vector3 position;
    float   lifetime;
    Color32 color;
    ...
}

// Allocation is controlled by user, the particle system API does not allocate
// So allocations can be done once at startup and never during gameplay.
Particle[] particleArray = new Particle[1000];
...

void Update ()
{
    int nbParticles = particleSystem.GetParticles(particleArray);
    for (int i=0;i<nbParticles;i++)
        particleArray[i].position.y += 5.0F * deltaTime;

    particleSystem.SetParticles(particleArray, nbParticles);
} 

The issue with this approach is that we copy the particle buffer (it's easy to imagine 10,000 particles -> 700 KB of data) into managed land, modify the data in the array, and then copy the particles back again. It is easy to see how this approach can be 3x slower than it has to be for simple computations, due to the extra copies.

We have experimented with our own Buffer struct. Which gives us direct access to unmanaged memory allocated by the engine.

struct Buffer<T>
{
    IntPtr buffer;
    int    stride;
    int    length;

    unsafe public T this[int x]
    {
            get ...
            set ...
    }
    public int Length
    {
            get ...
    }

    // NOTE: the Buffer struct also has some debug only mode 
    // which allows it to track when a buffer has been deallocated on the C++ side
    // and throw exceptions.
    // So the Buffer struct is in fact safe, but that part is not relevant for discussion here.
    // Also note that the buffer class itself throws exceptions when using it with any non-value types.
}

struct Particle
{
    Vector3 position;
    float   lifetime;
    Color32 color;
    ...
}

// The following code, directly writes into the particle buffer on the C++ side. No extra copies.
// Performance issues are solved...
// Additionally no managed array has to be allocated, which reduces overall memory consumption dramatically, 
// as well as generally keeping the GC heap tighter.
// Also the code is much simpler.

// This is what we want to write!
void Update ()
{
    Buffer<Particle> buffer = particleSystem.GetParticleBuffer(particleArray);
    for (int i=0;i<buffer.Length;i++)
        buffer[i].position.y += 5.0F * deltaTime; // But this line gives us a compile error in C#
}

// This is what we have to write today.
// And it is bad for many reasons, see below.
void Update ()
{
    Buffer<Particle> buffer = particleSystem.GetParticleBuffer(particleArray);
    for (int i=0;i<buffer.Length;i++)
    {
        Particle particle = buffer[i];
        particle.position.y += 5.0F * deltaTime;
        buffer[i] = particle;
    }
}

The first example doesn't compile today. This not being possible is bad for two reasons:

1) Performance. The extra copies, especially for larger structs, hurt performance. Our goal is to generate the most optimal code we can. Extra mov instructions will make it impossible for us to achieve this goal.

2) It is very inconvenient having to create temporary variables, especially since C# programmers are used to arrays, and we want the syntax to be a 1:1 mapping.

The general issue is that today, structs in C# are severely limited. Structs are also the key to high performance computation in C#, because they are the only way in which you can accurately control memory layout, which is naturally by far the key concern for high performance code.

Solution: ref returns

We believe that ref return solves our problem. https://github.com/dotnet/roslyn/issues/118
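A minimal sketch of how a ref-returning indexer would make the desired loop compile, written with the syntax that later shipped in C# 7.0. It is only an assumption-laden stand-in: `RefBuffer<T>` is a hypothetical type backed by a managed array rather than engine-owned unmanaged memory, and the `Vector3` and `Particle` definitions are trimmed-down placeholders.

```csharp
public struct Vector3 { public float x, y, z; }

public struct Particle
{
    public Vector3 position;
    public float lifetime;
}

// Hypothetical stand-in for the Buffer<T> struct described above.
public struct RefBuffer<T> where T : struct
{
    private readonly T[] _items;

    public RefBuffer(int length) { _items = new T[length]; }

    public int Length => _items.Length;

    // Returns a reference to the element itself instead of a copy.
    public ref T this[int index] => ref _items[index];
}

public static class ParticleDemo
{
    public static float Run(float deltaTime)
    {
        var buffer = new RefBuffer<Particle>(3);

        // Mutates each element in place: no temporary copy, no write-back.
        for (int i = 0; i < buffer.Length; i++)
            buffer[i].position.y += 5.0f * deltaTime;

        return buffer[2].position.y;
    }
}
```

Because the indexer returns `ref T`, `buffer[i].position.y += ...` updates the stored particle directly, which is the array-like syntax the post asks for.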

I hope that this post shows why we believe ref support in C# is an essential part to do high performance computation in C#. Without ref returns support, it will not be possible to achieve performance to match C++ for computation intensive code. The result is that we have to ask our customers to write their particle systems in C++, while we would love to ask them to write them in C# instead.

This problem is everywhere in the domain of games. This particle system is just one example.

gafter commented 8 years ago

Design notes have been archived at https://github.com/dotnet/roslyn/blob/future/docs/designNotes/2015-03-24%20C%23%20Design%20Meeting.md but discussion can continue here.