dotnet / csharplang

The official repo for the design of the C# programming language

Proposal: allow explicitly declaring blittable structs as unmanaged #1495

Open Ultrahead opened 6 years ago

Ultrahead commented 6 years ago

This is a cosmetic proposal, since the C# compiler already enforces blittable constraints and reports errors where applicable (for example, with stackalloc).

The idea is to allow something like:

public unmanaged struct MyBlittable
{
   ... no reference-type fields allowed ...
}

By explicitly declaring a struct as "unmanaged" you make your intention clear, which helps with code comprehension and maintenance. If someone on your team (or even yourself, after a while) is about to modify the struct, that person will know that, for some reason, no reference-type fields are allowed on it.
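For illustration, here is a minimal sketch of how a similar intent can be approximated today with the existing `unmanaged` generic constraint (C# 7.3 or later). The helper `UnmanagedGuard.AssertUnmanaged` and the field names are hypothetical and are not part of the proposal or the base class library. The run-time check assumes a runtime that provides `RuntimeHelpers.IsReferenceOrContainsReferences` (for example, .NET Core 2.0 or later).

```csharp
using System;
using System.Runtime.CompilerServices;

// Example struct with only unmanaged fields (field names are illustrative).
public struct MyBlittable
{
    public int Id;
    public double Value;
}

// Hypothetical helper, not part of the proposal or the BCL.
public static class UnmanagedGuard
{
    // Empty at run time; the "unmanaged" constraint is checked by the compiler.
    public static void AssertUnmanaged<T>() where T : unmanaged { }
}

public static class Program
{
    public static void Main()
    {
        // Stops compiling if a reference-type field is ever added to MyBlittable.
        UnmanagedGuard.AssertUnmanaged<MyBlittable>();

        // Run-time view of the same property.
        bool hasReferences = RuntimeHelpers.IsReferenceOrContainsReferences<MyBlittable>();
        Console.WriteLine($"Contains references: {hasReferences}");
    }
}
```

Unlike the proposed keyword, the guard sits away from the struct declaration, so it documents the intent less directly; it only makes the build fail when the constraint is broken.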

Unknown6656 commented 6 years ago

I am not sure that using the keyword is a good idea for this feature. I completely understand the use case, and I agree that some kind of annotation would be good.

However, I think that this should/could be solved either with an attribute or a code analyzer...

Ultrahead commented 6 years ago

Well, at first I thought this was the goal of the ref struct declaration in C# 7.2; then I realized that its real goal was to enforce stack allocation. And since the unmanaged keyword is now also used as a generic constraint, I thought it could be used for struct declarations too.

public unmanaged readonly ref struct MyBlittable
{
   public int[] Arrays; // Compiler error.
   ...
}

asyncawaitmvvm commented 6 years ago

Without the keyword, how does the compiler know it's unmanaged? It has to infer.

Without the keyword, how does the maintenance programmer know it's unmanaged? He has to infer. That takes longer.

Nobody writes comments, but now they're going to write code analyzers. Yeah.

Korporal commented 6 years ago

Ideally the language syntax would have been designed better, and this kind of thing might not be present. For example, the way C# has "class" and "struct" (which reflect the storage nature of the allocated datum) is tied up with the type definition.

It would have been better to decouple the type definition from the allocation method like this:


public type One
{
   public int Field;
}

and then use class/struct where we instantiate the type:


var ref x = new One();  

var val y = new One();

Then we'd have a single type name but would be able to create object instances or struct instances freely. To do this now requires a little fiddling around:


public struct One
{
   public int Field;
}

public class OtherOne
{
   public One data;
}

This gives us a single type (One) and also a class with the same "shape" (OtherOne). It is currently the only way to have a single type (One) and still be able to create either kind of instance. With this strategy the class never contains anything other than a single struct member, so the class type becomes nothing more than a boxed version of the struct. This is all tedious, though, as anyone who has coded significant systems this way will know!

Of course this is never going to materialize now that the grammar and stuff are so entrenched.

HaloFour commented 6 years ago

@Korporal

struct/class don't dictate the storage characteristics of the type, they dictate the copy semantics.
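As a small sketch of the copy-semantics distinction in current C#: assigning a struct copies the whole value, while assigning a class instance copies only the reference. The type and member names below (`PointStruct`, `PointClass`, `CopyDemo`) are made up for the example.

```csharp
using System;

public struct PointStruct { public int X; }
public class PointClass { public int X; }

public static class CopyDemo
{
    // Returns the X of each original after its copy has been modified.
    public static (int structX, int classX) Run()
    {
        var s1 = new PointStruct { X = 1 };
        var s2 = s1;      // copies the whole value
        s2.X = 2;         // s1.X is still 1

        var c1 = new PointClass { X = 1 };
        var c2 = c1;      // copies only the reference
        c2.X = 2;         // c1.X is now 2 as well

        return (s1.X, c1.X);
    }

    public static void Main()
    {
        var (structX, classX) = Run();
        // prints "struct original: 1, class original: 2"
        Console.WriteLine($"struct original: {structX}, class original: {classX}");
    }
}
```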

theunrepentantgeek commented 6 years ago

@Korporal the problems with that approach were well understood before C# was designed, and I suspect they were a factor in the thinking that led us to the design we currently have.

Assume you have the following definitions (using your suggested syntax):

public type Person
{
    public string FullName { get; }
    public string KnownAs { get; }
}

public type Parent : Person
{
    public IEnumerable<Person> Children { get; }
    public void AddChild(Person person) { ... }
}

The AddChild() method has to accept any Person - it doesn't make sense to require all children to be parents.

But, any local variable within the implementation of AddChild() would also be declared as Person - and using that would truncate the value to being just a Person; all of the knowledge of that particular person actually being a parent would be lost.

In my understanding, this problem is well known to C++ developers and is a factor in designing (and evolving) many APIs.

Korporal commented 6 years ago

@HaloFour

"struct/class don't dictate the storage characteristics of the type, they dictate the copy semantics."

That is also true, but not relevant to my point, which is that the type itself need not be declared in two different ways. That was a choice made by the designers, but I don't think it is the only way to design this kind of thing.

Korporal commented 6 years ago

@theunrepentantgeek - Your example is fine, but that doesn't mean the behavior you describe was inevitable; that too was a choice, surely?

My example (which I should stress is purely for discussion not in any sense a suggestion for a change!) would of course require numerous changes to the way things are done now in C# - if one were to design such a language.

In my example, x is an object reference, the same as it would be if One were declared as a class; likewise, y is a value type reference, the same as it would be if One were declared as a struct.

HaloFour commented 6 years ago

@Korporal

Allocation on the stack also requires that the size be very predictable, which is not possible for types that participate in a hierarchy. The only way it would be possible to have types which can behave as both structs and classes is to enforce a very strict set of rules on those types, particularly around inheritance. Otherwise you risk slicing or decapitation. This is a very real problem in C++.

Korporal commented 6 years ago

@HaloFour - Yes I agree that there are consequences to such a proposal (not that it is actually a proposal) but the compiler could in principle enforce rules based on whether an item is declared as ref or val rather than the type itself.

So, regarding @theunrepentantgeek's example:

public type Person
{
    public string FullName { get; }
    public string KnownAs { get; }
}

public type Parent : Person
{
    public IEnumerable<Person> Children { get; }
    public void AddChild(Person person) { ... }
}

The compiler could simply refuse to let you do this:

var val x = new Parent();

In other words, certain types would by definition preclude certain kinds of declarations of them.

It's noteworthy that every instance of an object in .NET can be modelled as a class with an embedded struct, even a class that has inherited from another class, certainly insofar as field declarations are concerned anyway.

HaloFour commented 6 years ago

@Korporal

It's much more than that, though. The compiler/runtime would have to very strictly limit everything that you can do with that class and everything that the class can do internally. Any arbitrary member of that class could attempt to call deep into some framework code that would end up trying to assign that instance to an array or add to a dictionary or anything else. But that wouldn't be remotely safe if the storage is on the stack. Looking at any arbitrary type it'd be impossible to know if it itself is safe to use. (I think) this is why the CLR designers made the decision to strictly differentiate between the two. You'd need even stricter rules than you have with ref locals and returns, which would make them really cumbersome to use.
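To give a rough idea of what such restrictions look like in practice, here is a sketch using the existing `ref struct` feature (C# 7.2 or later), which already forbids the kinds of escape described above. The names `StackOnly` and `RefStructDemo` are invented for the example, and the commented-out lines are ones the compiler is expected to reject.

```csharp
using System;

// A ref struct may only live on the stack.
public ref struct StackOnly
{
    public int Value;
}

public static class RefStructDemo
{
    public static int Run()
    {
        var local = new StackOnly { Value = 42 };   // fine: a plain local

        // Each of the following lines is rejected by the compiler if uncommented,
        // because it would let the value escape to the heap:
        // object boxed = local;                                         // boxing
        // var array = new StackOnly[1];                                 // array element
        // var list = new System.Collections.Generic.List<StackOnly>(); // generic type argument

        return local.Value;
    }

    public static void Main() => Console.WriteLine(Run());   // prints 42
}
```

A type that could be instantiated either way would presumably need at least these rules whenever it is used as a value, which is part of what makes the idea cumbersome.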

alrz commented 6 years ago

This is proposed as part of https://github.com/dotnet/csharplang/issues/688

Ultrahead commented 4 years ago

Any news about this in #688?