Closed stephentoub closed 8 years ago
If I recall, Eric Lippert blogged about this some years back and the response in the comments was largely negative.
I do not like this feature for C#. The resulting code is like an uglier version of C++, and code written with it takes longer to reason about and understand. The use-cases are not particularly compelling, and I have never run into a situation where I wished I had ref locals or return values.
Yes, I know very well that mutable structs should be avoided. Still, one interesting use case would be lists of mutable structs. Consider:
struct MutableStruct { public int X { get; set; } }
MutableStruct[] a = ...
List<MutableStruct> l = ..
a[3].X = 5; // changes the value of X of the struct in the array
l[3].X = 5; // compile time error
If the indexer of the List<T> class would return the value stored in the list by reference, the code above would compile, making the use of mutable structs less surprising. It is probably even more efficient as the (potentially large) struct no longer has to be copied out from the list.
Unfortunately, I doubt that the return type of List<T>'s indexer can be changed for backwards compatibility reasons.
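As a rough sketch of what a by-reference indexer could look like, here is a minimal list type written with the ref-return syntax that later shipped in C# 7.0. RefList<T> and RefListDemo are hypothetical names made up for illustration, not BCL types.

```csharp
using System;

// Mutable struct mirroring the example above.
public struct MutableStruct
{
    public int X { get; set; }
}

// Hypothetical minimal list (not a BCL type) whose indexer returns a
// reference to the stored element instead of a copy.
public sealed class RefList<T>
{
    private T[] _items = new T[4];
    private int _count;

    public int Count => _count;

    public void Add(T item)
    {
        // Grow the backing array when it is full.
        if (_count == _items.Length)
            Array.Resize(ref _items, _items.Length * 2);
        _items[_count++] = item;
    }

    // Ref-returning indexer: callers read and write the element in place.
    public ref T this[int index]
    {
        get
        {
            if ((uint)index >= (uint)_count)
                throw new ArgumentOutOfRangeException(nameof(index));
            return ref _items[index];
        }
    }
}

public static class RefListDemo
{
    public static int Run()
    {
        var list = new RefList<MutableStruct>();
        for (int i = 0; i < 4; i++) list.Add(new MutableStruct());

        list[3].X = 5;     // compiles: mutates the stored struct, no copy
        return list[3].X;  // 5
    }
}
```

One caveat with this design: a reference obtained before a later Add may point into the old backing array after a resize, so writes through it would no longer be visible in the list.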
Disclaimer: I work on a game engine, so I am probably not the typical user.
One use case where this could really help us is this one:
MyHugeStruct[] data; // we use a struct to improve data locality and reduce GC pressure
// Ideally, we would like to be able to use List<T>, but we can't take ref then
for (int i = 0; i < data.Length; ++i)
{
// Option 1: make a local copy (slow)
var item = data[i];
// Option 2: to avoid making a stack copy of MyHugeStruct,
// we have to defer to an inner loop function
MyLoopBody(ref data[i]);
// Option 3: using the new proposal, that would be much better:
ref MyHugeStruct itemRef = ref data[i];
}
We end up making a separate function for the loop body, and in the case of a tight loop this can end up being quite bad:
Nice to have:
Extra (probably impossible without changing BCL):
What happens with this?
var data = GetData();
...
ref SomeStruct GetData()
{
var ss1 = new SomeStruct();
var ss2 = new SomeStruct();
return ref Choose(ref ss1, ref ss2);
}
ref SomeStruct Choose(ref SomeStruct ss1, ref SomeStruct ss2)
{
return whatever ? ref ss1 : ref ss2;
}
GetData might not be aware that Choose is returning one of its variables and returns to the caller a reference to it.
Does the value still exist after exiting GetData?
@paulomorgado You would not be allowed to return a ref to a local variable or parameter.
@gafter, the only difference between my Choose method and @stephentoub's one is that mine does not have the selector passed as a delegate. Did I miss something here?
@paulomorgado, the compiler would only let you return a ref to something that it knew was either on the heap or that came from the caller. In my example, the ref inputs to the Choose method were all from ref parameters (or ref locals to ref parameters), so the compiler would conclude that the result of the Choose method met the criteria and would allow its returned ref to be returned. But in your example, the refs passed to Choose were not from the caller nor from the heap, such that the compiler couldn't be sure that the result of Choose was allowed to be returned, and it would error out.
@stephentoub, forget my Choose method. Yours is the best that can be done; you just published it to NuGet and I added it to my project. How can the compiler know where the return value of Choose is coming from? My GetData is just complying with the contract of Choose to get its result and pass it along, as all the code written so far and to be written in the future does.
What you're saying is that publicly exposed methods can't return refs, which reduces the usage to only private methods.
@paulomorgado, I understand the confusion, but that's not what I'm saying.
There would be some rules about what it would be safe to return, e.g.
Forget the implementation of Choose here. Assuming Choose abides by these rules (which the compilation of Choose would enforce), in my example all of the inputs to Choose were valid to be returned, therefore the result of Choose could be returned. In your example, at least one of the inputs to Choose wasn't valid to be returned, therefore the result of Choose could not be returned. The compiler can validate that.
@stephentoub, what I'm having trouble with is understanding how those rules can be effectively enforced.
And a proposal should have an example that works under the proposal.
@paulomorgado, how does my example not work under the proposal? And why do you believe the rules can't be enforced?
@stephentoub, either that or I totally missed everything.
My understanding is that there's no way the caller can take the result of your Choose method as safe to return as reference. Is there? If so, how?
@paulomorgado, in this example:
public static ref TValue Choose<TValue>(
Func<bool> condition, ref TValue left, ref TValue right)
{
return condition() ? ref left : ref right;
}
left and right are both safe to return because they came from the caller.
In this example:
public static ref int Max(ref int first, ref int second, ref int third)
{
ref int max = first > second ? ref first : ref second;
return max > third ? ref max : ref third;
}
first, second, and third are all safe to return because they all came from the caller. max is safe to return because the only refs it's possibly assigned to are those which are safe to return.
If I as a caller wanted to use Choose, e.g.
public static ref TValue ChooseByTime<TValue>(
ref TValue left, ref TValue right)
{
return Choose(() => DateTime.UtcNow.Second % 2 == 0, ref left, ref right);
}
Both left and right are safe to return because they came from the caller. Therefore all of the ref inputs to Choose are safe to return. Therefore the resulting ref from Choose is also safe to return. I don't need to worry about the implementation of Choose, because the compiler is enforcing all of these same rules on the implementation of Choose.
But ChooseByTime isn't returning either left or right. It's returning the return value of Choose. Nothing but the implementation details of Choose says its return value is the same as one of its parameters. What if Choose is an implementation of an interface?
You're restricting the use of Choose to cases where it works without any safeguards or proof that it's safe.
My example shows the opposite.
@paulomorgado, your example wouldn't compile... the compiler would error out exactly because it doesn't abide by the rules: your call to Choose is passed ref values that are not safe to return, therefore the result of your call to Choose is not safe to return. I'm sorry if I'm not explaining this well; not sure how to convey it differently.
Nothing but the implementation details of Choose is saying its return value is the same as one of its parameters.
Ah, maybe this is the point of confusion. The implementation doesn't matter because the compiler assumes the worst: regardless of how a parameter is actually used, if any argument isn't safe to return, then the result of the call isn't safe to return. The compiler is conservative in that regard.
A conservative compiler that assumes the worst cannot assume the return value of Choose is safe to return.
Is this what you're proposing?
public static ref TValue ChooseByTime<TValue>(
ref TValue left, ref TValue right)
{
TValue result = Choose(() => DateTime.UtcNow.Second % 2 == 0, ref left, ref right);
if (result == left) return ref left;
else if (result == right) return ref right;
else throw new Exception("Invalid value.");
}
Why do you say that? What specifically about this example do you believe is problematic?
Let's try something else: can you construct an implementation of Choose that will compile based on the aforementioned rules/explanations but where the caller of the method could not assume its return value was safe to return?
No I can't. Because I haven't been able to understand how this would work.
I can understand how, in your implementation of Choose, it is safe to return that reference.
What I can't understand is why its callers can safely return the same reference without intimately knowing its internals.
Because it wouldn't be allowed to return anything that's not safe in the case where the caller assumes it is safe. If the only thing the caller passes in are refs that are safe to return, then what could this method return?
Etc.
So, this wouldn't be safe, right?
public static ref TValue ChooseByTime<TValue>(
ref TValue left)
{
ref TValue right = default(TValue);
return Choose(() => DateTime.UtcNow.Second % 2 == 0, ref left, ref right);
}
Correct, that would not compile.
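To make the rule discussed above concrete, here is a small sketch written with the ref-return syntax that later shipped in C# 7.0 (explicit `return ref`, and if/else instead of a ref conditional). RefRules, ChooseFirst, Broken and Demo are hypothetical names used only for illustration.

```csharp
using System;

public static class RefRules
{
    // Both ref parameters came from the caller, so either may be returned.
    public static ref T Choose<T>(Func<bool> condition, ref T left, ref T right)
    {
        if (condition()) return ref left;
        return ref right;
    }

    // Every ref argument passed to Choose is itself safe to return from
    // this method, so the result of the call is safe to return as well.
    public static ref T ChooseFirst<T>(ref T left, ref T right)
    {
        return ref Choose(() => true, ref left, ref right);
    }

    // Rejected by the compiler: 'right' is a local of this method, so the
    // result of Choose might refer to storage that dies when we return.
    // The body of Choose is never inspected; only the arguments matter.
    //
    // public static ref T Broken<T>(ref T left)
    // {
    //     T right = default(T);
    //     return ref Choose(() => true, ref left, ref right);
    // }

    public static int Demo()
    {
        int a = 1, b = 2;
        // Locals may be passed by ref and the result kept in a ref local;
        // what is not allowed is returning that result from this method.
        ref int picked = ref ChooseFirst(ref a, ref b);
        picked = 10;  // writes through the reference into 'a'
        return a;     // 10
    }
}
```

The caller never needs to know how Choose is implemented: the check is made purely on whether each ref argument at the call site is itself safe to return.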
Beautiful solution, I've wondered why this couldn't be done before.
@MgSam [The resulting code is like an uglier version of C++] Because of sentiments like this (i.e. 'anything I don't personally use should never be part of the language for anybody else either, even though the CLR itself has this capability'), it means our language is needlessly crippled in places where a very easy and beautiful solution like this gives us such a capability. As the gamer showed in the comment above, this can be a big performance win in some cases.
:+1:
Anytime I can pass a pointer instead of performing a value copy, I'm all for it. Are there good reasons to pass memory by value-copy? Yes. Should it always be the case? Absolutely not.
The resulting code is like an uglier version of C++
I agree, it is not pretty, but it is very descriptive. It would be nice if the ref keyword could be replaced with syntax we're all used to. Perhaps we could use * in place of ref, because int* foo; is "cleaner" and "easier" to read than ref int foo;. I put "cleaner" and "easier" in quotes because it is incredibly subjective.
Yes, I know that * is generally reserved for unsafe, but there's no reason the symbol cannot be reused, so long as one is reserved for a "safe" context and the other for an "unsafe" context.
Given the limitations listed above imposed to maintain a safe context I'm having a hard time envisioning the use cases for this feature. The real gains would seem to be in how structs can be used throughout the BCL with arrays, lists or other collection types.
Agreed. This is, in my opinion, a small step in the right direction though.
Would this implementation allow for ref int[] intRefs = new ref int[512];?
If it doesn't, then I am less excited than I originally was. If it does, reading ref struct[] is difficult. Is it a reference to an array of structures or an array of structure references?
Better to use struct*[], in my opinion.
I don't disagree that ref something is unattractive, however your use of * is already legal C# syntax and implies an unsafe context. I'm sure that you know that, but I thought it warranted mention.
I imagine that the array scenario would likely depend on the proposal for fixed-size buffer enhancements, dotnet/roslyn#126. Once the size is determined and allocated I believe that would behave the same as a field or as a local.
I don't disagree that ref something is unattractive, however your use of * is already legal C# syntax and implies an unsafe context. I'm sure that you know that, but I thought it warranted mention.
I do. I also know that * is only legal within an unsafe block. Thus, the compiler could assume that * needed to be "safe" unless in an unsafe block. Therefore operations like int* p = ...; p++; would not be legal; instead, int* p would have to point to safely referenced memory.
Yes, there would be complexities if devs started an unsafe block, but rules can be established on how this would work, etc.
FYI: the PR for the initial commit of a prototype dotnet/roslyn#4042
I support ref return
And can we have ref parameter in lambda?
@Thaina You can use ref parameters in lambdas today as long as the signature of the target delegate defines those parameters as ref:
public delegate void RefAction<T>(ref T arg);
RefAction<string> action = (ref string value) => { value = "Hello World!"; };
string x = "";
action(ref x);
Console.WriteLine(x);
@HaloFour: What @Thaina probably means is that you can't capture a ref-parameter in a lambda.
@HaloFour Sorry, I didn't know that. Since which version can we use ref lambdas?
I have used Unity for such a long time that I haven't kept up with new C# features much.
@axel-habermaier Maybe. That wouldn't be my first guess given the proposal they posted under, but it is terribly unspecific. IIRC ref parameter capture would be wading too close to unsafe territory since you'd basically have to stuff the address to a variable in the state machine class and the compiler could no longer control its lifetime.
@Thaina C# has always supported ref and out parameters for anonymous delegates and lambdas.
Oh... I never knew that. We just can't write (ref i) => {}; I need to write (ref int i) => {}.
Thanks for the pointer.
Sorry for the necropost, but I have a question. I found that ref properties will be supported, but only for getters. Why couldn't it be supported for setters too? I mean, if we have
class Foo{
public ref int Number{
get;
set;
}
}
it could be resolved to public ref int get_Number(){...} and public void set_Number(ref int){...}.
And if there is a reason for abandoning setters, why not do it like this:
class Foo{
public string Description{
ref get;
set;
}
}
so we would still be able to have a setter and a getter in one property (or is this already the case?)
@BreyerW the main reason for disallowing setters in byref properties and indexers is that they would not be very useful. While you can make a ref for a field or an array element and return that from the getter, you cannot go the other way in the setter. If some use pattern is discovered, restriction on byref setters can be relaxed later, so it was decided to start with not allowing them.
Thanks for the reply. I wonder: isn't avoiding copies of value types when passing them to a setter method a good thing? And next: is allowing a non-ref setter alongside a ref getter impossible, like I show in the second example?
EDIT: Ah, and if there isn't any dangerous situation with a ref setter, I don't see why we have to be so strict about this. If someone finds a pattern for ref setters, you won't have to cook up a special C# version in the future; it will already be enabled. And ref could be defined per accessor, not per property. Obviously you are the designers, not I, so possibly there is something subtle I don't know ;).
@BreyerW An important part here is that byref properties and indexers have assignable getters. If a type has a byref indexer, you already can read and write elements without redundant copying. What would be the "obvious purpose" of a setter if property is already assignable via its getter?
In particular byval setter next to a byref getter would actually make assignments ambiguous.
obj.Description = "aaa";
Is this an assignment to the getter or an invocation of a setter?
There are short and long term costs of adding language features and it is next to impossible to remove them. That motivates the design team to resist features with unclear utility or confusing behavior.
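A minimal sketch of the "assignable getter" point, using the ref-returning property syntax that later shipped in C# 7.0. The backing field _number and the FooDemo helper are assumptions added for illustration.

```csharp
public class Foo
{
    private int _number;

    // Ref-returning property: getter only. The caller receives a reference
    // to the field, so no separate setter is needed for assignment.
    public ref int Number => ref _number;
}

public static class FooDemo
{
    public static int Run()
    {
        var foo = new Foo();
        foo.Number = 42;             // assignment goes through the returned ref
        ref int n = ref foo.Number;  // or keep the reference in a ref local
        n++;                         // increments the field itself
        return foo.Number;           // 43
    }
}
```

Because the write happens directly on the field, the class gets no chance to run code (validation, change notifications) at assignment time, which is the trade-off raised later in the thread.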
Oh, I think I understand now why you abandoned setters: one of the reasons is that a ref getter can work like a setter thanks to returning a ref, so there is no point in having a setter? If that's true, then I completely see why you abandoned this. Thanks for the clarification; now I feel a bit dumb, I obviously overlooked that.
The only thing that might be missing is that with a ref getter there might be a problem with firing events like OnBeforeValueChange, but this is a feature of ref itself, not a flaw of the C# design.
BTW, don't forget to update PropertyInfo somehow so that CanWrite returns true if there is only a ref getter, or add a new property to signal that there is a ref getter. I mention this because I use this class.
Nothing mentioned about foreach. It would be nice to be able to write foreach(ref Struct item in arr).
@alrz Your suggestion would be impossible with the current foreach implementation. foreach uses the IEnumerable interface to return Current, which does not return by ref.
It needs to go the other way. We should have an IEnumerableByRef that overrides Current, and let foreach check whether the collection is IEnumerableByRef; if so, it returns the item by ref automatically.
Or maybe the ref keyword should be enabled in generics, so we could use IEnumerable<ref Struct>.
It would be best if MS implemented IEnumerableByRef on everything that has IEnumerable attached (everything in System.Collections) once the feature is finished.
@Thaina How about this?
When iterating over an array (known at compile-time) the compiler can use a loop counter and compare with the length of the array instead of using an IEnumerator
@alrz Only arrays are possible with that kind of foreach, and I don't think it should be a different workflow. Instead, arrays should implement IEnumerableByRef if C# gets one.
It can simply be allowed only for arrays, and then translated to for( .. ) { ref T item = arr[i]; ... }. I don't think that something like <ref Struct> would be possible, because it could ultimately let the reference outlive the local object, which is not supported by the CLR, AFAIK.
@alrz I apologize, but I am very much against the idea of making arrays a special thing again. Actually, I am against making anything a special case. We have had this special problem from the start, that only arrays have an indexer that returns by ref, and now that we are trying to fix it, everything should have an indexer that returns by ref as arrays do.
Yeah, I think <ref Struct> is overkill too. Just IEnumerableByRef is enough.
@Thaina I think nothing's wrong with special cases. foreach already has a special case for arrays, though unobservable, to make it faster, and ref locals also help to make things faster (avoid copying), so combining these two in a use case like this would be nice.
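For reference, a sketch of the array-only lowering being discussed, written by hand with the ref-local syntax that later shipped in C# 7.0. Particle and ParticleDemo are hypothetical names, and the loop is only an approximation of what a compiler-generated expansion might look like.

```csharp
// Hypothetical struct standing in for a large value type.
public struct Particle
{
    public float X, Y, VelocityX, VelocityY;
}

public static class ParticleDemo
{
    // Hand-written equivalent of 'foreach (ref Particle p in particles)'
    // for an array: an index loop with a ref local per element.
    public static void Step(Particle[] particles, float dt)
    {
        for (int i = 0; i < particles.Length; i++)
        {
            ref Particle p = ref particles[i];  // alias the element, no copy
            p.X += p.VelocityX * dt;
            p.Y += p.VelocityY * dt;
        }
    }

    public static float Run()
    {
        var particles = new Particle[2];
        particles[1].VelocityX = 2f;
        Step(particles, 0.5f);
        return particles[1].X;  // 1
    }
}
```

This also addresses the game-engine scenario from earlier in the thread: the loop body stays inline and the struct is updated in place, without a separate helper taking a ref parameter.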
:+1:
(Note: this proposal was briefly discussed in dotnet/roslyn#98, the C# design notes for Jan 21, 2015. It has not been updated based on the discussion that's already occurred on that thread.)
Background
Since the first release of C#, the language has supported passing parameters by reference using the 'ref' keyword. This is built on top of direct support in the runtime for passing parameters by reference.
Problem
Interestingly, that support in the CLR is actually a more general mechanism for passing around safe references to heap memory and stack locations; that could be used to implement support for ref return values and ref locals, but C# historically has not provided any mechanism for doing this in safe code. Instead, developers that want to pass around structured blocks of memory are often forced to do so with pointers to pinned memory, which is both unsafe and often inefficient.
Solution: ref returns
The language should support the ability to declare ref locals and ref return values. We could, for example, now declare a function like the following, which not only accepts 'ref' parameters but which also has a ref return value:
With a method like that, one can now write code that passes two values by reference, with one of them being returned based on some condition:
Based on the function that gets passed in here, a reference to either 'left' or 'right' will be returned, and the M20 field of it will be set. Since we’re trading in references, the value contained in either 'left' or 'right' is updated, rather than a temporary copy being updated, and rather than needing to pass around big structures, necessitating big copies.
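The proposal's own listings did not survive in this copy of the text. As a stand-in, here is a sketch of the scenario just described, using the Choose signature quoted earlier in the thread and the `return ref` syntax that later shipped in C# 7.0. Matrix3 is a hypothetical struct with an M20 field, chosen only to match the prose.

```csharp
using System;

// Hypothetical stand-in for a large struct with an M20 field.
public struct Matrix3
{
    public double M00, M01, M02, M10, M11, M12, M20, M21, M22;
}

public static class ChooseDemo
{
    // Accepts two values by reference and returns a reference to one of them.
    public static ref T Choose<T>(Func<bool> condition, ref T left, ref T right)
    {
        if (condition()) return ref left;
        return ref right;
    }

    public static double Run()
    {
        var left = new Matrix3();
        var right = new Matrix3();

        // The condition picks 'right'; the assignment updates that variable
        // directly rather than a temporary copy of the struct.
        Choose(() => false, ref left, ref right).M20 = 1.0;

        return right.M20;  // 1.0, and left.M20 is still 0
    }
}
```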
If we don't want the returned reference to be writable, we could apply 'readonly' just as we were able to do earlier with ‘ref’ on parameters (extending the proposal mentioned in dotnet/roslyn#115 to also support return refs):
Note that when referencing the 'left' and 'right' ref arguments in the Choose method’s implementation, we used the 'ref' keyword. This would be required by the language, just as it’s required to use the ‘ref’ keyword when passing a value to a 'ref' parameter.
Solution: ref locals
Once you have the ability to receive 'ref' parameters and to return ‘ref’ return values, it’s very handy to be able to define 'ref' locals as well. A 'ref' local can be set to anything that’s safe to return as a 'ref' return, which includes references to variables on the heap, 'ref' parameters, 'ref' values returned from a call to another method where all 'ref' arguments to that method were safe to return, and other 'ref' locals.
We could also use ‘readonly’ with ref on locals (again, see dotnet/roslyn#115), to ensure that the ref variables don’t change. This would work not only with ref parameters, but also with ref locals and ref returns:
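The listing that followed here is also missing. For comparison only, this is a sketch using the `ref readonly` form that eventually shipped in C# 7.2, which differs from the 'readonly ref' spelling the proposal text suggests. Settings and ReadonlyRefDemo are hypothetical names.

```csharp
// Hypothetical struct used only for this sketch.
public struct Settings
{
    public int Level;
}

public static class ReadonlyRefDemo
{
    private static Settings _defaults = new Settings { Level = 3 };

    // Callers get a reference (no copy) but cannot write through it.
    public static ref readonly Settings Defaults => ref _defaults;

    public static int Run()
    {
        ref readonly Settings s = ref Defaults;
        // s.Level = 4;   // rejected by the compiler: 's' is a readonly reference
        return s.Level;   // 3
    }
}
```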