Open svick opened 7 years ago
I assume the Closure holds a reference to the local variable and not a copy. Then it's all about performance vs memory. I'm all for performance.
@brandon942
I assume the Closure holds a reference to the local variable and not a copy.
Neither. The closure holds the variable itself. If you want to access the variable, you have to go through the closure.
Then it's all about performance vs memory. I'm all for performance.
I don't see that kind of trade-off here. Doing what I proposed would save some memory and also improve performance (by decreasing GC pressure and probably also by improving cache behavior).
This optimization has potential to save 16 bytes per closure on 64-bit platforms. The memory bytes is what matters for GC pressure here. Two small objects vs. one larger object is not going to make much difference.
It is expensive optimization to implement because of it changes the .NET type system. The first step should be to collect data about the benefit. E.g. how much of the GC heap in typical ASP.NET app are closures, how much would it save or make things faster.
Also, how would it work when there is more than one closure created in the method?
void M()
{
int i = 42;
FuncInt f = () => i;
FuncInt g = () => i+1;
f();
g();
}
@jkotas
The first step should be to collect data about the benefit. E.g. how much of the GC heap in typical ASP.NET app are closures, how much would it save or make things faster.
Yeah, that makes sense.
how would it work when there is more than one closure created in the method?
This example has only one closure (containing i
), but two delegates using it. In that case, the optimization would have to be applied only to one of the two delegate allocations. I.e. it would go from "1 closure allocation + 2 delegate allocations" to "1 fused delegate-closure allocation + 1 delegate allocation".
I now realize this could increase the amount of used memory if the non-fused delegate lived longer than the fused one (by the size of delegate's fields, which is 4 pointers). I'm not sure how big of a problem this is.
The first step should be to collect data about the benefit. E.g. how much of the GC heap in typical ASP.NET app are closures, how much would it save or make things faster.
If anyone is interested in doing this, you might want to have a look at some code I've written that uses CLRMD to analyse heap dumps. It shouldn't be too hard to adapt it, so instead of looking for strings
, it looks for instances of delegates
.
I have attempted to measure how much would this save by running the smoke test suite of the aspnet/MusicStore sample app 25 times. I'm not sure how typical app MusicStore is and this usage pattern definitely isn't typical (for example, it opens the home page as often as changing a user's password), but I think it could be close enough.
I ran the above under the VS memory profiler, exported allocations data into XML and processed the XML. The processed data can be seen here (430 kB).
Here is a screenshot of the profiler summary, so that you can see if it aligns with your idea of what a "typical" app does:
The results are:
DisplayClass
):
This shows that the number of allocated delegates and closures are not that far apart, which might indicate that many of them could be fused. If all closures were fused, this would save about 570 000 allocations (2.2 % of all allocations) and 8.7 MiB (0.62 % of all allocated bytes).
If this data is sufficiently representative, then I think it means that doing this optimization is not worth it, because the savings are tiny, compared with the significant cost.
On the other hand, this data shows that removing one or more of the 6 pointer-sized fields that are allocated for each delegate could give comparable savings in terms of allocated bytes, with much smaller cost.
I thought that the main pros of this feature is reducing number of objects and locality of memory access,not a total size of objects.
I saw that GC performance is affected mainly by number of objects.
I saw that GC performance is affected mainly by number of objects.
This post is discussing "GC heap scanning performance" that is just one of the multiple components of total GC cost.
BTW: This is not the first post with misleading information that I have seen on stackoverflow. I have tried to correct the misleading information a few times before, but it was always rejected by the moderators. I am not even trying anymore.
@svick 's back of the envelope estimate looks accurate to me. This is unlikely to have material impact on performance of real world apps and there are cheaper ways to achieve savings of similar magnitude.
I was looking into this feature as a way to replace the IActionResult
interface in ASP.NET Core MVC.
Today the core ASP.NET request processing primitive is RequestDelegate and would like to be able to use it in more places. Instead, MVC has IActionResult is basically the interface equivalent of this (well slightly different but that's the intent). Now I'm looking to unify the request processing and result processing and it would be nice to allow returning a RequestDelegate
directly.
public class HomeController
{
[HttpGet("/hello")]
public RequestDelegate Hello()
{
return context => context.Response.WriteAsync("Hello World");
}
}
This is workable but I would love to allow polymorphism to enable other code to change the results.
public class JsonResult : RequestDelegate
{
private object _value;
public JsonResult(object value)
{
_value = value;
}
public override Task Invoke(HttpContext context) => context.Response.WriteAsJsonAsync(_value);
}
public class HomeController
{
[HttpGet("/hello")]
public JsonResult Hello()
{
return new JsonResult(new { Name = "David" });
}
}
This would allow for common scenarios that change the result by inspecting the type and lighting up behavior (mutating state on it before execution)
Currently, the requirements for delegates are (from ECMA-335 §II.14.6 Delegates):
What if this requirement was weakened, so that delegates were allowed to be unsealed and also allowed to contain other members, especially fields? I think it would allow the C# compiler (and other compilers) to reduce the number of allocations required for lambdas that close over local variables, by fusing the delegate and the closure into a single object. (It also might allow F# to start using delegate types instead of
FSharpFunc
, but I'm not sure about that.)For example, consider the following C# code:
Currently, the generated code looks roughly like this:
Notice that there are two allocations for a single lambda: the closure and the delegate.
If the requirements for delegates were weakened, the generated code could instead be:
This way, the two allocations are fused, which should make the code more efficient.
Some other notes:
Func
andAction
types) in some future version of the framework should be fine. Existing code would then become more efficient just by using a newer compiler and targeting a newer framework.What do you think?