dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

[API Proposal]: GC level ObjectPooling APIs and more #97434

Open xiaoyuvax opened 7 months ago

xiaoyuvax commented 7 months ago

Background and motivation

This proposal is motivated by a fundamental weakness of the current Microsoft.Extensions.ObjectPool implementation, and was inspired by the discussions in https://github.com/dotnet/aspnetcore/pull/52935#discussion_r1461473774 and https://github.com/dotnet/aspnetcore/issues/44901. Those discussions cover the fundamental problem of using ObjectPool in terms of performance, memory cost, and manageability: the LIFECYCLEs of pooled object instances are not easy to manage, because the reference count cannot be determined, especially in a complicated code structure/hierarchy.
Moreover, resetting a pooled instance is not performance-friendly either. The actual problem manifests along two dimensions: 1) spatially, multiple references: an object is referenced by multiple other objects whose lifecycles may be hard to determine at run time; 2) temporally, non-synchronizable lifecycles of referenced objects: one object references other objects whose lifecycles are not necessarily the same as its own and are also hard to determine at run time (typical cases are properties that reference other types). It is therefore not easy to determine when to Return() the objects. A programmer must carefully design and manage the lifecycles of all pooled types so as not to mess things up, which makes code structuring and organization harder. Simply put, the weakness/difficulty of Microsoft.Extensions.ObjectPool is: when to call Return()? So I think: can we avoid calling Return() at all?
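
The "when to call Return()?" difficulty can be illustrated with a minimal sketch. `TinyPool<T>`, `Message`, and `SharedReferenceHazard` are hypothetical names invented for this illustration; the pool is a simplified stand-in and not the actual Microsoft.Extensions.ObjectPool implementation. The sketch shows two holders sharing one pooled instance, where one of them returns it while the other is still using it:

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical, simplified pool used only for this illustration.
public sealed class TinyPool<T> where T : class, new()
{
    private readonly ConcurrentBag<T> _items = new ConcurrentBag<T>();

    // Hands out a pooled instance if one is available, otherwise a new one.
    public T Get() => _items.TryTake(out var item) ? item : new T();

    // Makes the instance available to the next caller of Get().
    public void Return(T item) => _items.Add(item);
}

// Hypothetical pooled type.
public sealed class Message
{
    public string Text = "";
}

public static class SharedReferenceHazard
{
    public static (string SeenByB, bool SameInstance) Run()
    {
        var pool = new TinyPool<Message>();

        Message shared = pool.Get();
        shared.Text = "order #1";

        // Two independent holders keep a reference to the same instance.
        Message heldByA = shared;
        Message heldByB = shared;

        // Holder A is done and returns the instance; it cannot know
        // that holder B is still using it.
        pool.Return(heldByA);

        // Unrelated code rents from the pool and overwrites the state.
        Message reused = pool.Get();
        reused.Text = "order #2";

        // Holder B now observes data it never wrote.
        return (heldByB.Text, ReferenceEquals(reused, heldByB));
    }
}
```

Because nothing tracks how many holders still reference `shared`, the pool has no way to reject the early Return(); this is the multi-reference case described above.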

Therefore, I'd like to propose a low-level solution that replaces Microsoft.Extensions.ObjectPool at the GC level, because I think GC-level object reusability would be a groundbreaking improvement to .NET memory management. The idea is simple: synchronize the pooling lifecycle with the object lifecycle already managed by the GC, and let the GC do the dirty work. ;-)

API Proposal

namespace System;

public static partial class GC
{
    // Registers a type to be pooled from now on, and allows the GC to keep (rather than reclaim)
    // up to a specified number of instances of this type.
    // The implementation may borrow from Microsoft.Extensions.ObjectPool, but at a lower level, so that
    // the programmer doesn't have to call Return().
    // Here IPooledObjectPolicy<T> is equivalent to Microsoft.Extensions.ObjectPool.IPooledObjectPolicy<T>.
    // The policy is optional so that the parameterless call GC.Poolize<MyType>() is valid.
    public static void Poolize<T>(IPooledObjectPolicy<T>? poolPolicy = null);

    // Retrieves a completely reinitialized instance of the specified type T from the framework-level object pool.
    // The destructor (finalizer) and the constructor should be called, in that order, on the returned instance to reinitialize it.
    public static T Renew<T>();

    // Forcibly puts an instance into the framework-level object pool; this is usually not necessary.
    public static T Pool<T>(T instance);
}

API Usage


// Poolize a type
GC.Poolize<MyType>();

// Renew an object (I'd strongly recommend that C# introduce a new keyword "renew" to facilitate this operation, as detailed in the last part).
// The renewed object should be in its default state, exactly like the result of new().
MyType obj = GC.Renew<MyType>();          // a future version of C# might introduce a new keyword, e.g. MyType obj = renew();
...

// If necessary, forcibly and explicitly pool obj before the GC processes it, to shorten the lifecycle of the instance;
// this causes the instance to be reinitialized to its default state.
GC.Pool(obj);

Alternative Designs

I strongly recommend that C# introduce a "renew" keyword that covers the whole process of the three methods above.


// This syntax may register the type with GC.Pool and renew an instance.
MyType obj = renew MyType();

// This syntax explicitly pools an instance.
renew(obj);

// This syntax replaces obj with a renewed instance while pooling the old instance.
obj = renew(obj);

The "renew" mechanism would be a major change to C# memory management if implemented at the lowest level, where objects would be reinitialized by the lowest-level means possible in order to achieve the best performance.

In addition, GC.DePoolize() might be introduced to unregister a type.

Risks

A dramatic change to the GC mechanism.

ghost commented 7 months ago

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Issue Details
Author: xiaoyuvax
Assignees: -
Labels: `api-suggestion`, `area-GC-coreclr`
Milestone: -

jkotas commented 7 months ago

I think we would welcome prototypes to prove that it is possible to improve overall runtime performance for real world scenarios by introducing APIs like this implemented at GC level. It is unlikely that implementing the proposed API at GC level can improve overall GC and runtime performance more than a fine-tuned standalone object pool.

xiaoyuvax commented 7 months ago

@jkotas So we need an expert on this, to build object/memory reuse into the foundation of .NET memory management rather than a design pattern. There may be no dramatic performance improvement, but lower memory usage and fewer allocations are still desirable, especially with NativeAOT. What if the GC were no longer needed? Would it be possible for object reuse to replace the GC when compiling to native code? ...

I recently found out and tested that calling Return() in the destructor (finalizer) can synchronize the pooling cycle with the GC-managed object lifecycle, which solves the lifecycle management problem to some extent (I don't know yet whether reusing an object resurrected from its destructor would be problematic). However, I still have to write the destructor code and the ObjectPool declarations for each type, one by one, which is repetitive and boring. I wonder why this Return() operation isn't simply built in for every type (or at least for explicitly registered types)?
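
A minimal sketch of the finalizer-based return described above, under stated assumptions: `SelfReturningBuffer` is a hypothetical type, and a static `ConcurrentQueue<T>` stands in for the pool (the thread itself uses Microsoft.Extensions.ObjectPool). The finalizer resurrects the instance by storing it in the pool and calls `GC.ReRegisterForFinalize` so that the finalizer runs again the next time the instance becomes unreachable:

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical pooled type that returns itself to a pool from its finalizer.
public sealed class SelfReturningBuffer
{
    // Hypothetical process-wide pool; a static root keeps pooled instances alive.
    private static readonly ConcurrentQueue<SelfReturningBuffer> Pool =
        new ConcurrentQueue<SelfReturningBuffer>();

    public byte[] Data { get; } = new byte[1024];
    public int Length { get; set; }

    private SelfReturningBuffer() { }

    // Number of instances currently waiting in the pool (for observation only).
    public static int PooledCount => Pool.Count;

    // Reuses a resurrected instance if one is available, otherwise allocates.
    public static SelfReturningBuffer Rent()
        => Pool.TryDequeue(out var item) ? item : new SelfReturningBuffer();

    ~SelfReturningBuffer()
    {
        // Do not resurrect objects while the process is shutting down.
        if (Environment.HasShutdownStarted) return;

        // Cheap logical reset; the contents of Data are left as they are.
        Length = 0;

        // A finalizer runs only once unless the object is registered again.
        GC.ReRegisterForFinalize(this);

        // Storing 'this' in a reachable location resurrects the object.
        Pool.Enqueue(this);
    }
}
```

Callers simply drop the reference instead of calling Return(). Two costs are worth keeping in mind when evaluating this pattern: finalizable objects are more expensive to allocate, and an unreachable finalizable object is not reclaimed by the collection that discovers it, so it survives at least until its finalizer has run and only becomes reusable after that point, at a time the caller does not control.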

But it does not solve the reset performance problem, and neither does the recently introduced IResettable interface (one can actually write the resetting code in the destructor itself). So far, resetting an object to its initial state is costly, both for the machine and for the human (fine-tuning a multitude of types is costly), while properly overwriting existing properties/fields with new business values is complicated and error-prone too. My proposal requests a low-level "renew" implementation in the runtime, preferably through unsafe code, e.g. wiping the memory buffer of an object directly, along with a fast lookup of the object pools for different types, which might be equivalent to a dictionary.

However, according to my experiment and benchmark, a generic object pool (ConcurrentDictionary<string, ObjectPool>) that serves all known/unknown types is a performance disaster, because it takes a lot of time to look up the pool and to cast the instances to the requested types.

GenericObjectPool performs much worse than DefaultPoolGetObject and new(), but it is quite attractive for reducing allocations and the complexity of planning various ObjectPools.

| Method | N | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|---|---|
| DefaultPoolGetObject | 1000 | 4.065 ms | 0.0365 ms | 0.0341 ms | 1.00 | 0.01 | 867.1875 | 2.59 MB | 0.57 |
| GenericPoolGetObject | 1000 | 61.546 ms | 0.5951 ms | 0.5566 ms | 15.07 | 0.23 | 777.7778 | 2.59 MB | 0.57 |
| NewObject | 1000 | 4.084 ms | 0.0432 ms | 0.0404 ms | 1.00 | 0.00 | 1523.4375 | 4.58 MB | 1.00 |
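
For comparison, here is a sketch of the two lookup strategies. `DictionaryPools` approximates the dictionary-keyed generic pool described above, and `TypedPools<T>` is one possible alternative that relies on the runtime keeping a separate set of static fields for each closed generic type, so no dictionary lookup or cast is needed. Both class names are hypothetical, a `ConcurrentQueue<T>` stands in for ObjectPool, and neither variant was part of the benchmark above, so no performance figures are claimed for them:

```csharp
using System;
using System.Collections.Concurrent;

// Approximation of the approach described above: one dictionary keyed by
// type name, holding untyped pools. Every Get() pays for a lookup and a cast.
public static class DictionaryPools
{
    private static readonly ConcurrentDictionary<string, ConcurrentQueue<object>> Pools =
        new ConcurrentDictionary<string, ConcurrentQueue<object>>();

    private static ConcurrentQueue<object> PoolFor(Type type)
        => Pools.GetOrAdd(type.FullName ?? type.Name, _ => new ConcurrentQueue<object>());

    public static T Get<T>() where T : class, new()
        => PoolFor(typeof(T)).TryDequeue(out var item) ? (T)item : new T();

    public static void Return<T>(T item) where T : class
        => PoolFor(typeof(T)).Enqueue(item);
}

// Alternative sketch: a static generic class. Each closed type, such as
// TypedPools<StringBuilder>, has its own Items field, so the "lookup" is
// resolved by the type system and the items are already strongly typed.
public static class TypedPools<T> where T : class, new()
{
    private static readonly ConcurrentQueue<T> Items = new ConcurrentQueue<T>();

    public static T Get() => Items.TryDequeue(out var item) ? item : new T();

    public static void Return(T item) => Items.Enqueue(item);
}
```

Usage is `TypedPools<MyType>.Get()` and `TypedPools<MyType>.Return(obj)`. The trade-off is that the pooled type must be known at compile time, whereas the dictionary version can also be adapted to types that are only known at run time.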

jkotas commented 7 months ago

we need an expert on this, to build object/memory reuse into the foundation of .NET memory management rather than a design pattern

Experts looked at this problem before and did not see any promising solutions.

xiaoyuvax commented 7 months ago

lololol, should i close this thread quietly... :`