SubPointSolutions / spmeta2

SharePoint artifact provision for .NET platform. Supports SharePoint Online, SharePoint 2019, 2016 and 2013 via CSOM/SSOM.
http://subpointsolutions.com/spmeta2
133 stars 56 forks source link

Possible high memory usage by ReflectionUtils #1101

Closed netarrow closed 6 years ago

netarrow commented 6 years ago

Hello,

I think could be a edge situation, but could be very memory consuming so I post this issue to understand if missing something or what could be done.

Brief description

Imagine a usage of Spmeta2 where a Intranet is generated starting from some data in excel, about 1000/2000 record that will produce for each row a document library, folders and subs folder (all with specific permission) and an item in a list.

Everything works fine and enought fast, about 5 minutes for record. But after few hours the total used RAM is 5Gb.

Doing some memory profiling no leak or typical memory issue is find out (using dotMemory) but the most expensive object in memory is:

Dictionary<object, Delegate>

memorycheck detail1 detail2 staticreference

I see it used in this class:

https://github.com/SubPointSolutions/spmeta2/blob/f54e29be0724aecc3ee90fbd7c58fdccc4f8eb82/SPMeta2/SPMeta2/Utils/ReflectionUtils.cs#L127

To cache the Compiled version of some lambda expression inside this method

https://github.com/SubPointSolutions/spmeta2/blob/f54e29be0724aecc3ee90fbd7c58fdccc4f8eb82/SPMeta2/SPMeta2/Utils/ReflectionUtils.cs#L135

Following the code I think that the critical path which pump up the memory is every Definition class which contains a ToString like that:

https://github.com/SubPointSolutions/spmeta2/blob/f54e29be0724aecc3ee90fbd7c58fdccc4f8eb82/SPMeta2/SPMeta2/Definitions/FolderDefinition.cs#L65

Which calls https://github.com/SubPointSolutions/spmeta2/blob/f54e29be0724aecc3ee90fbd7c58fdccc4f8eb82/SPMeta2/SPMeta2/Utils/ToStringResult.cs#L64

Which calls the GetExpressionValue with the cache used inside.

When a logger is used (Nlog in my case) which trace all the deploy activity, the ToString method on different Definitions is widely called, on each Node and ChidNode for example inside the the event OnModelNodeProcessed. After some time in situation with a lot of nodes the _cache4GetExpressionValueAsMethodCallExpression and _cache4GetExpressionValueAsUnaryExpression dictionary inside ReflectionUtils.cs could become huge.

Not sure on the analysis and not sure on a possible solution, at the moment I reduced the ToString calls on Definitions objects in my log reducing the details of the tracing and seems to be better, but maybe some others ToString calls happens since the memory still grow on Dictionary<object, Delegate>.

An idea maybe could be a threshold that limit the sizing of the dictionary. Or a flag that allow to disable the cache or a method to clean when necessary.

Thank you

SharePoint API

Which version of SharePoint runtime do you use? *O365

SPMeta2 API

SPMeta2.Diagnostic.DiagnosticInfo SPMeta2FileVersion:[1.2.17191.0958] SPMeta2FileLocation:[C:\Users\matt\Source\Workspaces\xxxxx\Prod\1.0_salesItaly\xxxxxxx\bin\x64\Debug\SPMeta2.dll] Convert(p.IsCSOMDetected):[False] Convert(p.IsSSOMDetected):[False]

avishnyakov commented 6 years ago

Looks legit and realistic.

Caching was introduced due to .ToString() override in definitions, which in turn, used nearly everywhere across the library. For example, all logging calls definition.ToString(), that's how we saw initial performance degradation while fetching properties.

The fast fix is to rewrite all definition.ToString() methods to avoid using reflection, the other way around is to optimize reflection calls and leave ToString() impl across all definition the way they are.

That's a tough call to make right now, more time is needed to dig into all details. Ultimately, this needs to be solved. Thank you @netarrow for raising this up and going into all details, stellar work and thinking behind.

avishnyakov commented 6 years ago

Frankly saying, I would rather pragmatically rewrite all definition.ToString() call using plain string, composing a plain string with properties. It's just hundred classes to refactor, not a big deal. Within giving constraints, it might be a lesser evil solution. We'll run a few more test trying to kill everything, simulating at least hundreds of thousands and then make a final decision.

Also, @netarrow , please be aware that with the incremental provision, by default, SPMeta2 stores information about the provisioned model in a web property bag. It is serialized info, md5 hashes per every model node. Consequently, as far as we remember, there is a 4Kb limitation for the web property bag value which limits the amount of data which can be stored. Hence, a really-really huge model won't work, SPMeta2 would fail to save incremental provision data due to this 4Kb limitation.

It might be a good idea to split huge models into smaller. Generally, it is always a good idea - modularization and smaller models work much better than anything big and monolithic.

Anyway, just a head up on your particular scenario and incremental provision.

SubPointSupport commented 6 years ago

Implemented initial refactoring.

All magic is housed here:

https://github.com/SubPointSolutions/spmeta2/blob/dev/SPMeta2/SPMeta2/Utils/ToStringResultRaw.cs

That should be sufficient to solve mentioned memory leaks and expansion of _cache4GetExpressionValueAsMethodCallExpression caused by reflection utils usage in .ToStirng(0 calls for definitions. Testing, looking into other codebase to see other potential improvements.

SubPointSupport commented 6 years ago

Initial testing didn't yield issues with the implemented refactoring. Closing this issue, running more tests before the major release. Current dev branch contains all changes.

@netarrow, really appreciate you took some time to dig into this and report. Hope this refactoring solves these memory issues and we won't have them anymore.