Closed netarrow closed 6 years ago
Looks legit and realistic.
Caching was introduced due to .ToString() override in definitions, which in turn, used nearly everywhere across the library. For example, all logging calls definition.ToString(), that's how we saw initial performance degradation while fetching properties.
The fast fix is to rewrite all definition.ToString() methods to avoid using reflection, the other way around is to optimize reflection calls and leave ToString() impl across all definition the way they are.
That's a tough call to make right now, more time is needed to dig into all details. Ultimately, this needs to be solved. Thank you @netarrow for raising this up and going into all details, stellar work and thinking behind.
Frankly saying, I would rather pragmatically rewrite all definition.ToString() call using plain string, composing a plain string with properties. It's just hundred classes to refactor, not a big deal. Within giving constraints, it might be a lesser evil solution. We'll run a few more test trying to kill everything, simulating at least hundreds of thousands and then make a final decision.
Also, @netarrow , please be aware that with the incremental provision, by default, SPMeta2 stores information about the provisioned model in a web property bag. It is serialized info, md5 hashes per every model node. Consequently, as far as we remember, there is a 4Kb limitation for the web property bag value which limits the amount of data which can be stored. Hence, a really-really huge model won't work, SPMeta2 would fail to save incremental provision data due to this 4Kb limitation.
It might be a good idea to split huge models into smaller. Generally, it is always a good idea - modularization and smaller models work much better than anything big and monolithic.
Anyway, just a head up on your particular scenario and incremental provision.
Implemented initial refactoring.
All magic is housed here:
https://github.com/SubPointSolutions/spmeta2/blob/dev/SPMeta2/SPMeta2/Utils/ToStringResultRaw.cs
That should be sufficient to solve mentioned memory leaks and expansion of _cache4GetExpressionValueAsMethodCallExpression caused by reflection utils usage in .ToStirng(0 calls for definitions. Testing, looking into other codebase to see other potential improvements.
Initial testing didn't yield issues with the implemented refactoring. Closing this issue, running more tests before the major release. Current dev branch contains all changes.
@netarrow, really appreciate you took some time to dig into this and report. Hope this refactoring solves these memory issues and we won't have them anymore.
Hello,
I think could be a edge situation, but could be very memory consuming so I post this issue to understand if missing something or what could be done.
Brief description
Imagine a usage of Spmeta2 where a Intranet is generated starting from some data in excel, about 1000/2000 record that will produce for each row a document library, folders and subs folder (all with specific permission) and an item in a list.
Everything works fine and enought fast, about 5 minutes for record. But after few hours the total used RAM is 5Gb.
Doing some memory profiling no leak or typical memory issue is find out (using dotMemory) but the most expensive object in memory is:
Dictionary<object, Delegate>
I see it used in this class:
https://github.com/SubPointSolutions/spmeta2/blob/f54e29be0724aecc3ee90fbd7c58fdccc4f8eb82/SPMeta2/SPMeta2/Utils/ReflectionUtils.cs#L127
To cache the Compiled version of some lambda expression inside this method
https://github.com/SubPointSolutions/spmeta2/blob/f54e29be0724aecc3ee90fbd7c58fdccc4f8eb82/SPMeta2/SPMeta2/Utils/ReflectionUtils.cs#L135
Following the code I think that the critical path which pump up the memory is every Definition class which contains a ToString like that:
https://github.com/SubPointSolutions/spmeta2/blob/f54e29be0724aecc3ee90fbd7c58fdccc4f8eb82/SPMeta2/SPMeta2/Definitions/FolderDefinition.cs#L65
Which calls https://github.com/SubPointSolutions/spmeta2/blob/f54e29be0724aecc3ee90fbd7c58fdccc4f8eb82/SPMeta2/SPMeta2/Utils/ToStringResult.cs#L64
Which calls the GetExpressionValue with the cache used inside.
When a logger is used (Nlog in my case) which trace all the deploy activity, the ToString method on different Definitions is widely called, on each Node and ChidNode for example inside the the event OnModelNodeProcessed. After some time in situation with a lot of nodes the _cache4GetExpressionValueAsMethodCallExpression and _cache4GetExpressionValueAsUnaryExpression dictionary inside ReflectionUtils.cs could become huge.
Not sure on the analysis and not sure on a possible solution, at the moment I reduced the ToString calls on Definitions objects in my log reducing the details of the tracing and seems to be better, but maybe some others ToString calls happens since the memory still grow on Dictionary<object, Delegate>.
An idea maybe could be a threshold that limit the sizing of the dictionary. Or a flag that allow to disable the cache or a method to clean when necessary.
Thank you
SharePoint API
Which version of SharePoint runtime do you use? *O365
SPMeta2 API
SPMeta2.Diagnostic.DiagnosticInfo SPMeta2FileVersion:[1.2.17191.0958] SPMeta2FileLocation:[C:\Users\matt\Source\Workspaces\xxxxx\Prod\1.0_salesItaly\xxxxxxx\bin\x64\Debug\SPMeta2.dll] Convert(p.IsCSOMDetected):[False] Convert(p.IsSSOMDetected):[False]