umco / umbraco-ditto

Ditto - the friendly view-model mapper for Umbraco
http://our.umbraco.org/projects/developer-tools/ditto
MIT License
79 stars 33 forks source link

Excessive memory usage #248

Closed leethomascook closed 5 years ago

leethomascook commented 5 years ago

Hi,

We have been using Ditto extensively on a large Umbraco 7.15 production site. Within about 12 hours of a reboot, we get upto 28 Million objects being retained in the DittoTypeInfo ConcurrentDictionary, using 1.25 GB RAM, which feels, excessive.

image

Honestly, I don't know if this is down to incorrect usage on our part or a bug. We ceartinly don;t have anywhere near 28 million property types.

If I drill into the memory dump a lot of the space used seemed to reference 'XmlElement' types, which would posisbly indicate it is keeping hold of some of the xml cache, somehow.

image

I'm happy to share the full 8GB memory dump ( or a 2GB dump when we took a snaphot earlier) privatley, if useful and do a screenshare so I can show the logic, but i can't share a copy of the repo unfortunatley.

Umbraco Version: 7.15 Ditto Version: 0.12.2 Number of nodes: 98,186 (obtained by running SELECT count(*) FROM [dbo].[umbracoNode] ) Number of models: Between 100 - 200, so not excessive.

Things that may be worth metnioning for context.

1) We have quite a few custom ditto attributes, some of which access UmbracoContext.Current - don't know if that is relevant. 2) We did have Ditto.LazyLoadStrategy = LazyLoad.AllVirtuals; set, but disabling it made no difference 3) We have 10 different language variations of the site, which have their own root node in the same umbraco instance. 4) We are using Darren F's Azure search Integration 5) We don't have any processors. 6) We are running on azure app service ( not umbraco cloud)

Example of our 'base model' that gets inheirted a lot.

[DittoLazy]
public class BaseUmbracoItem : IUmbracoItem
{
    //Field that fill in in cms
    public virtual DateTime PublishedDate { get; set; }
    //Field that fill in in cms
    public virtual DateTime UpdateDate { get; set; }

    public DateTime PublishDate
    {
        get
        {
            if (PublishedDate != DateTime.MinValue)
            {
                return PublishedDate;
            }

            return Created;
        }
    }

    public DateTime LastUpdated
    {
        get
        {
            if (UpdateDate != DateTime.MinValue)
                return UpdateDate;

            if (PublishedDate != DateTime.MinValue)
                return PublishedDate;

            return Created;
        }
    }

    [UmbracoProperty("CreateDate")]
    public virtual DateTime Created { get; set; }
    public virtual string Name { get; set; }
    public virtual int SortOrder { get; set; }

    ////Should really be 'TemplateId' but lots of const string hide this property.  Could be refactored.
    [UmbracoProperty("TemplateId")]
    public virtual string TemplateId { get; set; }

    [UmbracoContentTypeAlias]
    public virtual string ContentTypeAlias { get; set; }

    [UmbracoProperty("Path")]
    public virtual string Path { get; set; }
    [DittoLazy]
    [UmbracoParent]
    public virtual BaseUmbracoItem Parent { get; set; }
    public virtual int Id { get; set; }
    public virtual string IdAsString => Id.ToString();

    public virtual bool ShowInNavigation { get; set; }
    public virtual bool IsPriority { get; set; }

    public virtual bool IsCacheBusted { get; set; }

    [UmbracoUrl]
    public virtual string Url { get; set; }
    public virtual string Destination { get; set; }

  //  [DittoLazy]
    [NormalImageOrImageCropper]
    public virtual Image Image { get; set; }

  //  [DittoLazy]
    [NormalImageOrImageCropper]
    public virtual Image Thumbnail { get; set; }

    public virtual string Summary { get; set; }
    public virtual string Title { get; set; }
    public virtual string BrowserTitle { get; set; }

    public virtual string NavigationTitle { get; set; }
    public virtual bool DoNotIndex { get; set; }
    public virtual bool ExcludeFromSitemap { get; set; }

    [UmbracoProperty("CreatorId")]
    public virtual int CreatorId { get; set; }

    public string DisplayTitle
    {
        get
        {
            if (!string.IsNullOrEmpty(NavigationTitle))
            {
                return NavigationTitle;
            }
            return Title;
        }
    }

    [UmbracoTagList]
    public virtual IEnumerable<TagModel> Tags { get; set; }

    public IEnumerable<PickerBaseUmbracoItem> RelatedContent { get; set; }

    public virtual string MetaDescription { get; set; }
    public virtual bool IsRestricted { get; set; }
    public virtual bool IsTargetedAtChildren { get; set; }
    public virtual bool SupportsAmp { get; set; }
    public virtual bool PageRequiresLogin { get; set; }

    public virtual PickerBaseUmbracoItem RelatedSyndicationFeed { get; set; }

    //Leave this here as used by QPR
    [UmbracoChildren]
    public virtual IEnumerable<BaseUmbracoItem> Children { get; set; }

    [DittoLazy]
    public virtual IEnumerable<PickerBaseUmbracoItem> LanguageVersions { get; set; }

    [UmbracoProperty("key")]
    public virtual Guid? Key { get; set; }

    public Udi Udi => Key == null ? null : Udi.Create("document", Key.Value);
}

Example of an Attribute that gets used a lot:

public class UmbracoChildrenAttribute : DittoProcessorAttribute
{
    private string _xPath;

    public UmbracoChildrenAttribute(string xPath = null)
    {
        _xPath = xPath;
    }

    private void SetXPath(UmbracoContext umbracoContext)
    {
        var profiler = MiniProfiler.Current;
        using (profiler.Step("UmbracoChildrenAttribute SetXPath"))
        {
            if (_xPath == null)
            {
                return;
            }

            string[] _currentItemIdentifiers = {".", "$current"};
            string[] _currentSiteIdentifiers = {"$site"};
            string[] _currentPageIdentifiers = {"$currentPage"};

            foreach (var currentItemIdentifier in _currentItemIdentifiers)
            {
                if (_xPath.StartsWith(currentItemIdentifier))
                    _xPath = $"id({Context.Content.Id})" +
                             _xPath.Remove(0, currentItemIdentifier.Length).EnsureStartsWith("/");
            }

            foreach (var currentPageIdentifier in _currentPageIdentifiers)
            {
                if (_xPath.StartsWith(currentPageIdentifier))
                    _xPath = $"id({umbracoContext.PageId})" +
                             _xPath.Remove(0, currentPageIdentifier.Length).EnsureStartsWith("/");
            }

            foreach (var currentSiteIdentifier in _currentSiteIdentifiers)
            {
                if (_xPath.StartsWith(currentSiteIdentifier))
                    _xPath = $"id({Context.Content.Id})/ancestor-or-self::*[@level = 1]" +
                             _xPath.Remove(0, currentSiteIdentifier.Length).EnsureStartsWith("/");
            }
        }
    }

    public override object ProcessValue()
    {
        var profiler = MiniProfiler.Current;
        using (profiler.Step($"UmbracoChildrenAttribute ProcessValue {Context.Content.Name} {Context.Content.Id}"))
        {
            var umbracoContext = UmbracoContext.Current;

            SetXPath(umbracoContext);

            if (!Context.PropertyInfo.PropertyType.IsEnumerableType())
            {
                throw new InvalidOperationException("Can only use the UmbracoChildren attribute on lists");
            }

            var genericTypeDefinition = Context.PropertyInfo.PropertyType.GetGenericArguments()
                .FirstOrDefault();
            var listType = typeof(List<>);
            var genericListType = listType.MakeGenericType(genericTypeDefinition);
            var addMethod = genericListType.GetMethod("Add");
            var list = Activator.CreateInstance(genericListType);

            IEnumerable<IPublishedContent> children = Context.Content.Children;
            if (_xPath != null)
            {
                children = umbracoContext.ContentCache.GetByXPath(_xPath);
            }

            foreach (var publishedContent in children)
            {
                var childItem = publishedContent.As(genericTypeDefinition);

                if (childItem != null)
                {
                    addMethod.Invoke(list, new[] {childItem});
                }
            }

            return Convert.ChangeType(list, genericListType);
        }
    }
}
leekelleher commented 5 years ago

Thank you for all the info @leethomascook, it's very much appreciated.

I agree, I had the same initial reaction, looks very very excessive!

The drilldown report is very useful - thanks. I'm curious about the number of XmlElement objects too, as Ditto shouldn't be storing those, (obviously it is, but it's not intending to).

Here's the DittoTypePropertyInfo class... https://github.com/umco/umbraco-ditto/blob/0.12.2/src/Our.Umbraco.Ditto/Common/DittoTypeInfo.cs#L236-L248

It was designed to store the minimal amount of meta-data about the property, e.g. the PropertyInfo and any custom Attributes, as well as whether it's an enumerable or not. I can't (currently) see where the XmlElement objects would come into play (at that point).


What tool are you using to get the memory dump chart/report? I'd like to run similar reports on one of my Ditto sites. (Not that I use Ditto much anymore, generally opted to use ModelsBuilder these days).

leethomascook commented 5 years ago

I got the memory dump form azure app service 'Diagnose and solve problems' menu, and then I opened it up in JetBrains dotMemory - note - took several hours to open the 8GB file.

Yeah, it's a bit odd. Thanks for the class reference, I think I will downlaod ditto source and debug it to see what is allocating.

leekelleher commented 5 years ago

@leethomascook I've been looking into this. The issue is with any allocations that are made within a Processor's ProcessValue() method are lingering around in memory.

(In hindsight, my design of DittoTypeInfo and DittoTypePropertyInfo and class object references was an oversight.)

In previous versions of Ditto (v0.11.3 and earlier), the reflection work (to get the Processor attributes) was done at runtime, while this used less memory, the CPU usage was much much higher. Which is why I refactored it (and introduced) the DittoTypeInfo lookup - but obviously has a higher memory footprint. Bah! CPU vs RAM!


Are you able to reproduce the excessive memory usage in a non-production environment? How much do you know about .NET memory profiling and/or high-performance apps and/or Garbage Collection (GC)? If so, would you be willing to help me figure out a solution?

The guts of the issue is that the DittoTypePropertyInfo stores object instances its Processors property...

https://github.com/umco/umbraco-ditto/blob/develop/src/Our.Umbraco.Ditto/Common/DittoTypeInfo.cs#L243

These are looped over during mapping (deeper down in the .As<T>() call)...

https://github.com/umco/umbraco-ditto/blob/develop/src/Our.Umbraco.Ditto/Extensions/PublishedContentExtensions.cs#L316

At this point, the Value (and some context properties) is assigned to...

https://github.com/umco/umbraco-ditto/blob/develop/src/Our.Umbraco.Ditto/ComponentModel/Attributes/DittoProcessorAttribute.cs#L194-L196

Which is where I believe those object instances (e.g. the Value and anything allocated within ProcessValue()) are being retained in memory.

I've tried to null assign to the Value after ProcessValue() has been ran, but I'm not convinced that is enough to trigger the GC to remove the object from memory.

That's about as far as I've got with it. I'm open to help.

leethomascook commented 5 years ago

Hi @leekelleher Thanks for taking a look at this. I've been doing some digging as well and I can recreate it locally on my machine. By stripping it back to just one macro on a page with a simple controller and one model that has a few complex types.

Debugging memory issues is always painfull. By doing some snapshots, I can see that we were allocating about 50mb per request, which is a bit nuts.

image

You can guess where the requests were..

I'm happy to help - i've already spent a few days looking at it / banging by head against a wall. Will take your pointers on board and have another crack.

I think MOST of the stuff does get GC eventually, but I think some stuff is left around.

I'd also be happy to re-imburse you financially for your time looking at the issue . I've used a lot of your open source stuff in the past so happy to contract you for any time spent fixing the issue, should you have the capacity.

Drop me an email at lee at stadion .io and maybe we can work something out?

Thanks,

Lee

leethomascook commented 5 years ago

Having said that ' it think it gets mostly GC' - looking back at some profiles, that is not the case.

When you take a snapshot with DotMemory, i think it does a GC, so only things that are left, are things it can't release and 99% of the exmaples i see are the dittoTypeCache -> processors that reference back to Xml Nodes via that value field.

So if you have that version (where you null out the value) in a branch i can compile or a dll I can drop in on my machine, happy to try it out and test the theory.

leekelleher commented 5 years ago

Hi @leethomascook. Thanks for the latest info.

re: latest version - I've pushed up my current WIP to a branch called dev/excessive-memory-fml.

It also includes the dev/remove-disposable-timer branch changes - this was something I'd done a while back, found that MiniProfiler logging was a performance bottleneck, regardless of whether you had ?umbDebug=true enabled or not.

leekelleher commented 5 years ago

@leethomascook I've added few more commits to the dev/excessive-memory-fml branch. Hopefully this should reduce the memory footprint.

I've been testing locally, with dotMemory, and whilst my overall memory usage for my local site didn't really reduce much, (it's generally around 500Mb), the footprint and retained objects for DittoTypeInfo did decrease. I'm hoping it will have a more significant impact for you. 🤞

leethomascook commented 5 years ago

Great, thanks Lee - will give it a try later this evening.

leethomascook commented 5 years ago

Hi @leekelleher Apologies for the slow reply. I had to wait a few weeks before this got shipped ot production.

This was not the only change that was deployed, so it's not scientific, but the difference is quite startling. Memory usage gone from between 8-10GB to 3.7GB and seems a lot more stable.

Have a guess when we deployed...

image

So, I would say this is resolved.

Thanks for all your help!

Note: obviously 3.7 GB is still huge, but it's a very large site with 10 languages and lots of output caching etc.