SixLabors / ImageSharp

:camera: A modern, cross-platform, 2D Graphics library for .NET
https://sixlabors.com/products/imagesharp/
Other
7.48k stars 851 forks source link

Implement pluggable memory management #225

Closed antonfirsov closed 6 years ago

antonfirsov commented 7 years ago

Problem

Solution

Implement a pluggable MemoryManager, make the default pooling memory manager configurable.

Update: feature/memory-manager is the WIP branch for this refactor. It's having #431 merged. If anyone wishes to contribute, please file your PR-s against that branch. The tasks below are being updated according to the progress on that branch.

Tasks for beta-3

Probably 1.0

Most likely post- 1.0:

vpenades commented 6 years ago

Elavorating what I've posted here , I think it's not that bad an idea, I'ts a huge breaking change, but I will also be future proof:

So instead of doing:

var image1 = new Image<Rgba32>(ArrayPoolMemoryManager.Default, 512,512);

we woud do:

var image1 = ArrayPoolMemoryManager.Default.CreateImage<Rgba32>(512,512);
var image2 = ArrayPoolMemoryManager.Default.LoadImage("test.png");

Which is much less ugly.

Advantages:

In the future, some MemoryManager might need to work with a special purpose kind of image. So having a factory is the perfect place to transparently extend the API, scenarios I can imagine:

internal class FileMappedImage<TPixel> : Image<TPixel>
{
    //...
}

public class FileMappedMemoryManager : MemoryManager
{
    public override Image<TPixel> Create<TPixel>(int width, int height)
    {
        return new FileMappedImage<TPixel>(width,height);
    }
}
tocsoft commented 6 years ago

No don't like it. Thats why we have the option of passing a custom configuration into load/constructors etc.

If you want to load a custom set of memory for the lifetime of a single image then you would create a custom configuration with a new specific memory manager for that image and then free up any resources yourself when you release the image.

vpenades commented 6 years ago

@tocsoft my feeling is that the default "easy to use" API favours a certain use case of ImageSharp, which is running continuously as a web service, and makes it utterly complicated for all other use cases. It might happen that the current, default use case is exactly what you need... but it's miles away from what I need, and what I am proposing is something that I think is neutral for all users.

antonfirsov commented 6 years ago

@vpenades it's definitely not worth for us to introduce breaking changes of this manner for 1.0. There's nothing you won't be able to solve by using a custom Configuration instead of the default one. (edit: I mean one or more configuration instances.)

"Temporal pools", releasing memory, isolation of multiple different memory managers in a single process - all possible. Gonna add docs and samples for these scenarios.

JimBobSquarePants commented 6 years ago
var image1 = ArrayPoolMemoryManager.Default.CreateImage<Rgba32>(512,512);

This API is semanticly troublesome and unintuitive. If someone came to the library as a first time user I would be incredibly surprised if they logically surmised loading images in this way.

It's a graphics library, and, as such, the semantic focus has to centre around the Image<T> class.

my feeling is that the default "easy to use" API favours a certain use case of ImageSharp, which is running continuously as a web service, and makes it utterly complicated for all other use cases.

I strongly disagree here. Your proposed API gives us no advantage, that I can fathom, over passing a specific memory manager to an image constructor since any factory based logic could be internally derived.

vpenades commented 6 years ago

Okey then, I'll give it a try

0xGeorgii commented 6 years ago

Hi all, I faced with this memory consuming issue (you can see details here). If you need some help with the implementation I can grab some.

JimBobSquarePants commented 6 years ago

Hi @GeorgePlotnikov ,

@antonfirsov is point on this issue and is planning on working on it in the latter half of this month, I'm sure he'll welcome any use-case information you can provide. šŸ‘

antonfirsov commented 6 years ago

@GeorgePlotnikov even answering a few questions would be very helpful:

If you need some help with the implementation I can grab some.

I want to finish this work before the end of the next week. If you really have some time to join in now, let me know, and I gonna break down the remaining work into smaller tasks. The more help I get from the community, the more time remains for me to work on #411 (documentation)! šŸ˜„

antonfirsov commented 6 years ago

For everyone's invormation:

The work started with #431 is now being continued on the feature/memory-manager branch.

This is the API I'm targeting currently:

interface IBuffer<T> : IDisposable 
{
    Span<T> Span { get; }
    // No T[] is exposed!
}

interface IManagedByteBuffer : IBuffer<byte>
{
    // Exposing an array to work with .NET API-s consuming byte[]
    byte[] Array { get; }
}

// I suggest using an abstract class instead of an interface,
// because I don't want to make the methods and IBuffer<T> public at this point!
public abstract class MemoryManager
{
    // Let's keep the interface as narrow as possible! 
    // There will be shortcuts like AllocateClean() through extension methods.
    internal abstract IBuffer<T> Allocate<T>(int size, bool clear)
        where T : struct;

    // Mostly used by decoders working with streams.
    internal abstract IManagedByteBuffer AllocateManagedByteBuffer(int size);

    internal abstract void Release<T>(Buffer<T> buffer)
        where T : struct;

    public abstract void ReleaseAllRetainedResources();
}

@tocsoft I suggest adding MemoryUsageHint later, when we are more familiar with specific solutions like unmanaged memory and MMF.

vpenades commented 6 years ago

Some considerations:

What about the newly introduced Memory<T> as a wrapper of Span<T> , wouldn't IBuffer<T> and IManagedByteBuffer be a redundancy of Memory<T>??

Memory<T> article here.

What about mixing memory managers? As I understand from past conversations, if you want to create an image with a custom manager, it needs to be passed as an argument like this:

new Image(FancyMemoryManager, 512, 512);

So, under the same application context, it will be possible to create/load images using different memory managers.... and my concern is, what happens when you mix resources with different managers in the same pot.

I ask, because I've seen there's some "SwapBuffers" being used in some processors, so if the SwapBuffers swap memory created with distinct memory managers, it can lead to trying to release a buffer with the wrong manager.

Also, some processors seem to require extra memory, like creating temporary images, etc... which is going to be the policy here? will they use the memory manager associated with the "current image", or they will need a memory manager to be passed to them too?

antonfirsov commented 6 years ago

@vpenades

Memory as a wrapper of Span , wouldn't IBuffer and IManagedByteBuffer be a redundancy of Memory??

Memory<T> is not a wrapper of Span<T>. They are technically different views of the same thing. IBuffer<T> is the counterpart of OwnedMemory. I will consider switching to OwnedMemory as soon as it's on NuGet + I see more use cases & docs to understand it's reference counting logic better.

What about mixing memory managers?

I think it should be a goal to allow this, ~but not in the current iteration~. UPDATE: I was thinking of composite "mixed" memory managers. We should definitely investigate what happens if different memory managers are used, and interfere in the same process.

I ask, because I've seen there's some "SwapBuffers" being used in some processors, so if the SwapBuffers swap memory created with distinct memory managers, it can lead to trying to release a buffer with the wrong manager.

Good point, we need to check this behavior in processors, and define rules for buffer moving & ownership management. In some cases copying them might be inevitable.

will they use the memory manager associated with the "current image", or they will need a memory manager to be passed to them too?

Passing a MemoryManager to Mutate() and Clone() could be a solution.

There are lots of tasks and concerns. In the current iteration I would like to focus on the absolute minimum listed in "Tasks for Milestone 1.0". Even implementing this takes more time than I expected. It is a deep codebase refactor and brings tons of seemingly unrelated tasks like #469. We can add future points like Composite memory managers, or Memory<T> integration later in an iterative way through independent PR-s. As always, community contribution is welcome. If you feel some of those points are important, and urgent for your use case, feel free to grab them! šŸ˜„

vpenades commented 6 years ago

@antonfirsov

The worst case scenario I can foresee in the long run is this:

Let's say ImageSharp suceeds, and it leads to an explosion of ImageSharp based libraries; Obviously, it's up to how sloppy the developers of these ImageSharp derived libraries are, but you might end having derived libraries with these behaviours:

And at some point, a naive developer happily importing and mixing these derived libraries from nuget into a big application.

The issue about mixing managers is not only about how to do it, but also if it is allowed to be done.

From the point of view of a large application consuming ImageSharp derived libraries from different sources, it should be convenient to set an application wide memory manager... but if the API allows creating derived libraries that use a custom manager internally, overriding the application wide manager.... I think you can see my point.

In the end, I don't know how to solve this... we have developers with wildly different memory management needs, so some sort of configuration mechanism is required... but letting configure the memory management can lead to interoperability problems between derived libraries, which is a very bad thing in the long run.

antonfirsov commented 6 years ago

The situation is not that bad. I guess there is a solution evolving on my branch:

Let me know if I'm wrong, but I think these rules are sufficient to keep buffer ownership logic safe, and prevent different MemoryManager instances from interfering with each other.

vpenades commented 6 years ago

@antonfirsov Another curve ball (sorry, doing lots of tinkering with images lately)

I'm developing a small 3D scene structure, basic geometry, transforms, materials and textures for in-memory manipulation.

The scene contains nodes, the nodes contain materials and the materials contain textures, the graph allows things like two different materials sharing the same texture, The idea is that everything works with managed memory for ease of use and simplicity.

Now, since ImageSharp's image is IDisposable, the only way to properly dispose the textures is to make the material class IDisposable, and the nodes, and all the graph down to the tree root. And since texture instances can be shared, I would have to implement reference couting... this is definitely not what I want.

I know removing the IDisposable from Image, or even adding an image type not requiring IDisposable is out of the question, so I was planning to use my own bitmap container (managed, and not requiring Dispose), and only when I need to do some ImageSharp operation, do something like this:

using(var image = new ImageSharp(texture.GetBytes(),texture.Width,Texture.Height))
{
    image.Mutate(dc => dc.Resize(256,256) );
    texture.SetNewContent(image.Width, image.Height, image.GetBytes());
}

Would this be possible with the planned memory manager?

antonfirsov commented 6 years ago

@vpenades the method you are looking for is Image.LoadPixelData<TPixel>(data, width, height)!

Later, it will be also possible to wrap an existing memory area to be able to temporarily manage it as an "Image". Dispose() will have no effect in the case if memory ownership is external. This feature requires the availability of Memory<T>, so we can have a clean & standard API for that.

vpenades commented 6 years ago

@antonfirsov Okey, I've tinkered a prototype of the managed image class I would use here.

0xGeorgii commented 6 years ago

Hello @antonfirsov, apologise for the delay in the reply.

  1. Currently, I'm using a beta
  2. Yes I stopped stress my application
  3. I think I don't want to consume more than 200 MB in single thread mode
antonfirsov commented 6 years ago

@GeorgePlotnikov thanks for the answers!

Can you try if it's any better for you with the latest nightlies? Would be very helpful information for future fine-tuning tips!

0xGeorgii commented 6 years ago

@antonfirsov, okay will do it and come back with a feedback.

0xGeorgii commented 6 years ago

@antonfirsov below the result

image the process was finished with

An unhandled exception of type 'SixLabors.ImageSharp.Formats.Jpeg.GolangPort.Components.Decoder.EOFException' occurred in System.Private.CoreLib.dll
Reached end of stream before proceeding EOI marker!
antonfirsov commented 6 years ago

Thanks!

That's still quite too much of memory being eaten up :(

The exception that ended your experiment might be a Jpeg bug ... or a totally corrupt "jpeg", that can't be loaded by any library. Can you share the file?

0xGeorgii commented 6 years ago

@antonfirsov sorry, but not. I fetch files randomly šŸ˜„

vpenades commented 6 years ago

@antonfirsov

Checking the implementation of Buffer<T> IDisposable and Finalizer, I've noticed the finalizer is not disposing everything it should.

The current finalizer implementation here only unpins the GCHandle, but it doesn't free the actual buffer as the Dispose method does.

I am aware there's probably a lof of changes with the upcoming memory managers, so I don't know if this detail is being tracked already.

My point is that an array built from a memory manager must be considered as an unmanaged resource, which means it is eligible to be released in the finalizer, so I guess it would look something like this:

~Buffer()
        {
            this.UnPin();

            if (this.isPoolingOwner)
            {
                PixelDataPool<T>.Return(this.Array);
            }
        }

The problem is that, as far as I know, the GC does not guarantee the order in which managed resources resources are freed, so, it might happen that this.Array has been already freed if it was created by the "NullMemoryManager".

For my use case scenarios, I was considering to rely more on the finalizer, and less on the Dispose(), but I can see a loophole here (that is, to rely on the finalizer to release resources):

With the current implementation, where the finalizer only does UnPin();

Now, let's fix the finalizer and add the release of the array:

Anyway, I think the finalizer behaviour needs to be clarified, and in any case, It must free any resources that the GC cannot free on its own.

The only way I can think to easily fix this is the MemoryManagers to have a property that tells if the resources needs to be returned back to the manager, so for the NullMemoryManager, the Buffer would set isPoolingOwner to false, preventing the finalizer to do "weird things" with a potentially disposed object.

antonfirsov commented 6 years ago

@vpenades It's not allowed to touch managed objects in a finalizer! It's only here to free up the unmanaged handles if the developer forgot to call Dispose().

A developer should not rely on finalizers, you should properly manage the disposal of your objects.

vpenades commented 6 years ago

@antonfirsov Indeed!, that's precisely my point!

antonfirsov commented 6 years ago

Sorry, misunderstood your comment. It's a bug in my code! Gonna be fixed.

antonfirsov commented 6 years ago

I'm almost done, a PR is coming soon. However .. I'm not yet happy. Based on the feedback in this + other threads, there is some strange behavior I haven't understood yet. šŸ‘» The memory keeps growing for users even when I supposed it should stop doing so.

We need to do a proper load testing investigation before releasing beta-3!

antonfirsov commented 6 years ago

I'm closing this epic, because it's 99% done. Individual issues like #650 should cover the rest.