microsoft / Win2D

Win2D is an easy-to-use Windows Runtime API for immediate mode 2D graphics rendering with GPU acceleration. It is available to C#, C++ and VB developers writing apps for the Universal Windows Platform (UWP). It utilizes the power of Direct2D, and integrates seamlessly with XAML and CoreWindow.
http://microsoft.github.io/Win2D

Guidance on Very Large Bitmaps #38

Closed · grokys closed this issue 9 years ago

grokys commented 9 years ago

I am writing an application that will have to handle very large bitmaps (the current largest is 16128x14960 pixels but we expect bigger).

Do you have any guidance on this? I'm assuming the correct way to handle such large images would be to use a downscaled image for viewing the whole thing and to switch to tiles when the user zooms in?

What would the best way be to go about loading such tiles?

In addition, the bitmaps will need to be editable. Will this even be possible? The current version of this application uses GDI and although it struggles, it can just about manage. I'm hoping it will be possible with Win2D...

grokys commented 9 years ago

After a little experimentation, it appears that CanvasBitmap.LoadAsync fails to load the large bitmap, throwing an ArgumentException: Value does not fall within the expected range.

This is strange because WPF manages to load it. From my cursory reading of your blog, could I guess that this is because it's trying to load the image directly into GPU memory?

grokys commented 9 years ago

Indeed the exception arises in CanvasDevice.cpp:476:

ThrowIfFailed(deviceContext->CreateBitmapFromWicBitmap(wicConverter, &bitmapProperties, &bitmap));

It seems that WIC can load the image, but CreateBitmapFromWicBitmap is failing because the image is too large.

I'm guessing that you don't expose an API to deal with WIC images before they're passed to Direct2D? If so, what would be my best option? Use SharpDX/WIC to load the image, split it into tiles there, and use CanvasBitmap.CreateFromBytes()?

Maybe you could expose an API to construct a CanvasBitmap from an IntPtr representing an IWICBitmapSource? That would at least save me having to marshal the tile data from unmanaged -> managed -> unmanaged.

shawnhar commented 9 years ago

As you have discovered, Win2D doesn't do anything to automatically virtualize images larger than the GPU can support. We do have an item on the backlog to think about whether we should do something more to help with this, but a) it's a long way down the list, in the maybe-future-hypothetical-idea section, and b) I'm not entirely convinced we could/should do anything magic here in any case. We want to keep Win2D efficient, which means reasonably close to the metal, so I'm wary of trying too hard to abstract away this kind of hardware limitation.

You can query the maximum bitmap size via the CanvasDevice.MaximumBitmapSizeInPixels property. This is determined by the D3D feature level: it'll be 16384 for a feature level 11 GPU as found in most desktops, or 4096 for feature level 9.3 as found in Windows Phone.
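
Not from the original reply, but a minimal C# sketch of that check; imageWidth and imageHeight are placeholders for the decoded image's dimensions:

using Microsoft.Graphics.Canvas;

var device = new CanvasDevice();
int maxSize = device.MaximumBitmapSizeInPixels;   // e.g. 16384 on a feature level 11 GPU

if (imageWidth > maxSize || imageHeight > maxSize)
{
    // The image cannot fit in a single CanvasBitmap on this device:
    // fall back to tiling, or to the WARP software device described below.
}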

There are basically two options for working with images larger than your GPU can support:

  1. Tiling. Split up the image into smaller pieces and write the code to process each part separately. This can be straightforward or a crazy amount of work depending on what kind of processing you need to do.
  2. Use the WARP software renderer instead of your hardware GPU, which supports ridiculously large bitmaps. In Win2D this is accessed by creating your own CanvasDevice and specifying CanvasHardwareAcceleration.Off (see the sketch below). Obviously you won't get the perf of GPU HW acceleration, but WARP is pretty well optimized (using multiple CPU cores and SIMD instruction sets), so it can perform more than adequately for many purposes. WARP can be very convenient because it works identically to a HW GPU, but without the size restrictions.
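
As a rough illustration of option 2 (not from the original reply): CanvasHardwareAcceleration.Off was the spelling in Win2D builds of the time; later releases expose the same switch as a forceSoftwareRenderer constructor flag, which is what this sketch assumes.

using Microsoft.Graphics.Canvas;

// Create a device that renders with WARP on the CPU instead of the hardware GPU.
var warpDevice = new CanvasDevice(true /* forceSoftwareRenderer */);

// Resources created from warpDevice are not limited by the GPU's maximum texture size.
var largeTarget = new CanvasRenderTarget(warpDevice, 16128, 14960, 96);   // 96 DPI, so DIPs == pixels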

If you go with the first option, you could use Windows.Graphics.Imaging.BitmapDecoder (which is basically a wrapper on top of WIC) to read large image files into a CPU-side byte array, pieces of which can then be passed to CanvasBitmap.CreateFromBytes.
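
To make that concrete, here is a hedged sketch of decoding a single tile with BitmapDecoder and handing it to Win2D. The file path, the tile rectangle, and the resource creator named device are placeholders, each tile must stay within MaximumBitmapSizeInPixels, and error handling is omitted:

using Windows.Graphics.Imaging;
using Windows.Graphics.DirectX;
using Windows.Storage;
using Microsoft.Graphics.Canvas;

// Inside an async method:
var file = await StorageFile.GetFileFromPathAsync(@"C:\images\huge.png");
using (var stream = await file.OpenReadAsync())
{
    var decoder = await BitmapDecoder.CreateAsync(stream);

    // Ask the decoder for just one 4096x4096 tile of the source image.
    var transform = new BitmapTransform
    {
        Bounds = new BitmapBounds { X = 0, Y = 0, Width = 4096, Height = 4096 }
    };

    var pixels = await decoder.GetPixelDataAsync(
        BitmapPixelFormat.Bgra8,
        BitmapAlphaMode.Premultiplied,
        transform,
        ExifOrientationMode.IgnoreExifOrientation,
        ColorManagementMode.DoNotColorManage);

    // Upload the tile's bytes to a CanvasBitmap on the Win2D device.
    var tile = CanvasBitmap.CreateFromBytes(
        device,
        pixels.DetachPixelData(),
        4096, 4096,
        DirectXPixelFormat.B8G8R8A8UIntNormalized);
}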

grokys commented 9 years ago

Thanks Shawn - it sounds like WARP may indeed be the way to go, at least initially. I will close the issue and re-open if I have any more questions/problems.

grokys commented 9 years ago

That was quickly re-opened! Ok, so if I create my own CanvasDevice, I can't use CanvasControl?

If that is the case, is there any documentation on how to use CanvasDevice directly?

shawnhar commented 9 years ago

When you use WARP, you become a two-device app. There is the WARP device you are using to manipulate these very large images, but still also the actual hardware GPU that is driving the display, used for XAML rendering and CanvasControl etc. To display it, you need to move the data created on your WARP device over to the hardware GPU device. There are a couple of ways to go about that:

  1. If your Win2D usage all fits nicely on the WARP device, you can have this end with your final image drawn into a CanvasRenderTarget, use GetPixelBytes to read that final data, set it into a XAML WriteableBitmap object via its PixelBuffer property, and then display the WriteableBitmap using XAML (as sketched after this list). In this case there is only one Win2D device using WARP, while all the hardware GPU access is handled by XAML.
  2. If you have other Win2D rendering that you want to do on the hardware GPU, you can use a CanvasControl in addition to your WARP CanvasDevice. Same pattern of calling GetPixelBytes on the final WARP device CanvasRenderTarget, but instead of transferring the data to a XAML WriteableBitmap, use SetPixelBytes on a CanvasBitmap that was created using your CanvasControl hardware GPU device, then draw this CanvasBitmap onto the CanvasControl or combine it with other HW Win2D drawing.
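
Not part of the original reply, but a rough sketch of option 1 above, assuming target is a CanvasRenderTarget that was drawn on the WARP device using the default BGRA8 premultiplied format, and imageControl is a XAML Image element:

using System.Runtime.InteropServices.WindowsRuntime;   // byte[].CopyTo(IBuffer)
using Windows.UI.Xaml.Media.Imaging;

// Read the finished WARP-rendered pixels back to the CPU.
byte[] pixels = target.GetPixelBytes();

var wb = new WriteableBitmap(
    (int)target.SizeInPixels.Width,
    (int)target.SizeInPixels.Height);

pixels.CopyTo(wb.PixelBuffer);   // copy the bytes into the XAML bitmap's buffer
wb.Invalidate();                 // tell XAML the pixel data changed

imageControl.Source = wb;        // display it with a plain XAML Image element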

Hope that helps,

grokys commented 9 years ago

Thanks again, Shawn - the WARP method seems to yield decent enough results.

The only thing I noticed was that I still need to do an unmanaged -> managed -> unmanaged marshaling of the memory from the WARP CanvasRenderTarget to the GPU CanvasBitmap.

Do you see that being something you might want to solve? Maybe something like SharpDX's http://sharpdx.org/documentation/api/t-sharpdx-databuffer might be useful here?

Let me know if I should open a new issue to discuss.

shawnhar commented 9 years ago

It's great to hear you got this working!

There's definitely more we could do to make moving data between different devices easier, but I'm not convinced some sort of data buffer abstraction is the right solution here. There are basically two problems we could try to solve:

  1. Ease of use - it's a fair amount of code to implement this today, which could be simplified if Win2D provided a helper to do it for you
  2. Performance - the more times the data gets copied, obviously the slower it will be. But there always have to be /some/ copies, since the data is moving from one device to another! I'll have to think more carefully about whether a smarter implementation would be able to elide any of these - my guess is probably yes, but it would be only incremental rather than order of magnitude perf improvement.

Whether objects are managed or native isn't really the main issue here - the perf overhead is the fact that a copy is happening at all, regardless of how the memory is managed. The fact that the intermediate storage is a managed array is only a problem if new arrays have to be allocated repeatedly (which would stress the garbage collector) but in this case you should be able to hang onto and reuse a single byte array across multiple such updates.

grokys commented 9 years ago

To be honest, it wasn't much code; I'm not sure it needs to be simplified.

However, when you say:

> In this case you should be able to hang onto and reuse a single byte array across multiple such updates

Do you mean as things currently stand? Because at the moment my code looks like this:

var target = new CanvasRenderTarget(warpDevice, width, height, dpi);

using (var session = target.CreateDrawingSession())
{
    session.DrawImage(sourceBitmap, ...);
}

var bitmap = CanvasBitmap.CreateFromBytes(
        sender,
        target.GetPixelBytes(),
        (int)target.SizeInPixels.Width,
        (int)target.SizeInPixels.Height,
        target.Format,
        CanvasAlphaMode.Ignore);

As you can see, target.GetPixelBytes() allocates a new byte array on each call. Do you mean that this is something you could solve?

> the perf overhead is the fact that a copy is happening at all, regardless of how the memory is managed

Yes, but unless I'm mistaken the copy only needs to happen once, from the WARP render target into the bitmap. At the moment, as far as I can see, I've got an extra copy there into the byte[]?

It's not so important because as you say it probably wouldn't make a huge difference, but may be something you want to consider for the future.

shawnhar commented 9 years ago

Doh, of course GetPixelBytes returns a new array every time - don't know what I was thinking there :-)

It occurs to me that we already have an API that would be perfectly suited to this purpose: http://microsoft.github.io/Win2D/html/Overload_Microsoft_Graphics_Canvas_CanvasBitmap_CopyPixelsFromBitmap.htm

Except it won't actually work for you, because this is implemented as a wrapper on top of the equivalent D2D API, which does a hardware accelerated copy using the GPU and so can only copy between bitmaps of the same device.

I filed a backlog item for us to look at extending this implementation so it could also be used to copy between bitmaps of different devices. That obviously wouldn't be able to use the GPU, but we could make the same API work via a CPU copy.
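
For reference (not part of the original comment), the same-device form of that API looks roughly like this; it is the cross-device case that the backlog item would have to enable:

// destBitmap and sourceBitmap were both created from the same CanvasDevice,
// so this copy stays entirely on that device.
destBitmap.CopyPixelsFromBitmap(sourceBitmap);

// Overloads also take a destination offset and a source rectangle, which is
// handy for assembling a large image tile by tile on a single device.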

grokys commented 9 years ago

That would indeed be perfect. I would have tried (and failed obviously) to use that if I'd seen it.

Closing the issue now, thanks a lot for all the help - really loving the new open MS! ;)