flutter / flutter

Flutter makes it easy and fast to build beautiful apps for mobile and beyond
https://flutter.dev
BSD 3-Clause "New" or "Revised" License
165.05k stars 27.21k forks source link

[OOM] memory consumption is too high #48305

Closed kf6gpe closed 4 years ago

kf6gpe commented 4 years ago

https://b.corp.google.com/issues/144232910.

/cc @cbracken @chinmaygarde @jason-simmons

Hixie commented 4 years ago

Status: We believe we have done what can be done and are waiting for feedback from the customer.

dnfield commented 4 years ago

I'm working on a proposed change to an internal image class that should also improve things, and waiting for review to finish on #48536

kf6gpe commented 4 years ago

Latest comment on the bug is

would appreciate help with measuring and limiting memory spikiness
when dealing with images.  memory consumption in steady state is
okay-ish, but during intense image loading/unloading it can grow
unbounded. For our applications it's generally good to have some sort
of memory cap that the engine may not exceed, and expensive
allocations should not run in parallel.

(edits for confidentiality)

chinmaygarde commented 4 years ago

and expensive allocations should not run in parallel

By the time the engine sees the request for an image decompression, the compressed image buffer has already been created (i.e, the engine does not perform the job of actually fetching compressed image data). Not utilizing available concurrency on the device to run decompression jobs in parallel would only hold onto these buffers for longer. If the application is requesting the decompression of too many images, these requests (along with their buffers) would get backed up and lead to OOMs. So, I’d push back on mechanisms to artificially limit use of available compute resources to ameliorate peak memory usage.

loading/unloading it can grow unbounded

This should depend on the number of image decompressions requested by the application but not serviced by the engine. Capping the number of pending decompression jobs in the application/framework should fix this issue.

One potential engine feature for helping with memory usage would be for the engine to keep track of various buffers managed by the image decoder and refuse to decompress images if a (configurable) cap is hit. But, this still leaves the responsibility on the application to find and release “unused” (because the engine could still be using them even if the application is not) images, wait for the images to be collected (all collection is asynchronous) and then retry the decompression job. Yet another strategy could be a bounded LIFO queue for decompression jobs in the engine. This too has implications for application authors to retry decompression in case of rejection of the work by the engine. Personally, these solutions seem like band aids. If the application want to restrict the amount of memory it uses, not requesting the decompression of too many images seems like the most straightforward approach (and one that is entirely in the control of the application).

kf6gpe commented 4 years ago

https://github.com/flutter/flutter/issues/49316 for the specific question of unbounded growth.

alml commented 4 years ago

If the application is requesting the decompression of too many images, these requests (along with their buffers) would get backed up and lead to OOMs.

I think it's a bit of oversimplification of the problem. Flutter applications (I mean the Dart part) are declarative, they don't prescribe how many images the engine needs to load and decompress at once, and in which order. Their business is to tell what to show.

Flutter (I mean both framework and engine) knows enough information about which images are on the screen, which are coming, which are no longer needed. It can make reasonable decisions when to download them. Flutter needs to communicate this information to the embedder, so it could do its job better. I'm talking about dynamic prioritization and cancellation. I'll elaborate on that later.

I think we need to design the whole pipeline for image downloads with hard memory constraints in mind. As image data flows from the Internet to the screen, it passes a few stages: download from Internet, decompression into a buffer, resizing a buffer into another buffer, painting. Each stage should have a (configurable) memory limit and only allocate new objects when it has enough headroom. If it doesn't, it needs to wait for the memory to become available and apply sufficient backpressure to the previous stage to avoid backups. That's the key to the bounded memory consumption.

Now some ideas on how to make it work and minimize user-visible disruption due to delayed download. As I said before, Flutter knows what is currently on screen, and it could tell all the pipeline stages that certain images have the highest priority, certain images are low-pri. The further the image is from the visible screen (think of carousel items), the less its priority is. As a user navigates through the app, some images can come up to screen, leave the screen, or be removed from the widget tree completely. Each such change should be communicated to each stage of the pipeline, so it could make reasonable scheduling decisions: request queuing and dequeuing, throttling (effectively applying backpressure to the previous stage), cancelling operations that are no longer needed.

This would obviously require revision of the APIs between Flutter, Skia and embedders, implementing HTTP cancellations, throttling, etc.

I'm sure the problem is more complicated, but I believe only Flutter has enough information to orchestrate everything.

dnfield commented 4 years ago

@alml I suspect we're already doing more or less what you're suggesting in some ways. One place we're not is when we scroll rapidly in a list - we're asking to decode images that won't appear on screen (or will only appear on screen for some imperciptible amount of time).

The issue isn't always about priority either - for example, if the application asks the engine to allocate 2gb of memory for image textures, it will just do so. That can happen because you request 1 very high res image, 2 very high res images, or 1000 images as a user is flicking through a gridview.

I am working on improving the grid/listview issue. The framework can help prevent you from shooting yourself in the foot here more than we do right now. But once you send that request over to the engine to start decoding, it will start allocating memory - and it doesn't really matter what priority you give it, the memory will get allocated.

As far as limiting the number of images that can be downloaded/decompressed/etc. That's really an application decision. Flutter makes it very easy to create lazily built infinite scroll views. If you know that you're going to showing lots of images in the view on a device with limited memory capacity, you can (at the application level) delay scrolling in a way that will make sense to the user - see for example the way Instagram works on an iPhone. The framework shouldn't be forcing you to do that either way - we can't pre-determine the right UX pattern for your app to make that work properly.

It may be helpful to have the engine track how much memory it's using for image textures and send a warning to the framework when that is reaching a configurable limit so that the application can make better decisions about what to do on a given platform though.

Hixie commented 4 years ago

The customer's bug is closed, so I'm going to close this one. For the memory cap idea, see #49316. For the deferred loading during scrolling idea, see #49389. Thanks.

alml commented 4 years ago

@dnfield I'm not sure we are on the same page regarding terminology. Do I understand correctly that your point is that Flutter applications (user Dart code) have to be aware of the image loading and decoding? If yes, then what are the APIs available to the application to control it?

  1. Download status notifications. The app needs to know when a download has started, when it received Content-Size header from the server and is about to allocate a buffer for the compressed image, when it received certain amount of data, finished download successfully or not.
  2. Decoding status notifications. The app needs to know when the image header is decoded, and an image buffer is about to be allocated.
  3. Cancellation. If an image is no longer needed by the app, it needs to abort download and decoding processes.
  4. Pause. The application needs to be able to put any download and decoding on hold until memory constraints are satisfied.

If the above mentioned APIs are available, then I agree, the app can implement efficient memory management itself.

What's your take on this?

PS. The customer's bug was closed with a temporary workaround. The cause is not fixed. Please reopen.

dnfield commented 4 years ago

@alml - I suspect we should open some new bugs rather than reopening this one. It is fairly vague and was only meant to track the internal bug that was closed. I'm opening at least one new bug so far based on this.

To your specific points:

  1. You can't reliably use the content-length header - it won't be supplied for chunked encoding, and it's not guaranteed to be supplied even when the server is sending over the whole image in the current frame. I don't think the framework should be managing this, but an application written network image provider could depend on it knowing what server(s) it will connect to and what types of encoding it should expect. I would be opposed to adding such logic to the framework, where in general use it is likely to be wrong or at least misleading.
  2. It would probably be nice to provide some way to get information about the image header to make decisions. Filed https://github.com/flutter/flutter/issues/50022 to track. This should be doable.
  3. I'm very skeptical about cancellation. Download cancellation is currently not supported on the VM (https://github.com/dart-lang/sdk/issues/22265), and if we supported cancellation of decoding in the engine it's highly likely that the application will be too late - texture memory will already be allocated by the time it tries to cancel. I think it's better to just not make the request until you're reasonably sure you'll need it - e.g. what the new ScrollAwareImageProvider does (https://github.com/flutter/flutter/pull/49389). At any rate, cancellation would always be best effort - you'd never be guarnateed that your call to abort or cancel would actually save any work (and it actually might cost you more, if we made it to a point where almost all work was done except e.g. returning a handle back to the framework).
  4. I'm not sure pause makes sense either, for similar reasons to cancel. But it would be interesting to explore some more - filed https://github.com/flutter/flutter/issues/50024
dnfield commented 4 years ago

I'm also working on the fix for https://github.com/flutter/flutter/issues/49456, which should help with the internal customer use case of wanting to avoid resolving images multiple times that do not otherwise fit in the cache.

alml commented 4 years ago

Thanks for opening the specific issues. It is definitely better than a vague bug. Please let me use this one to address your points.

(1) I agree, Content-Length is not always available. In case of chunked encoding or simply missing Content-Length with body interrupted by a closed connection we won't have this signal. We'll have to download those images in some fallback mode, but for the majority of static images I'd expect to have proper Content-Length and smart memory management.

(3) Cancellation doesn't have to have anything with the VM. HTTP download is an asynchronous operation. It should be possible to abort a HTTP download with some sort of cancellation token, cancellable context or a similar mechanism. Re decompression: I'm not talking about aborting an ongoing decompression (it's difficult indeed), but I do talk about cancelling no longer needed decompressions while they are still in the queue.

(4) I don't see why it couldn't be implemented. Image decoding pipeline consists of a few quite isolated steps: decoding the header, allocating a buffer, decompression itself. You agreed in (2) to provide some hooks to notify the application about the decoded header. Consider multiple big images are being downloaded in parallel. Each of them is relatively compact while compressed, but has a large resolution. The application receives a callback from the first image, sees 3000x3000, says "that's fine, go ahead", and then Flutter allocates the 36M buffer and decodes the image. Then it receives another notification "header is decoded for another 3000x3000". The app needs a way to say Flutter: "whoa! hold on!" until the first image is decompressed, resized and discarded, then it says "now continue with the second image". This way we can avoid allocating 2 large buffers at once.

Does it make sense?

dnfield commented 4 years ago

This does make sense, yes.

For the first point about network images, I think we're basically in agreement - but I think that it's an application decision to make rather than something to put on Image.network, which doesn't have a good way to return anything to the user. For example, you might structure your application to acquire the network bytes in a way that's aware of content-length headers and has fallback modes, and then just uses Image.memory.

Cancellation does have to do with the VM - the VM doesn't support it, so Flutter can't without adding that feature to the VM. As far as cancelling unneeded decompressions that are still in the queue, I have doubts about whether an application would actually be able to cancel anything in time - and if you're pumping so many images into the queue that it's taking multiple frames to decode them, you're probably pumping too many images and should have throttled it earlier. But it's hard to determine that right now.

You can read the header of an image and get size information without ever decoding it fully. I don't think we should slow decoding by adding hooks to it - but we could instead provide some routines (or recommend packages, like package:image that do this) to determine an encoded image's size.

alml commented 4 years ago

(1) I think most of the Flutter application developers would benefit from some generic solution should it be available. Altering Image.network or leaving 100% up to the application are the opposite poles. I'm sure there is some middle ground. If the memory management API was carved carefully, it should be possible to provide some default implementation with Flutter framework and give application developers an option to override or replace it.

Cancellation does have to do with the VM

I'm not a Dart expert, but I'm sure there are some means for cross-isolate communication. Essentially what Flutter needs is to send some message to the I/O isolate with updated instructions what to download. When the I/O isolate finishes its current operation it can "check mail" and update its queue accordingly. Do I misunderstand something?

You can read the header of an image and get size information without ever decoding it fully. I don't think we should slow decoding by adding hooks to it - but we could instead provide some routines (or recommend packages, like package:image that do this) to determine an encoded image's size.

Do I understand correctly that you propose the following flow:

  1. Application requests image download without decompression.
  2. Application gets a callback when then buffer is ready.
  3. Application queries the image size without decoding.
  4. Application performs a delay as much it needs.
  5. Application requests image decoding by inserting the special widget "ImageFromBuffer" to the widget tree.

That might work. Is it what you meant?

dnfield commented 4 years ago

I'm referring to the fact that the Dart http API has no cancel or abort. We don't otherwise queue or manage network requests.

Yes, that is more or less the flow I'm proposing.

alml commented 4 years ago

Re cancellation: It's okay if a request can't be interrupted if it's already running (due to the shortage of Dart HTTP API). Other requests that are waiting in the queue can be however cancelled. Right?

Re decoding from Image.memory: is there anything that needs to be done in Flutter before we could use it?

dnfield commented 4 years ago

Cancellation: Right now, Image.network does no queueing. As soon as the state is inserted into the tree and resolve is called on the provider, it makes an HTTP request.

Decoding: No, this should be fine out of the box - but it means avoiding the use of Image.network and NetworkImageProvider in general.

alml commented 4 years ago

Is it also possible to decode image dimensions from the byte buffer?

dnfield commented 4 years ago

It should be but we have no API for it today

alml commented 4 years ago

Should we file an issue to implement this API?

dnfield commented 4 years ago

That's #50022. Package:image might be able to just work for that too

github-actions[bot] commented 3 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. If you are still experiencing a similar issue, please open a new bug, including the output of flutter doctor -v and a minimal reproduction of the issue.