ocsigen / eliom

Multi-tier framework for programming web and mobile applications in OCaml.
http://eliom.org

Memory Management with Client-Side React #262

Open paurkedal opened 8 years ago

paurkedal commented 8 years ago

Apologies for a somewhat long issue posting; I believe it's a tough issue.

First, I think writing user interfaces with FRP is bliss, though as far as I'm aware, there is a hard-to-handle memory management issue due to the lack of weak references in the JavaScript standard. In particular, WeakMap is not sufficient to implement the weak references needed for FRP. Weak references have been proposed, but there are security concerns about opening side channels between different agents, see e.g. this discussion. The issue was also briefly discussed on the ocsigen list some time ago.

I attach a contrived example which shows the issue. Open e.g. /reactive_mm/40 and click around while watching the memory consumption. Watching the memory consumption takes a bit of patience, so I added a counter of the number of leaf signals which were updated. As can be seen, this never decreases, so naturally the updates start to slow down after repeatedly switching pages without reloading the app.

What can be done to mitigate the issue? Ideally one would come up with a secure proposal for weak references, and lobby for its inclusion in an upcoming JavaScript standard. That's hard work, and will take time. For the time being, it would at least be nice to have some advice in the documentation.

Daniel Bünzli mentioned a workaround in the mailing list post cited above: calling stop ~strong:true on a signal or event will detach it from its dependencies, and trigger the same operation recursively for each signal it depended on which thereby becomes unused. This is a working solution for user-created react signals, though it's not very convenient for complex applications. Also, I don't think this option is currently available for signals embedded in the DOM via Eliom_content.Html5.R and friends.
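For readers unfamiliar with the workaround, here is a minimal sketch of a strong stop on a small signal chain. It assumes the react library is installed (ocamlfind package "react"); the names `seen`, `source`, `doubled` and `leaf` are made up for the illustration and are not from the attached example.

```ocaml
(* Requires the react library (ocamlfind package "react"). *)

(* Last value seen by the effectful leaf signal. *)
let seen = ref (-1)

(* A primitive signal and its update function. *)
let source, set_source = React.S.create 0

(* A pure intermediate signal. *)
let doubled = React.S.map (fun x -> 2 * x) source

(* Effectful leaf: records each value it is updated with. *)
let leaf = React.S.map (fun x -> seen := x) doubled

let () =
  set_source 1;
  (* Stop the leaf and unregister it from [doubled]. With ~strong:true
     the stop also propagates to non-primitive producers that are left
     without dependents, here [doubled]. *)
  React.S.stop ~strong:true leaf;
  (* The leaf no longer reacts to this update. *)
  set_source 2
```

The stop is irreversible, so the leaf cannot be reattached afterwards; a fresh signal would have to be built from `source` instead.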

I have one proposal which may just work. It only applies where leaf signals are tied to the DOM, but I think that covers the most important use of react on the client side. The idea is to use the DOM document as a GC root for the signals. To avoid changes to the react library, one could first implement finalizers for DOM nodes, and use these to perform strong stops on signals which were created via the Eliom R modules. These stops would then cascade to user-created react elements depended on by DOM nodes, while other react elements would be left to manual tidying.

One issue would be that a strong stop can inadvertently stop a signal which is still to be used. Maybe it's possible to schedule the DOM finalizers in between the page (re-)construction work, to avoid this happening for locally referenced signals, though I'm not sure how Lwt is tied to the event loop, so it might need some thought. React elements stored in globals would still be subject to accidental stops though.

reactive_mm.eliom.txt

vasilisp commented 8 years ago

Also, I don't think this option is currently available for signals embedded in the DOM via Eliom_content.Html5.R and friends.

Indeed, we do not expose the strong stops for the signals used inside Eliom_content.Html5.R. But this is within our control, and I think we should expose them somehow.

The idea is to use the DOM document as a GC root for the signals. To avoid changes to the react library, one could first implement finalizers for DOM nodes, and use these to perform strong stops on signals which were created via the Eliom R modules.

Correct me if I am wrong, but I don't think there is a direct way to implement finalizers in JS. That would be another vector for observing the GC behavior, I guess. At best, we can observe when a node gets removed from the DOM tree, which is not the same, because the programmer may be planning to put it back. Nevertheless, it may be acceptable in practice to impose the constraint that R nodes get attached to the DOM once, and once they go away, we stop the signals.

paurkedal commented 8 years ago

Indeed, we do not expose the strong stops for the signals used inside Eliom_content.Html5.R. But this is within our control, and I think we should expose them somehow.

Sounds good.

Correct me if I am wrong, but I don't think there is a direct way to implement finalizers in JS. That would be another vector for observing the GC behavior, I guess. At best, we can observe when a node gets removed from the DOM tree, which is not the same, because the programmer may be planning to put it back. Nevertheless, it may be acceptable in practice to impose the constraint that R nodes get attached to the DOM once, and once they go away, we stop the signals.

That was what I had in mind, treating the DOM document as the only root set. But I started wondering: does the R module have DOM or functional semantics? In either case, re-using nodes poses an issue. In the DOM case, it might be solvable, though: as long as we control all functions which let one extract elements from the DOM tree, one could disable the finalizers for elements returned to the client code.

paurkedal commented 8 years ago

Now, this would be the easy part, the DOM-document finalizers: pandom_finalizer.mli, pandom_finalizer.ml.

paurkedal commented 8 years ago

After some reflection on this, I think weak references are not really needed for a satisfactory FRP framework, and by avoiding them, we both solve the memory management issue for JavaScript and give more predictable performance. We would need some systematic changes to DOM and Eliom libraries, and we'd need to address two issues with React: the liberal introduction of side-effects and the irreversible behaviour of strong stops.

For simplicity I'll just write about signals, as events are analogous. A signal in React serves a dual purpose. On one hand it serves as a dependency for other signals, and on the other it can hold an impure piece of code which performs a side-effect. Let's call a signal observed if it runs an impure function or serves as a dependency for an observed signal. Currently, the way the library is used, React cannot know which signals are observed, as it needs to assume any function may be impure. This is unfortunate, since a signal only needs to be updated if it is observed, and more to the point:

If we can manage side effects, then we can break weak back-pointers when a signal becomes unobserved, and restore them when it becomes observed again. Thus, unobserved signals become subject to garbage collection, also on architectures which lack weak pointers. As an added benefit, only a minimal predictable set of side effects run in response to the change of a signal.

Conversely, when we tap the side-effect of a signal, we need to prevent it from being garbage collected. Arguably, side-effects are only needed for updating real resources, and these should be explicitly managed anyway. So we should be able to co-manage signal observers along with the resources they affect. In the case of web interfaces, the resource is the DOM document. It is visible, so it should be considered a real resource, as opposed to detached elements. So in this case, the DOM manipulators need to check if an element becomes connected or disconnected from the DOM document, and enable or disable observation of the related signals, respectively.
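The connect/disconnect rule in the previous paragraph can be sketched with a toy model. Everything here is hypothetical: `observer`, `element`, `append_child` and `remove_child` are stand-ins written for this illustration, not js_of_ocaml or Eliom types, and a real implementation would have to hook the actual DOM manipulators.

```ocaml
(* Toy stand-ins for signal observers and DOM elements. *)
type observer = { mutable active : bool }

type element = {
  observers : observer list;        (* observers updating this element *)
  mutable children : element list;
}

let element ?(observers = []) () = { observers; children = [] }

(* Enable or disable every observer in the subtree rooted at [el]. *)
let rec set_observed active el =
  List.iter (fun o -> o.active <- active) el.observers;
  List.iter (set_observed active) el.children

(* [connected] says whether [parent] is itself part of the document. *)
let append_child ~connected parent child =
  parent.children <- parent.children @ [child];
  if connected then set_observed true child

(* Removing a subtree always ends observation inside it. *)
let remove_child parent child =
  parent.children <- List.filter (fun c -> c != child) parent.children;
  set_observed false child

let obs = { active = false }
let body = element ()
let panel = element ()
let label = element ~observers:[obs] ()

let () =
  append_child ~connected:false panel label;  (* detached subtree *)
  assert (not obs.active);
  append_child ~connected:true body panel;    (* now in the document *)
  assert obs.active;
  remove_child body panel;                    (* disconnected again *)
  assert (not obs.active)
```

Because disabling is reversible in this model, the detached `panel` could be appended again later and its observer would resume.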

A sufficient change to React.E and React.S would be to add two functions

    val mute : 'a t -> unit
    val unmute : 'a t -> unit

mute s removes back-links to s from all its producers. Unlike stopping a signal, it keeps the list of producers, allowing the operation to be reversed by unmute s. As long as we only need side effects on leaf signals, these can be muted and unmuted without unintended consequences.
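To make the difference from a stop concrete, here is a toy model of the two operations. The `node` type is an assumption made for this sketch and is not React's internal representation; the point is only that the forward `producers` list survives a `mute`, so `unmute` has what it needs to restore the back-links.

```ocaml
(* Toy signal node; not React's internal representation. *)
type node = {
  producers : node list;            (* forward links, always kept *)
  mutable dependents : node list;   (* back-links, dropped by [mute] *)
}

(* Create a node and register it with each of its producers. *)
let make producers =
  let n = { producers; dependents = [] } in
  List.iter (fun p -> p.dependents <- n :: p.dependents) producers;
  n

(* Remove the back-links to [n] from all its producers. *)
let mute n =
  List.iter
    (fun p -> p.dependents <- List.filter (fun d -> d != n) p.dependents)
    n.producers

(* Restore the back-links, using the retained producer list. *)
let unmute n =
  List.iter
    (fun p ->
      if not (List.memq n p.dependents) then
        p.dependents <- n :: p.dependents)
    n.producers

let a = make []
let b = make []
let sum = make [a; b]

let () =
  mute sum;
  (* No producer refers to [sum] any more, so only the program's own
     reference keeps it alive. *)
  assert (a.dependents = [] && b.dependents = []);
  unmute sum;
  assert (List.memq sum a.dependents && List.memq sum b.dependents)
```

This model ignores values entirely; a real `unmute` would also have to decide what value the signal holds when it is reconnected.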

Effects are not really needed on non-leaf signals, as they can always be moved to a dedicated dependent signal. If I were to rethink the React API, though, I would suggest enforcing this discipline in the library by only supporting pure map and lifting functions, removing trace, and only creating dependency back-links on demand when adding observers through a new dedicated interface.

Drup commented 8 years ago

cc @dbuenzli

dbuenzli commented 8 years ago

After some reflection on this, I think weak references are not really needed for a satisfactory FRP framework, and by avoiding them, we both solve the memory management issue

Well, you solve it by going back to manual memory management...

Didn't go through the details but this looks a little bit similar to the ref counting scheme (subscribe/unsubscribe) I thought about at a certain point, see 1, 2. I suspect it suffers from the same problems, i.e. it breaks the semantics and equational reasoning of react, see 3.

dbuenzli commented 8 years ago

xref https://github.com/dbuenzli/react/issues/24

paurkedal commented 8 years ago

@dbuenzli Thanks for the background, didn't realize there was already a discussion this summer.

Well, you solve it by going back to manual memory management...

True, though a) signals would be effectively pure and garbage collected, while only the observers would introduce the back-links, for as long as they exist, and b) observers are only introduced to update some kinds of resources, and hopefully that allows coupling their release to the release of the resource.

Didn't go through the details but this looks a little bit similar to the ref counting scheme (subscribe/unsubscribe) I thought about at a certain point, see 1, 2. I suspect it suffers from the same problems, i.e. it breaks the semantics and equational reasoning of react, see 3.

Ah, I don't like ref-counting; it reminds me of hard-to-track MM issues, as you mention in one of those posts. But well, the scheme I propose would mean that, at least within React, one would count references by looking at existing back-links, or keep an explicit counter. From the outside, though, I'd much prefer to see it as a garbage collection scheme, where the updated resources represent the root set. In the case of reactive DOM, the management can be hidden from the user; I'm wondering if this is not the case in other domains as well.

The issue with breaking signal semantics needs a solution though. It did occur to me that the unmute function above would also need to re-evaluate signals. I made a naïve sketch. Naïve because I have not considered tricky cases like fix and switch.

dbuenzli commented 8 years ago

Ah, I don't like ref-counting; it reminds me of hard-to-track MM issues, as you mention in one of those posts. But well, the scheme I propose would mean that, at least within React, one would count references by looking at existing back-links, or keep an explicit counter.

Not sure this is different from what I was proposing; with my ref count thing you only had to do so at the leaves. Internally retain/release is done automatically. Still, I'm not very enthusiastic about it because of the semantic breaks.

In the case of reactive DOM, the management can be hidden from the user

But then I also thought that you could do so with strong stops. So what's the problem with strong stops? They still seem to me to be the best solution, in the sense that you only need to go back to manual memory management on weak (haha) platforms and you should hopefully be able to hide it from the end-users.

In any case, I shall soon resume some browser work with react, so it's a good time to get some ideas flowing in, as I'll make a quick run through the current issues to at least fix the few bugs that are reported. I will consider any of your proposals more closely at that point; right now I'm busy with other things and don't have react in my brain.

paurkedal commented 8 years ago

Not sure this is different from what I was proposing; with my ref count thing you only had to do so at the leaves. Internally retain/release is done automatically. Still, I'm not very enthusiastic about it because of the semantic breaks.

Yes, that's the same proposal, except I was hoping the semantic issue could be fixed by running updates of changed signals, as I try to do in the mute-sketch branch. To be sure, I don't think subscribe/unsubscribe needs ref-counting semantics, since they should only retain a single impure function updating a certain resource.

(I do have some worries about subscribing to S.map impure_function signal, since it leaves the question of when it's okay to pass an impure function to S.map. I only hinted at it earlier, but maybe I should spill an alternative to make my view more clear:

    module O : sig
      type t = observer
      val on_event : ('a event -> unit) -> 'a event -> observer
      val on_signal : ('a signal -> unit) -> 'a signal -> observer
      val enable : observer -> unit
      val disable : observer -> unit
    end

and remove all impurities in S and R. But this is just the enforcement of best practices for how one would use subscribe and unsubscribe, and there may be better ways of formulating this API.)
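As a rough illustration of how such an observer interface could behave, here is a self-contained toy. It does not use React: `cell`, `set` and `on_cell` are hypothetical names for this sketch, the callback receives the value rather than the signal (a simplification compared with the signature above), and running the effect once on `enable` is an assumption about how an observer would catch up.

```ocaml
(* Minimal mutable cell standing in for a signal; not React's type. *)
type 'a cell = {
  mutable value : 'a;
  mutable callbacks : (int * ('a -> unit)) list;  (* id, effect *)
}

type observer = { enable : unit -> unit; disable : unit -> unit }

(* Counter used to give each observer a distinct id. *)
let next_id = ref 0

let cell v = { value = v; callbacks = [] }

(* Update the cell and run the effects of all enabled observers. *)
let set c v =
  c.value <- v;
  List.iter (fun (_, f) -> f v) c.callbacks

(* Create a disabled observer running [f] on each update of [c]. *)
let on_cell f c =
  incr next_id;
  let id = !next_id in
  let disable () =
    c.callbacks <- List.filter (fun (i, _) -> i <> id) c.callbacks in
  let enable () =
    disable ();                       (* avoid double registration *)
    c.callbacks <- (id, f) :: c.callbacks;
    f c.value                         (* catch up with the current value *)
  in
  { enable; disable }

let log = ref []
let c = cell 0
let o = on_cell (fun v -> log := v :: !log) c

let () =
  set c 1;        (* observer disabled: nothing recorded *)
  o.enable ();    (* records the current value, 1 *)
  set c 2;        (* recorded *)
  o.disable ();
  set c 3;        (* observer disabled again: not recorded *)
  assert (List.rev !log = [1; 2])
```

While the observer is disabled, the cell holds no reference to it, which is the property that would let unobserved signals be collected without weak pointers.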

But then I also thought that you could do so with strong stops. So what's the problem with strong stops? They still seem to me to be the best solution, in the sense that you only need to go back to manual memory management on weak (haha) platforms and you should hopefully be able to hide it from the end-users.

My issue with strong stops is that they are irreversible, so one can't freely disconnect and reconnect nodes from the DOM document. I think this can cause problems even when using a purely reactive approach, since a switch might hold some s at some point, then switch to another signal, then switch back to s, at which point s has suffered a strong stop.

In any case, I shall soon resume some browser work with react, so it's a good time to get some ideas flowing in, as I'll make a quick run through the current issues to at least fix the few bugs that are reported. I will consider any of your proposals more closely at that point; right now I'm busy with other things and don't have react in my brain.

Good to hear, and I'm quite familiar with switching between different projects. I hope to be available for discussion when you resume the browser work.

dbuenzli commented 8 years ago

On Saturday, 20 February 2016 at 21:07, Petter Urkedal wrote:

My issue with strong stops is that they are irreversible, so one can't freely disconnect and reconnect nodes from the DOM document.

Of course a strong stop should only occur once a node is finalized, and I guess that by saying this I have now just moved the problem to detecting node finalisation. Unless you never reuse nodes that have been disconnected (i.e. be functional; is that an option?). I think this is the kind of idea I was chasing around here:

https://github.com/dbuenzli/remat/blob/master/src-www/br.ml#L403-L436
https://github.com/dbuenzli/remat/blob/master/src-www/brr.ml#L42-L71

though I have no idea who was supposed to trigger that finalizer event...

Daniel

paurkedal commented 8 years ago

By functional semantics, I think we'd mean elements which are copied as they are inserted into the document. Then we'd need to create a fresh observed leaf signal in the process, which can be safely stopped when it leaves the document. In the case of S.switch one would need to be sure the copy takes place as well, but I guess that's just how it needs to be. So the problem is maybe just if reactive elements have DOM semantics. Not sure how this is implemented in Eliom.

though I have no idea who was supposed to trigger that finalizer event...

Hmm, isn't it something like this you were looking for (see above):

Now, this would be the easy part, the DOM-document finalizers: pandom_finalizer.mli, pandom_finalizer.ml.

Might be included in a more savoury library if it's of use.

ihodes commented 5 years ago

Is this still an issue/are there best practices to avoid the sort of issues this leads to?