Open dustingetz opened 10 years ago
Just found out about this project and curious about this. Memory issues were my immediate concern when thinking about how this works and noticed this issue. Would there be a way to solve this using something like immutable.js?
The datastructures are not the thing that is leaking, here. The leak is due to the memoizing cache policy.
It could be great to use ES6 WeakMap for cache to avoid memory leak.
There are some polyfill libraries, for example this one: https://github.com/Benvie/WeakMap
Sorry that I misunderstand the WeakMap ES6 feature (I suppose it is WeakReference feature.)
But, I find that WeakMap feature can be used to resolve your memory leak issue too. :smile:
First of all, the cmp
variables inside all Cursor instances are the same. The path
variable is used to navigate to a sub-object of global single state (the value
variable in code). So, the only necessary variable provided for cache is the value
variable. (The cmp
and path
is not a need for this line)
Then, rethink what is a cursor - it is an extension attaching on every sub-object of the global single state. The attaching word means, if the state object is going to be GC, then such extension is useless anymore and should be GC too.
Such attaching relationship exactly meets the actual use of WeakMap!
From the above definition, the cursor cannot hold a reference to any state object. But instead, the state object holds a reference to the cursor.
Concretely, the relationship between the three classes are:
this.state
But, of course, the cursor can calculate the state object according to its holding component.state and path. Again, no direct reference.
value
parameterIn such case, the cursor should not accept the value
parameter anymore, both in its constructor and build
function.
For backward compatible, the build
function can call cmp.setState(value)
before pass to Cursor instance. For the Cursor constructor, the value parameter should be deprecated.
Finally, let me back to the origin memory leak issue - cache - why do we need cache?
From the component aspect, the shouldComponentUpdate
function needs re-using the cursor instance if its corresponding state object is the same. So, state and sub-state object -> cursor
needs cache.
From the cursor aspect, it does not hold the state object directly, but instead needs to calculate it from component.state and path. Each calculation needs O(height-of-state-object)
operations. It could optimized to O(1)
if using cache. It indicates component.state + path -> sub-state object
.
Both of these two caches need WeakMap to avoid strong reference to the state object, and only the state object weakly holds reference to the cursor instance. Once the state object is changed, there is no references holding the original state object and its corresponding cursor instance in the WeakMap is GC too.
Memory issue is resolve! :smiley:
The WeakMap has a limitation, its cache key must be a non-primitive type.
For our second case, it can ensure that both component.state
and path
(an array of strings) are non-primitive objects.
But, for the first case, the state leaf, it always be a primitive object, e.g., String, Number, Boolean, etc. The cache does not work with this types, and always report that, such key not is the cache. So, it always create a new cursor instance for such case.
A new cursor instance will break the shouldComponentUpdate
ref-check. But, luckily, because those cursor are actually holding primitive types, shouldComponentUpdate
can check the primitive type directly instead of using ref-check to get a right check.
When implementing the idea, I find a comment:
When refining inside a lifecycle method, same cmp and same path isn't enough. this.props and nextProps have different subtree values, and refining memoizer must account for that
This comment is checked in one year again (61aace7).
In my idea, the cursor does not hold the value directly - it calculates from component when needed. So, in lifecycle mehod like componentWillReceiveProps(nextProps)
, although this.props.cursor
and nextProps.cursor
are different objects, but they holds the same component and same path - this.props.cursor.value
and nextProps.cursor.value
is the same.
Pandora is back. :disappointed:
OK, it is OK. My previous idea resolve memory issue but get another issue back. So, I proposal another way to resolve these.
I draw a relationship for the current implementation:
Root component --this.state--> State object
State object --cursor cache--> Cursor object
Cursor --this.value--> State object
As you see, the state object and the cursor object hold cycle reference which cause memory leak. In the previous idea, I break the cursor -> state object
reference to resolve memory leak. But, finally get into trouble.
The second attempt, holds cursor -> state object
and break state object -> cursor
. In such case, the root component needs to update:
Root component --this.state--> Cursor object
Cursor --this.value--> State object
Then, in the Cursor's update
function, instead cmp.setState(nextState)
, use cmp.setState({ cursor: Cursor.build(this.cmp, [], nextState) })
.
We cannot use cache for state object -> cursor instance
optimization, otherwise cycle reference happens and memory leaks. So, each Cursor.build
call returns a new Cursor instance. Ref-check in shouldComponentUpdate
does not works any more.
Fortunately, React immutability helper will try to re-use the original object if it is not changed. So, we can provide Cursor.is(first, second)
to check they are holding an object. This API imitates Immutable.is(first, second)
. The shouldComponentUpdate
call leverage this new API and keeps the current public API unchanged.
Because the cursor instance holds the state object reference. In the life cycle method, e.g., componentWillReceiveProps(nextProps)
, this.props.cursor
and nextProps.cursor
holds individual state objects with right values.
WeakMap is not needed in such pattern. But, of course, it could be cool to store private class variables and other features.
setState
If this changes apply, the code will looks very similarly with Immutable.js + setState
.
But, with three differences:
setState
leaves this to user.react-cursor
provides very-easy-to-use APIs to update state object, hide the calls to setState
. Immutable.js suggests a helper function to do such thing.I have implement the second idea in these check-ins. Those are API breaking changes. I might think that you could not like to merge. But if yes, I could prepare a PR.
The mutations example has updated to meet the new APIs. Actually, there are just three lines modifications.
I have use the Chrome heap snapshot to validate that, memory issue is resolved in the mutations example.
First of all, the cmp variables inside all Cursor instances are the same. The path variable is used to navigate to a sub-object of global single state (the value variable in code). So, the only necessary variable provided for cache is the value variable. (The cmp and path is not a need for this line)
rootCmp must be hashed for: the case where an app has multiple state roots that are totally independent - can’t share cursor references
path must be hashed for: Suppose a table is sorted, the reconciliation algorithm may choose to not reorder the react components but only fixup their innerHtml
Sorry, it is a wrong example.
For the rootCmp example, what do you expect for the following code:
var data = { list: [1, 2, 3] };
var id = 4;
var App1 = React.createClass({
getInitialState() {
return this.props.data;
},
add() {
var cursor = Cursor.build(this);
cursor.refine('list').push([id++]);
},
render() {
var cursor = Cursor.build(this);
return (
<div>
<p>There are {cursor.refine('list').value.length} items in the list.</p>
<button id="click-me" onClick={this.add}>Add one</button>
</div>
);
}
});
var App2 = React.createClass({
getInitialState() {
return this.props.data;
},
render() {
var cursor = Cursor.build(this);
return <ul>{cursor.value.list.map(x => <li>{x}</li>)}</ul>
}
});
React.render(<App1 data={data} />, document.getElementById('div1'));
React.render(<App2 data={data} />, document.getElementById('div2'));
document.getElementById('click-me').click(); // should App1 and App2 update synchronously?
For the path issue, I still don't get the reason. Revisit it later.
We need more time to think about these ideas.
After thinking, I agree with you that for the current implementation, cmp and path are necessary.
While, my second proposal does not need memorize any more. The memorize code is removed in the check-ins. So, you could take a look.
@lijunle, I am finally ready to tackle this issue. Many simplifications have landed in master over the last month, that may help with this issue. Here are the two main leaks: first leak, second.
refToHash can be trivially implemented with WeakMap. This will make the memory leak vastly smaller as we will be retaining one int per reference seen, not the entire react component.
Is there more we can do? The new code has opened new possibilities, I think.
Essentially, WeakMap gives us Ref => Cursor, what we need is (Ref, Ref) => Cursor
I think we could even do dumb things like a least-recently-used cache in the memoizers with some parameter over how big to make the cache. Tune the parameter based on how big the app is.
I noticed that there are huge changes these days. I am reviewing and understanding the current implementation for caching issue. Please let you know if I find something.
I am considering also adding a special Cursor type for server side rendering, which does not allow cursor updates (which are typically only wired up to user interaction callbacks). This special case only needs to memoize on value (not path) and is thus compatible with WeakMap. However this solution is not fully general as it puts burden on the application code to not use cursor updates in server rendering.
From StackOverflow:
if you create a two level weak-map (store weakmaps on weakmaps), whenever a obj on first level is gced, you loose the whole second level (when a is gced, you loose b). if b is gced, you will still have a weakmap for a, which will only be there while there is another (a, something). does this solve your problem or am i missing something? – Walter Macambira 2 mins
Sorry, @dustingetz I reviewed your changes, it is hard for me to understand the code and predict the behavior. Maybe after you freeze the 2.0 code, I learn the logic and evaluate the usability. Please open/close this issue on your decision.
I think we can unit test this with window.performance.memory
in chrome, which is already the test runner (thanks @danielmiladinov)
Great information! Although that API is Chrome only, it could automate the detection and make it accountable!
Requires thought to fix.
Specifically, every cursor instance ever created for each (react component, path, value) is forever retained so if we see the same (react component, path, value) we can return the same reference.