vinum-team / Vinum

A modern reactive state management library for correctness and speed.

Decrease Memory Consumption #36

Closed xiyler closed 6 months ago

xiyler commented 6 months ago

It's important to work on ways to decrease memory consumption by Vinum.

Some important suggestions could be:

This issue is blocking.

xiyler commented 6 months ago

While the second suggestion (the one relating to SEDMM) is a general optimization (we decrease the memory footprint of every struct creation), the other two are mostly use-case-dependent optimizations, and could be implemented in a variety of ways.


Speaking specifically about disabling value caching by state structs: this could be implemented by adding a Flow state struct that only pushes values down to its dependents, so that whenever one of its dependencies updates, the Flow struct sends the value down to its dependents, but releases the value after the update process ends.

Example:

local x = Value(1)
local x_power2 = Flow(function(node)
    return Use(node, x) ^ 2
end)

On(x_power2, function()
    -- since On's listeners are run during an update, x_power2
    -- still holds the computed value; in case it doesn't, it
    -- will recompute the value, which is fine since the
    -- operation is mostly free
    some_instance.some_property = Read(x_power2)
end)

While this works well (the original condition of not holding values is met), it could be painfully hard to decide when one should use Flow versus Compute, and it ultimately forces you to always be mentally aware of this distinction, which isn't ideal or practical.

However, making Computes smarter about releasing their values when some condition is met comes with its own challenge: mainly, deciding when a process is "fast enough" and the produced value is "lightweight enough". Flows moved this responsibility from Vinum to the developer, which, as previously mentioned, isn't practical.

But we could change the condition (or the heuristic, if you want to call it that) so that when only Listener dependents (internal objects created by the On operator) are present, we always release the value, as it'll only be used in side effects and not in calculations.
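
As a rough illustration, the check could look like this (a sketch only; node.dependents and the "iListener" kind tag are hypothetical internals, not Vinum's actual field names):

local function onlyListenerDependents(node)
    for _, dependent in ipairs(node.dependents) do
        if dependent.kind ~= "iListener" then
            return false -- a calculating dependent still needs the value
        end
    end
    return true -- only side-effect listeners remain
end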

xiyler commented 6 months ago

It's important to note that the only-listeners condition relates to how Flow functions: while a Flow is updating, its value is present, but it is released once its update process finishes. So whenever any other independent update process is launched, the dependents will force the Flow to recalculate, which isn't practical either (it would practically be functioning as a Compute, just with added costs).
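
For instance (reusing x and x_power2 from the earlier snippet; a deliberately pathological setup to illustrate the cost):

local y = Value(0)
local z = Compute(function(node)
    -- an update of y launches an update process that x_power2 took
    -- no part in, so x_power2's value has already been released and
    -- this Use forces it to recompute
    return Use(node, x_power2) + Use(node, y)
end)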

xiyler commented 6 months ago

The decision for the previous issue has been finalized: extend Computes' functionality so that they act like a Flow when their dependents are only Listeners, releasing their value when the update process is done.
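
A minimal sketch of what that extension could look like, reusing the onlyListenerDependents check from above (_finishUpdate and _cachedValue are hypothetical internal names, not Vinum's actual ones):

function Compute:_finishUpdate()
    -- listeners have already been run by this point, so if nothing
    -- else depends on this Compute, the cached value has served its
    -- purpose and can be released, Flow-style
    if onlyListenerDependents(self) then
        self._cachedValue = nil
    end
end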

xiyler commented 6 months ago

There is an important aspect of heuristic/conditional value caching that should be carefully understood. #37 implemented a relatively simple and semi-generalized heuristic (it doesn't work for Values, and shouldn't anyway): if there are only iListener dependents (internal objects created by On), it's safe to assume we shouldn't hold the value after the update process ends.

This created three cases:

  1. First case: the object has other dependent types (Computes, Maps, etc.) in addition to the iListener dependents. In this case, the value is cached.
  2. Second case: the object only has iListener dependents, so the value is dismissed after the update process. However, the cost of reading is the same as in the first case, under the condition that Read is called during the update process and not after it has finished (reading after the update can happen in On's listener function if you use task.defer or task.spawn and yield).
  3. Third case: the object doesn't have any dependents at all. This doesn't happen often, but when it does, it's usually in perf-critical code where the object is read by loops (RenderStepped, etc.). The cost of recomputing can be avoided by marking such dependent-less objects as "unreleasable", so that they're cached normally (see the sketch after this list).
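
Put together, the release decision after an update could look roughly like this (a sketch with hypothetical internal names):

local function shouldReleaseAfterUpdate(node)
    if #node.dependents == 0 then
        -- case 3: no dependents; keep the object "unreleasable" so
        -- loop reads (RenderStepped, etc.) keep hitting the cache
        return false
    end
    for _, dependent in ipairs(node.dependents) do
        if dependent.kind ~= "iListener" then
            -- case 1: a Compute/Map dependent needs the cached value
            return false
        end
    end
    -- case 2: only iListeners, so the value can be dismissed once
    -- the update process ends
    return true
end
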
xiyler commented 6 months ago

This is in. The conditional value caching optimization should only work with "Flow-like" objects that only care about sending values to iListeners. As such, we shouldn't care about recomputing values, and should perhaps only error/warn when the value has already been dismissed at the point of reading (according to the previous three cases, it's impossible to read while the value is released unless you yield, at which point you should cache the value yourself in the function body).
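
To make the yielding pitfall concrete (reusing x_power2 and some_instance from the earlier snippet, and assuming the error/warn behavior described above):

On(x_power2, function()
    -- fine: we're inside the update process, so the value is held
    local v = Read(x_power2)
    task.defer(function()
        -- by now the update has finished and x_power2's value has
        -- been released; Read(x_power2) here would error/warn, so
        -- use the value cached above instead
        some_instance.some_property = v
    end)
end)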

xiyler commented 6 months ago

An important question to ask is: should we cancel all CTasks of an object when it's marked as releasable? Technically speaking, InnerScopes overcomplicate this issue, since while we have "dismissed" the value, the object's RNode is still holding it in its cancelableTasks field.

An equally important question is: should we actually cancel all types of CTasks, or only the ones constructed by the InnerScope operator? Consider the following:

local y = scope:Compute(function(node)
    -- schedule an asynchronous write that lands two seconds later
    Asynk(node, function()
        task.wait(2)
        asynkWrite(node, 10)
    end)
    return 0
end)

On(y, function()
    -- y's only dependent is this iListener
end)

Technically speaking, this tree will be marked as releasable, and y will dismiss its value and cancel the Asynk operation as well, which is not what we want. As such, it's important to mark CTasks created by InnerScope with some sort of tag, so that we can identify which CTasks to cancel in the event of dismissing a value.
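
A rough sketch of the tagging idea (all names here are hypothetical, including the shape of the cancelableTasks entries):

-- remember which operator registered each CTask
local function addCTask(node, ctask, origin)
    table.insert(node.cancelableTasks, { ctask = ctask, origin = origin })
end

-- when a value is dismissed, cancel only InnerScope-created CTasks;
-- Asynk tasks (like the asynkWrite above) are left untouched
local function cancelInnerScopeCTasks(node)
    for i = #node.cancelableTasks, 1, -1 do
        local entry = node.cancelableTasks[i]
        if entry.origin == "InnerScope" then
            entry.ctask:cancel()
            table.remove(node.cancelableTasks, i)
        end
    end
end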

xiyler commented 6 months ago

Update: this is in.

Not sure if I want to pursue the last one at the moment, but as a preparation step, I'll work on renaming objects and types, and after that I'll do a little memory consumption comparison between master and pr-decrease-memory.

After that, #37 is ready to be merged.