Closed mainej closed 2 years ago
Hey @mainej thanks for all these discussion and code!
I have read all your discussion with @ingesolvoll in the glimt project, and I think the problem statement is fair enough: the current implementation of the re-frame integration makes it difficult to reuse the same machine to manage multiple states stored in different locations of the app-db, even though they share exactly the same machine definition.
In some OOP analogy, it's just like instead of being able to create multiple objects of from the same class definition, the class has to be redefined for each object.
I noticed a mismatch between the version of malli used in project.clj compared to the one in deps.edn. I think that will cause problems with the next release of clj-statecharts. I'd be happy to help figure out how to make that more maintainable, but that belongs in a separate Issue.
I'm not sure what you see, but in both of the files the malli version is 0.6.1
I have think through this for quite some time (I read the whole thing yesterday, but only have time to reply to you guys today), and tbh I think the patch is overly complex. The thing that we want to fix is we have to create multiple machines and multiple re-frame event handlers for the same use case again and again. Because we basically binds one pair of event handlers (init + transition) to one path in app-db.
So what about instead of passing in a hard-coded path
in fsm.rf/integrate
, we do not provide the path
during the call to integrate
, and only later pass the path
as a param of the init and transition events? This takes basically the same approach as your current code, but is way more easier to apply to the current implementation IMHO.
I have drafted a rough prototype in this commit: https://github.com/lucywang000/clj-statecharts/commit/fc330a9096f46be25fda73120457b216cf00f370. This involves some changes to the init and transition event param signature, so need to think about how to keep the backward compatibility.
What do you think @mainej @ingesolvoll ?
@lucywang000 your analysis is correct, the problem arises from binding a machine to one path. Your draft implementation in fc330a9 looks like it would fix that problem, almost. But what about delayed transitions? The rf/dispatch
callback in the scheduler doesn't use the new path argument. It would dispatch the machine's transition-event, but with the wrong number of arguments.
When you register the machine, there's no path in scope. The path is specific to one state, so it can't live on the machine. That's why I changed clj-statecharts.impl/execute-internal-action
to allow the state to carry the scheduler, not just the machine. (And why I assoc the scheduler on the state as the state is initialized, because the path is in scope then.) Let me know if you see another way to do it.
Otherwise, looks great! I could definitely make your proposal work.
Thanks for looking at this!
You both seem to have a pretty good overview of the situation, so I'll just bring in the thing I care the most about.
It's pretty easy for a project or library to roll its own version of the re-frame integration if necessary. I already did that for kee-frame. The only thing that is critical in that sense is that the core library has the right building blocks available.
The one thing I missed when doing the integration with kee-frame was the ability to optionally clean up resources (event handlers, state) when the machine is no longer in use. That bit is particularly important when there's a very high number of machines in rotation.
I'm not sure if thousands of event handlers as opposed to 2 is a performance issue for an app, but at some scale maybe?
@ingesolvoll, that's a good point. If you generate lots of machines you can still run into problems with fc330a9.
As we've discussed before, technically you can unregister event handlers. So, with the original integration and with fc330a9 there is a way to release the memory held by a machine, by unregistering its handlers. But, it can cause errors as callbacks and delayed events dispatch to deleted event handlers.
To summarize:
sc.i.re-frame | fc330a9 | sc.i.re-frame-multi | |
---|---|---|---|
# of handlers | 2 * the number of states | 2 * the number of machines | 3 primary + 4 utility |
Memory used by handlers | size of a machine * number of states | size of a machine * number of machines | n/a |
Memory used by app-db | size of states | size of states | size of states + size of machines |
Machine memory can be released easily | ❌ | ❌ | ✔️ |
State memory can be released easily | ❌ | ✔️ | ✔️ |
But what about delayed transitions? The rf/dispatch callback in the scheduler doesn't use the new path argument. It would dispatch the machine's transition-event, but with the wrong number of arguments.
When you register the machine, there's no path in scope. The path is specific to one state, so it can't live on the machine. That's why I changed clj-statecharts.impl/execute-internal-action to allow the state to carry the scheduler, not just the machine. (And why I assoc the scheduler on the state as the state is initialized, because the path is in scope then.) Let me know if you see another way to do it.
That's a good point. We can fix it by allow passing the state when calling schedule, so the dispatch function could make use o it:
(deftype Scheduler [dispatch ids clock]
IScheduler
- (schedule [_ event delay]
- (let [id (clock/setTimeout clock #(dispatch event) delay)]
+ (schedule [_ event delay state]
+ (let [id (clock/setTimeout clock #(dispatch event state) delay)]
(swap! ids assoc event id)))
During fsm init, we simply store the path in app-db in the state, e.g. under the _rf-path
key. Now the re-frame scheduler could get the path information when scheduling the event:
(defn make-rf-scheduler [transition-event clock]
- (fsm.d/make-scheduler #(rf/dispatch [transition-event %]) clock))
+ (let [dispatch (fn [{:keys [_rf-path] :as state} event]
+ (rf/dispatch [transition-event [path event]]))]
+ (fsm.d/make-schedule dispatch clock)))
I think this way is better because we store simple data rather than a stateful scheduler object in the app-db.
I would not worry about memory bloat when there are "thousands of different machines thus thousands of re-frame handlers and machines" in the same web application. Even if such a huge application exists, it would already consume tons of memory and the extra overhead of clj-statecharts would be relatively negelectable.
Yeah, I think I follow @lucywang000. Seems like a reasonable approach. And I agree—storing a scheduler in the app-db, especially many schedulers, is kind of gross.
I guess I always worry about changing protocols. You never know who's implemented them. But I suppose IScheduler
isn't really a public interface.
But I suppose IScheduler isn't really a public interface.
Among all the not-so-many user of this lib, there is unlikely anyone who has used this hidden interface.
btw I have just cut a 0.1.1 release to update the version of malli. Thanks @mainej for pointing this out.
I did not study the code in these PRs thoroughly enough as I'm having a rather intense week, but from what you are saying:
Storing the scheduler or anything that's not pure data in the client db can be a big problem. I know the org I'm working in right now would not consider a library that does that.
But then again, as long as the re-frame integration is completely external to the machine code itself, that isn't really an issue.
allow passing the state when calling schedule
Thinking about it more... this is they key. If the state were passed when dispatching the scheduled function, then I (or someone else) could maintain the code in this PR as a a separate library. The existing re-frame integration could remain unchanged. You could modify it as described in fc330a9 if you wanted to, but a custom integration wouldn't need that. (Though, putting on my devil's advocate hat, fc330a9 would be a breaking change because all users of the integration would have to change their transition-event dispatches.)
It sounds like you already have that change to the scheduler written. But, if you'd like, I can submit a separate PR. Let me know what you prefer.
I've been playing with this, and it's not working. If two delayed transitions start at around the same time, the first one works, but the second one's transition gets lost. I think I've tracked down the reason why.
In the code below...
(deftype Scheduler [dispatch ids clock]
IScheduler
(schedule [_ state event delay]
(let [id (clock/setTimeout clock #(dispatch state event) delay)]
(swap! ids assoc event id)))
(unschedule [_ event]
(when-let [id (get @ids event)]
(clock/clearTimeout clock id)
(swap! ids dissoc event))))
There's a problem with saving the setTimeout
id keyed by the event
. If two states enter the same delayed transition at the same time, they will both use the same event
. So, the setTimeout
id of the first one is replaced by the id of the second one. When the first one's transition happens, it ends up calling (fsm.d/unschedule scheduler event)
. But, this does the wrong thing now: it cancels the second one's timeout by calling (clock/clearTimeout clock id)
.
I'm not sure how to fix that. Any ideas?
@mainej I think it would make sense to have 2 PRs. One for the changes needed to the core in order to support any kind of integration. And one for re-frame integration fixes.
@ingesolvoll agreed. I've opened https://github.com/lucywang000/clj-statecharts/pull/8 so we can discuss the core changes separately from the integration changes.
@lucywang000, as mentioned in #8, I've created https://github.com/mainej/clj-statecharts-re-frame. It uses the code in #8 and includes a failing test demonstrating the problem I described above.
Shall we move this conversation to #8?
If two states enter the same delayed transition at the same time, they will both use the same event
Well, we can simply fix that by using (_rf-path, event)
as the key into the Scheduler.ids
map.
Ofc this would require adding the state to the IScheduler.unschedule
method as well.
update: I see you already do similar things in #8, I'll comment there.
Closing this, since we agreed in #8 that any re-frame integrations that use that new functionality can be maintained externally.
This creates another
re-frame
integration that stores both machines and states inre-frame
's app-db. This lets a single machine transition many states and avoids memory leaks when there are hundreds or thousands of states.I'll summarize here, but for a much more detailed discussion of the problem see here and here. /cc @ingesolvoll
In the original integration,
statecharts.integrations.re-frame/integrate
adds event handlers that close over both amachine
and apath
. Thepath
refers to a single state that is stored within the app-db. The event handlers can manage that one state, but if you want many states you run into problems.In my case I'm using state machines via glimt to track the lifecycle of thousands of separate HTTP requests. Each request has its own state, and there can be many requests in-flight at once. To monitor the progress of each request, I have to create a different machine for each one. I pass those machines to
glimt
, which callsstatecharts.integrations.re-frame/integrate
for each one. But, since that creates new event handlers, and the handlers are stored inre-frame
's global handler registry, I end up leaking the memory associated with the handlers and their machines and states. HTTP requests are usually short lived, so I end up with lots of memory consumed for machines and states I no longer care about.The code in this PR takes a different approach. First, you register a machine, which gets saved in the app-db. Then, you initialize a state, saying where in the app-db you want it to be saved. Over time you can initialize many more states, all using the same machine. Later, you can transition a state by saying where it is in the app db, where its machine is, and what transition you want to call on it. I believe this approach is similar to the concept of "services"—it uses the app-db as a stateful container for immutable states and machines.
With these changes, it's easier to manage memory:
A few more notes on the implementation:
glimt
. I've been working with @ingesolvoll, the maintainer ofglimt
to define this approach. Also see this gist which demonstrates how the newglimt
code would be used.clj-statecharts.impl
, which allows the scheduler to be on either the state (new) or the machine (as before). This lets each state manage how it will be resumed. This should be a non-breaking change. Even if you don't want to maintain the code in this PR as a whole, would you consider merging this part? Then I could maintain this integration in a separate library.:_scheduler
). Please correct me if that's wrong.clj-statecharts
docs! They're so clear and detailed. ❤️ If you're interested in merging this PR, I'd be willing to write docs for the new integration. Of course, any guidance about how to write the docs or where to put them would be appreciated.malli
used inproject.clj
compared to the one indeps.edn
. I think that will cause problems with the next release ofclj-statecharts
. I'd be happy to help figure out how to make that more maintainable, but that belongs in a separate Issue.