Open sarah-seidman opened 4 days ago
Hi @sarah-seidman , how are you?
Thanks for reporting this. There's a huge diff in code since I wrote that statement into the docs. And if it's still valid, should be only for non-RTC mode (don't uses a queue internally), that is not the default.
I think that this kind of construct should be supported by the library. But I think that there are few users or not long running provesses, so the problem didn't come up until now. This behavior has low automated test coverage, we can also improve on this aspect.
Can you please provide a snippet that reproduces the error so I can quickly add to the tests and jump in?
Hi @fgmacedo, thanks for your reply! Here is an example that reproduces the problem.
import tracemalloc
import time
from statemachine import StateMachine, State
tracemalloc.start()
class MyStateMachine(StateMachine):
startup = State(initial=True)
test = State()
before_state = after_state = None
do_startup = (
startup.to(test, after="do_test")
)
do_test = (
test.to(startup, after="do_startup")
)
def on_enter_state(self):
self.before_state = tracemalloc.take_snapshot()
def on_exit_state(self):
time.sleep(1)
self.after_state = tracemalloc.take_snapshot()
diff = self.after_state.compare_to(self.before_state, "lineno")
for line in diff[:10]:
if "event" in str(line):
print(line)
sm = MyStateMachine()
sm.do_startup()
And some sample output running for a few seconds. This is similar to the output I observed from tracing memory usage in my actual code.
sarah@ubuntu:~$ python3 test.py
/home/sarah/.local/lib/python3.8/site-packages/statemachine/events.py:31: size=840 B (+840 B), count=2 (+2), average=420 B
/home/sarah/.local/lib/python3.8/site-packages/statemachine/event.py:34: size=536 B (+536 B), count=2 (+2), average=268 B
/home/sarah/.local/lib/python3.8/site-packages/statemachine/event.py:34: size=1464 B (+928 B), count=6 (+4), average=244 B
/home/sarah/.local/lib/python3.8/site-packages/statemachine/event_data.py:69: size=720 B (+592 B), count=4 (+2), average=180 B
/home/sarah/.local/lib/python3.8/site-packages/statemachine/event_data.py:75: size=0 B (-592 B), count=0 (-2)
/home/sarah/.local/lib/python3.8/site-packages/statemachine/event.py:34: size=1872 B (+408 B), count=9 (+3), average=208 B
/home/sarah/.local/lib/python3.8/site-packages/statemachine/event.py:34: size=2232 B (+408 B), count=11 (+3), average=203 B
/home/sarah/.local/lib/python3.8/site-packages/statemachine/event.py:34: size=2592 B (+408 B), count=13 (+3), average=199 B
/home/sarah/.local/lib/python3.8/site-packages/statemachine/event.py:34: size=2952 B (+408 B), count=15 (+3), average=197 B
/home/sarah/.local/lib/python3.8/site-packages/statemachine/event.py:34: size=3264 B (+360 B), count=16 (+2), average=204 B
/home/sarah/.local/lib/python3.8/site-packages/statemachine/event.py:34: size=3672 B (+408 B), count=19 (+3), average=193 B
/home/sarah/.local/lib/python3.8/site-packages/statemachine/event.py:34: size=4032 B (+408 B), count=21 (+3), average=192 B
/home/sarah/.local/lib/python3.8/site-packages/statemachine/event.py:34: size=4344 B (+360 B), count=22 (+2), average=197 B
/home/sarah/.local/lib/python3.8/site-packages/statemachine/event.py:34: size=4752 B (+408 B), count=25 (+3), average=190 B
/home/sarah/.local/lib/python3.8/site-packages/statemachine/event.py:34: size=5112 B (+408 B), count=27 (+3), average=189 B
For the time being, I found a workaround to be able to run my code without too much modification. I removed most of the "afters" from my state transitions and added this code to main. Not the most elegant solution, maybe, but it is working :)
while True:
for event in state_machine.events:
try:
state_machine.send(event.name)
except statemachine.exceptions.TransitionNotAllowed:
pass
Hi @sarah-seidman, just an update here... I was able to reproduce the issue.
I'm researching and trying some approaches but still working on this. It's a memory leak of some objects.
Here's a report of guppy3, so the lib on the "recursion" leaks ~3656 bytes per loop:
Partition of a set of 20 objects. Total size = 3656 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 7 35 1696 46 1696 46 dict (no owner)
1 7 35 1496 41 3192 87 types.FrameType
2 1 5 136 4 3328 91 dict of statemachine.event_data.EventData
3 1 5 112 3 3440 94 dict of statemachine.event_data.TriggerData
4 1 5 64 2 3504 96 inspect.BoundArguments
5 1 5 56 2 3560 97 list
6 1 5 48 1 3608 99 statemachine.event_data.EventData
7 1 5 48 1 3656 100 statemachine.event_data.TriggerData
Description
I've created a state machine to control an application that runs on an embedded Linux system and advances between states based on time or input from external sensors. It starts in a "startup" state, checks a few conditions, and if all is well, asks the user to complete a test. Based on the result of that action, we either ask for it to be repeated or advance to a different stage, where we wait for something else to happen, etc. This is a great package and has substantially simplified our logic. However, I am having some issues.
This is an excerpt from my code.
The code works as-is. That is, it takes the user through all the states as desired. However, my team recently noticed that after about an hour of running the state machine, our embedded device crashes. It seems that the state machine is leaking memory. I tested this by running
top
and watching the memory used by this process steadily increase over an hour. I also used tracemalloc to check what objects are consuming so much memory and saw that it was from line 34 ofevents.py
in this package. It seems like each time there's a recursion in my code, it adds another Event to the internal queue.I noticed today that there's a "hint" section in your documentation about recursion, exactly like what's in my code. However, I never saw a RecursionError (although, maybe one did happen and was written to the journal, but I couldn't read it since the device crashed...). The code runs fine, until (I assume based on testing), the device runs out of memory and crashes.
If not by recursing like this, how should I deal with this case in my code? In the above example, I want to stay in the "breath_test" state until one of the exit conditions is true. I changed my code to this in order to avoid recursion
However, when the logic cascades to the internal transition on the last line, the program just exits.
I see that in the code examples you provide, the state machines generally advance to the next state because an event is called by code external to the state machine. Is that the only correct way to use this library? I run the state machine by instantiating the object and calling the start function. Using recursion (incorrectly, I guess?), the state machine executes indefinitely until the final state is entered.
It is definitely desirable to have the state machine control itself once it starts, and trigger all its own events. Obviously, this causing some problems as I have currently implemented it. Maybe it's not really possible to use the library this way? Is there any way to implement this kind of logic without dangerous recursion?