pytransitions / transitions

A lightweight, object-oriented finite state machine implementation in Python with many extensions
MIT License
5.54k stars 525 forks source link

Asyncio support #259

Closed lromor closed 4 years ago

lromor commented 6 years ago

Hello, are there any news on the asyncio support?

thank you, -l

aleneum commented 6 years ago

Hello @lromor,

unfortunately making transitions run async callbacks requires some major refactoring. Otherwise the async version would more or less copy big chunks of code from the core machine which is a lot harder to maintain.

However, we planned to do this for a while. To check if we are on the right track I got some questions for you:

lromor commented 6 years ago

Hello @aleneum , first of all, I forgot to thank you for your work, it's a very well done project. I will be a bit verbose on this post, sorry for that.

So, we are working on a real machine and we wanted to have a more consistent way to define its behavior.

Our problem

(it's just to share our view, it may be important for you in order to understand if we match or not transitions goals):

One of the biggest problems introduced by the async (or also with safe multithread) paradigm is that for big projects (without an FSM) a lot of states variables are spread around the program pinning down the machine behavior becomes a difficult task. Similar problems are encountered for example while building async/event driven web interfaces. It can really become a mess when you start triggering a lot of changes at the same time and you don't really know what is the flow of your program anymore (see flux paradigm). So we decided to introduce an FSM that centralizes all of this logic making the program hopefully more consistent and ease the testing process.

Regarding your questions:

  1. Short answer: Before/on_enter after/on_exit is in my opinion the way to go. Long answer: I think that for our use case, actions should be mapped as states. We require the possibility to change an action if, for example, during "running" a stop event occurs. Transitions should always be immediate. I think that modeling actions with transitions may introduce some flaws that would lead us to merge the states logic to the transitions one (e.g. we will need to define if allow or do not allow to transition from a transition to another state). In general, in order to avoid to keep track of all the asynchronous tasks spawned by a state on_enter/on_exit a better approach would be to allow an optional state's task. I can send you a complete example of what I mean.

  2. I think that the async loop should be managed externally. Mostly because there's no real reason to force this requirement.

What do you think?

-l

aleneum commented 6 years ago

Hello @lromor,

thank you for the detailed explanation.

I think that the async loop should be managed externally.

I assume this is a requirement for most use cases. In case this is not required one 'quick fix' things by 'synchronising' async calls with the help of run_until_complete for each callback indivually. This could be implemented easily and should work for case in which the only problem is an async API but the actual concept does not require async processing. I still consider that 'cheating' and also probably bad design for the majority of use cases.

Before/on_enter after/on_exit is in my opinion the way to go.

Okay thanks. Feedback shows that it is okay to do the event processing itself synchronously. It is usually about the callbacks which makes sense.

actions should be mapped as states

While it is true that transitions callback structure is designed around transitions it does not prevent designing FSMs around states as actions. Just make sure to use queued=True if you initialise machines and plan to have trigger calls within on_enter callbacks. This way every callback is processed sequentially. You just have to consider that a callback should return when an event has been triggered.

It's funny that your example image actually refers to opening and closing procedures since this is what we use transitions for. transitions controls a mechanical prototype (a smart door) by sending and receiving PubSub events. We use RSB and a rsb extension for transition. I can really recommend this message based approach since it makes the resulting architecture very flexible and follows the guide line to not let the state machine do the heavy processing but just maintain system states.

lromor commented 6 years ago

Hello @aleneum , I wanted to ask you if you are already working on this, if you need someone to test or if you need patches.

-l

aleneum commented 6 years ago

Hello @lromor,

there hasn't been any progress on that front so far. The major issue is to find a solution which does not end up to be a re-implementation of major parts of transitions's core classes. @ivanteresh went ahead and implemented an async machine which you can find here. From a brief look I would say this is pretty close to what you can get inheritance-wise with the current core architecture. I have not tested it though but It's definitely worth having a look at.

lromor commented 6 years ago

@aleneum

I think that is a very nice start. I was going to do something similar but didn't know which methods were to be overloaded.

To fully use it I'm gonna use some StateHandlers that automatically register a state task on enter while on_exit I can cancel that task and await for it to finish. In this way I can avoid rogue tasks spawning around.

Thank you @aleneum and @ivanteresh .

-l

lromor commented 6 years ago

Hello @aleneum I wanted to update you about what we did, just in case someone wanted the to achieve the same result.

In this sense, I wanted a nested asynchronous machine, as you pointed out, is not so easy to do that via mixins. So I had to override a bunch of functions similarly to what @ivanteresh did. We are happy about the result, we also added an async lock for the transitions in order to avoid callbacks race conditions, etc.. so we keep the promise that transitions must be immediate but we introduced the concept of a state handler which is spawned and killed on_enter and on_exit.

We also started adding some tests.

-l

idf commented 6 years ago

@aleneum I saw your code and it works reasonably good. Can you upgrade it use async/await syntax newly introduced in python?

aleneum commented 6 years ago

Hello @idf,

could you clarify which part of code you are referring to? About await syntax: As we strive for Python 2 and 3 compatibility, some newer syntax features cannot be used. However, in case asynchronous callback is integrated it should not make any difference for users of transitions.

idf commented 6 years ago

@aleneum I am referring to code in https://github.com/ivanteresh/transitions/blob/async_machine/transitions/extensions/asynchronous.py .

I see you are striving for 2 3 compatibility. Let's not use the new syntax.

I actually try out the async code, it works after some debugging. What's the plan of incorporating it into master branch?

ivanteresh commented 6 years ago

@idf Thx for testing. I also test my solution in my project, and it works good for me. But finally, i think it is not enough good to be the part of main library. It contains a lot of copy-past and rewritten core classes and methods, which will be hard to maintain in the future, in case if any code in core will be changed. Also, AsyncMachine can't take part in inheritance with other extension classes, as NestedMachine, GraphMachine, LockedMachine, etc.

I think, for now i keep this code in my fork brunch, until I find a better solution, which is really good for inclusion in the main branch. Unfortunately, now I do not have free time to do this.

ivanteresh commented 6 years ago

I think we should wait and take a look wait for @lromor solution. As he said, his solution looks more interesting.

paulbovbel commented 5 years ago

@aleneum, python2 goes EOL at the end of this year. Would you consider supporting async/await syntax?

tyarkoni commented 5 years ago

+1 for dropping Python 2 support in general. We could prominently indicate the last version compatible with Python 2 in the README, but if it's hindering development at this point, seems reasonable to drop 2 going forward.

aleneum commented 5 years ago

@paulbovbel:

@aleneum, python2 goes EOL at the end of this year. Would you consider supporting async/await syntax?

I already implemented asynchronous callback processing but never managed to publish that branch. I just did in dev-async. The build job fails cause -- surprise surprise -- the Python 2.7 tests fail. My plan is to get 0.7 out there and support async starting from 0.8. Both versions will be released more or less at the same time but 0.8 will only be supported for Python 3. This should be done end of this month or the beginning of next month.

ablakey commented 5 years ago

@aleneum any specific tasks I could dive into to help your async/await support efforts? Writing/fixing tests or whatnot.

ablakey commented 5 years ago

I've been digging into dev-async branch. I'm not sure the passing tests (Python 3.7) are valid. They seem to miss this asyncio warning:

python3.7/site-packages/transitions/core.py:121: RuntimeWarning: coroutine 'AsyncMachine.callback' was never awaited
  event_data.machine.callback(handle, event_data)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

Which when you look at core:121 you see that AsyncMachine.callback is not being awaited. This means we're not invoking the coroutine that's "returned" by the async function callback. Either we await event_data.machine.callback(...) inside an async function or we put it on an event loop.

This gets messy because it looks like while you were trying to hide all signs of asyncio support inside the extension, there will likely need to be conditions within core.py for if callbacks are coroutines.

I'm going to think about this and play with it a little.

ablakey commented 5 years ago

So I've got this fork (https://github.com/pytransitions/transitions/compare/dev-async...ablakey:dev-async), which "fixes" the problem. But I now have to think about if this is actually what we want.

What I think this means is that every callback is done synchronously in order (which is how the non-asyncio transitions works).Any callbacks that are async functions (coroutines), are run on the event_loop and block progression of the state machine until each finishes, in order.

This might be what we want. The state machine continues to function synchronously, but now you can do awaitable tasks within your callbacks and not block other parts of your program (like I/O).

I'm not an FSM expert. The main blurry area in my mind is the question, "do we want concurrency among the state machine callbacks?" (this would look like an asyncio.gather(*all_my_callbacks))

aleneum commented 5 years ago

Hi @ablakey,

I cannot reproduce these warnings. The Travis jobs also report no warnings (Python 3.7 is not tested though). I'd expect that a not awaited callback would actually cause invalid mock states. Have you tried to execute the async tests locally?:

alneuman:~/Workspace/transitions [dev-async|…2⚑ 6] $ tox -e py37
GLOB sdist-make: /Users/alneuman/Workspace/transitions/setup.py
py37 inst-nodeps: /Users/alneuman/Workspace/transitions/.tox/.tmp/package/1/transitions-0.6.10.zip
py37 installed: dill==0.2.9,mock==2.0.0,nose==1.3.7,pbr==5.1.3,pycodestyle==2.5.0,pygraphviz==1.5,six==1.12.0,transitions==0.6.10
py37 run-test-pre: PYTHONHASHSEED='3368391890'
py37 runtests: commands[0] | nosetests
..............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
----------------------------------------------------------------------
Ran 1102 tests in 21.404s

`OK`

Could you also run this code:

import asyncio
import time

from transitions.extensions.asyncio import AsyncMachine

ASYNC_DELAY = 2

async def await_false():
    await asyncio.sleep(ASYNC_DELAY)
    return False

async def await_true():
    await asyncio.sleep(ASYNC_DELAY)
    return True

before = time.time()
machine = AsyncMachine(states=['A', 'B', 'C'], transitions=[['go', 'A', 'B']], initial='A')
machine.add_transition('proceed', 'A', 'C', conditions=await_true, unless=await_false)
machine.proceed()
exec_time = time.time() - before

print("Time passed (s): ", exec_time)
assert machine.state == 'C'
assert ASYNC_DELAY < exec_time < 2 * ASYNC_DELAY

My output:

/usr/local/bin/python3 /Users/alneuman/Workspace/transitions/test.py
Time passed (s):  2.0017831325531006

Process finished with exit code 0
ablakey commented 5 years ago

Yeah I get the same as you. The tests pass and your snippet works. But this doesn't align with my understanding of how coroutines work in Python. I'm going to dig further and hopefully either find why the tests pass wrongly or that my understanding is wrong and needs to be adjusted.

aleneum commented 5 years ago

I can't say for certain but I'd assume your initial error has been caused by an older already installed version of transition. Unfortunately, the line of event_data.machine.callback(handle, event_data) hasn't changed since the last release so I cannot say for sure. In the current master/released version callbacks aren't awaited but just executed and the overridden methods AsyncMachine._before_state_change and such are never called.

there will likely need to be conditions within core.py for if callbacks are coroutines.

This is done in AsyncMachine.callback. You can mix async and sync callbacks as you like:

async def await_false():
    await asyncio.sleep(ASYNC_DELAY)
    return False

def sync_stuff():
    time.sleep(ASYNC_DELAY)

async def async_stuff():
    await asyncio.sleep(ASYNC_DELAY)

before = time.time()
machine = AsyncMachine(states=['A', 'B', 'C'], transitions=[['go', 'A', 'B']], initial='A')
machine.add_transition('proceed', 'A', 'C', unless=await_false, after=[async_stuff, sync_stuff])
machine.proceed()

The only case where awaitable functions are currently required are conditions. This requirement can be relaxed though.

ablakey commented 5 years ago

Okay I think I've got it figured out.

I see that your example and test builds the machine differently. I'm building the machine this way from a yaml file:

   states = [State(
        name=state,
        on_enter=value.get('on_enter', None) if value else None,
        on_exit=value.get('on_exit', None) if value else None,
        ignore_invalid_triggers=True,
    ) for state, value in config['states'].items()]

    machine = AsyncMachine(
        states=states,
        initial=config['initial'],
        before_state_change='before_state_change',
        after_state_change='after_state_change',
    )

So I think instead of my callbacks being invoked by the overridden functions inside asyncio.py/AsyncTransition, they're invoked by core.py/State which aren't async friendly.

Regarding the previous comment. Yes, you're right. That can all be encapsulated within asyncio.py if we reimplement/override the other 4 locations that call machine.callback with the gather pattern you use.

aleneum commented 5 years ago

they're invoked by core.py/State which aren't async friendly.

Uh, I see. Yeah, that's clearly a bug. Your pattern should definitely work.

ablakey commented 5 years ago

What do you think makes the most sense?

I see 7 locations in core.py that call machine.callback. I think that 3 are overridden within asyncio.py/AsyncTransition leaving 4 that I imagine, in some manner, can be reached. Do we just override the rest as well with classes like AsyncState ?

aleneum commented 5 years ago

To prevent more inheritance madness it might be better to introduce Machine.callbacks or such alike which can handle a list of callbacks. Machine can implement this method with just the standard iterator/loop and AsynchMachine can run asyncio.gather. This way we might also get rid of the other overrides except condition handling which expects a return value. I opened a dedicated issue (#341) for that.

paulbovbel commented 5 years ago

So, just a small update from how we ended up integrating transitions with asyncio. The current pattern in https://github.com/pytransitions/transitions/blob/dev-async/transitions/extensions/asyncio.py doesn't quite work for us.

So we've ended up with something like:

    def async_event(func):
        @wraps(func)
        def wrapper(self, *args, **kwargs):
            self._task_monitor.create_task(func(self, *args, **kwargs))
        return wrapper

    @async_event
    async def on_enter_some_state(self, event):
        await asyncio.gather(
            do_something_async,
            asyncio.shield(do_something_async_that_wont_get_cancelled_on_transition)
        )

Where self._task_monitor is something that belongs to the AsyncMachine and behaves like a trio.Nursery or an ayo.scope:

aleneum commented 4 years ago

Hi @paulbovbel,

thank you for the feedback. Could you boil your requirements down to some test cases? This way it might be easier to figure out how versatile async support could look like.

aleneum commented 4 years ago

I just updated (force pushed) dev-async with a different approach. Loop handling has been removed. Every Future/Coroutine is awaited which means that Event.trigger now returns a Future and must be awaited as well. I had to make use of contextvar (Python 3.7+) to manage task cancelling in Transition.execute. Without context information the task may cancel its parent task. Have a look at test_async to get an idea of how AsyncMachine can be used. Feedback welcome!

chrisjbremner commented 4 years ago

@aleneum I tested out your branch and it works great for my purposes, thanks for putting that together! What are the next steps for getting this merged and cutting a new release?

nurettin commented 4 years ago

Hello, this branch has been working well for me.

I found one very important parameter "queued" when I set it to true, new events aren't executed during transitions. I think this should be default or at least clarified in docs. Once it is set to true, the machine becomes a powerful abstraction.

I handle timeouts by creating async tasks with sleep inside, so we don't even need an async Timeout mixin.

All in all, asyncmachine is working great for me and hope it gets merged soon.

Edit: one last thing, I think ignored events don't need to be warnings, or warn on ignored event could be optional, because sometimes you can't control what the outside world is going to send.

Edit2: I made a large-ish machine with more than 10 states, some of them with on_enter, and more than 30 transitions before, after, conditions and unless and lots of timers/timeout events.

then I made 400 copies run simultaneously and there were no problems. Async state machines should be illegal.

aleneum commented 4 years ago

@nurettin,

thank you very much for your insights. async will be part of the next major release 0.8. The addition of Enums (especially Enums with strings) created some issues for the markup machine and graphing. When this is sorted out, 0.8 will be released. Update: I would love to squeeze the rework of HSMs in as well to finally support parallel states though...

I found one very important parameter "queued" when I set it to true, new events aren't executed during transitions. I think this should be default or at least clarified in docs.

Currently the docs state:

The default behaviour in Transitions is to process events instantly. This means events within an on_enter method will be processed before callbacks bound to after are called.

Do you have suggestions about how to make this behaviour more obvious in the doc?

From my experience, many users expect transitions to happen instantly. This is why we made queued transitions optional.

nurettin commented 4 years ago

@aleneum Thanks for the response I also noticed that in order to have only one state change at a time per async state machine, I had acquire an async lock in the prepare event, and release it at the finalize event. Don't know if this is of any use, just noting it here.

aleneum commented 4 years ago

@nurettin: Good point. Right now, this is expected behaviour. A new event can cancel a currently processed event and all related tasks. When queued=True, a lock is probably the way to go to synchronize event processing. For nested events things become more complicated. Whether an event should attempt to acquire a lock (or enter a locked context) depends on the state of is_subtask. Otherwise a nested even trigger might deadlock since its parent is still holding the lock.

nurettin commented 4 years ago

@aleneum good heads up, thanks, will keep that in mind if I come across nested state machines. I try to keep designs as shallow as possible for now.

aleneum commented 4 years ago

@nurettin: with nested events I mean 'events triggered inside other events'. Wording wasn't optimal, sorry.

aleneum commented 4 years ago

at last, transitions 0.8.0 includes async support for Machine and GraphMachine. HierarchicalMachine is currently not supported though.