Open MicahLyle opened 3 years ago
Addresses https://github.com/celery/ceps/issues/31
Second draft. I have not filled in the specification yet. That I'm planning to do next, and getting feedback on the higher-level direction of the CEP has been really helpful so far. This one aimed to reduce the scope a bit, and I temporarily have stored the previous draft in a different document for comparison and so that I could pull things back in as needed. I wanted to see a more simplified CEP, both in response to @thedrow's suggestion, but also to see how it would feel overall.
I'm really happy about the Rationale section. I feel like it was important to review the current landscape and give clear reasons as to why we'd roll our own library than use something else out there.
Also, last thought for now is that I have some formatting issues with RST/Sphinx that will need to be resolved at some point. I didn't find a quick solution to a couple, so I'll need to get more familiar with that, but I was running out of time for the day. I felt like getting another draft out was more important than fixing those for now, so ideally I'd try to get those resolved once I have a more polished/final draft out.
One other thought. There may be sections from the previous draft to merge back into the current draft and/or future drafts. However, I wanted to remove them for this second draft to see what it might look like with a trimmed down scope. I liked @thedrow's comment about reducing to scope to async/await
and supporting a future version of Celery that could interact with async/await
on a number of levels, so this draft explores a reduced scope in that regard. As mentioned in the comments on the changed files from a little bit ago, there are some sections I think I'd like to see included somewhere, whether this CEP or a future CEP.
@thedrow at this point, I'd say that I have an "initial draft", complete from the perspective of taking the material you had across the Jumpstarter issues (and code), trying to scope it a bit, and getting something out there. I haven't tried to come up with anything on my own at this point (minus the example at the end of the specification), but rather just tried to solidify and summarize what you've already done.
The Specification feels bare/basic and mostly copy pasted (with some edits and changes as I felt would be helpful) from the various work you had around the issues. I know they were rough guidelines, but there are certain things I don't fully understand, so it was hard to write about them besides just trying to put down what you already had and clarify things a bit.
Now that I've had plenty of time to really dig into your specifications, dig into the transitions
library, and do some other research on actors, etc., here are some unresolved questions I have and general thoughts:
continuous
tasks per-say. It's a little hard to reason about the idea of a task just repeatedly executing until the state is executed, unless, I'm misunderstanding something. In your producers/consumers spec, does returning Success
indicate to some parent actor/executor that the task should be complete, causing it to trigger, say a stop()
transition? Meaning, if you don't wrap in Success
, you're effectively just "producing" values (continuously) until you see Success
?transitions
is a very novel approach, and I like the idea of modeling everything after state and transitions (after all, Hewitt and others mention actors as just having simple local state usually, and if everything has state, using an actual state machine abstraction seems to fit really nicely). That being said, grokking the transitions
code, subtleties, and other things was not straightforward or trivial. Additionally, even grokking the Jumpstarter code and tests, as barebones and simple as they are, took a bit of time in terms of wrapping my head around what context managers were being entered (and exited when), which transitions were triggering which things, etc. My concern here is that if/when a public API is introduced, I think, unless someone wants to heavily customize it, it would be important to keep things simple and provide some guarantees.transitions
queued=True
mode makes sense, as you've used in in the draft. That being said, the way that it hides exceptions that can occur, especially in a system where we're relying on context managers to manage certain sections of code and be entered and exited at proper times, makes me a little nervous. Additionally, I feel like some of the API and boundaries of transitions
, especially when it gets to the hierarchical state machines and parallel state machines, are rough around the edges and could lead to some hard to debug edge cases and error handling, especially around timeouts, proper context manager cleanup and resource handling, and other things like that. The API is still at a state where I feel like it is hard to understand and use properly, and I would like Jumpstarter to be something that is easy to get started with (like trio
for example took a complex thing like async/await
and made it very simple to use and understand, while also exposing more complex APIs that you could reach for when needed). I like the extension flexibility, but the complexity does seem quite high at the moment.I may add some other questions and comments if I think of more, but I'd like to have a discussion around these things at least.
At the time of writing (2021-05-09), very WIP initial draft. I only filled out an abstract which I think is probably too long, but I wanted to get my thoughts out there and hopefully at least start discussing things.
The motivation I also started to fill out. I'm not entirely sure where I want to go with that exactly yet, but wanted to get things going. I think it's important to me to show how the actor model and actor system can be a really great foundation for next-gen Celery, and also clearly explain why it's a win vs. other trade-offs.
I'm not familiar enough, at this point, with the internal details of a lot of the Celery components enough, in my opinion, to make authoritative claims about certain things, etc., but I do really think that composable tasks and messages via an actor system is the way to go, and I want to argue that case well, etc.