StackStorm / st2

StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html
https://stackstorm.com/
Apache License 2.0
6.03k stars 745 forks source link

Eventlet retirement #6128

Open nzlosh opened 7 months ago

nzlosh commented 7 months ago

SUMMARY

As reported by @cognifloyd in slack: the eventlet team has announced their intention to retire the project. https://github.com/eventlet/eventlet?tab=readme-ov-file#warning This will have significant implications for the StackStorm project as the code base is architected and designed around eventlet and the synchronous programming model.

guzzijones commented 7 months ago

gevent has monkey patching.

cognifloyd commented 6 months ago

I prefer using standardized asyncio.

I consider monkey patching to be an anti-pattern as it makes debugging very difficult (was it actually monkey patched? Did something reload the code effectively undoing the monkey patch?).

Another reason to use asyncio: Once we implement https://github.com/StackStorm/community/issues/8 the chatops layer will use asyncio since OpsDroid uses asyncio.

I think the way to migrate to asyncio is to migrate one service at a time. That might mean we have a separate async-compatible st2common-compatible library, slowly migrating the dependencies of each service from the sync to the async lib.

Also, the eventlet folks are working on tools to allow asyncio and eventlet to live in the same process, thus smoothing out the migration path. Depending on how that goes, and when it completes, that might also be a big help:

As far as gevent: That will probably be a shorter leap, as support is already partially stubbed in. Any direct imports of eventlet should migrate to using this file instead (if possible): https://github.com/StackStorm/st2/blob/master/st2common/st2common/util/concurrency.py And the monkey_patch code would need to be updated to support gevent: https://github.com/StackStorm/st2/blob/master/st2common/st2common/util/monkey_patch.py

guzzijones commented 6 months ago

I also dislike monkey patching and would prefer a service by service migration to asyncio.

Robxxt commented 1 month ago

Hi there, I would like to help to do the migration. How should we start?

nzlosh commented 1 month ago

@ZoeLeah @guzzijones You've both expressed interest in working on this issue. I recommend organising a meeting between the people who have expressed interest to work on this task and formulate an execution plan. It'd be good to document the exit strategy, the required steps and the overall changes required to move off eventlets.

The question is still open about the optimal exit strategy, given the lack of resources the StackStorm project is currently suffering from, would it be a more optimal path to switch to gevent (to switch to an actively maintained asynchronous framework) and then migrate service by service to aiohub (https://github.com/eventlet/aiohub/issues/2) in the long term? Or would it better to switch straight to asyncio using the eventlet aiohub directly?

Robxxt commented 1 month ago

@nzlosh Through which platform do you suggest to have the meeting? Do you have any date in mind? It would fit me something from Monday to Friday in this time space: 8am-4pm UTC+2(Berlin time)

AndroxxTraxxon commented 3 weeks ago

Hi, I haven't seen any activity in this space, Is there a branch or fork where this work is being completed?

Robxxt commented 3 weeks ago

Hi @AndroxxTraxxon as far as I know there is not any branch. We haven't even decided in the optimal strategy to make the migration to asyncio😅

AndroxxTraxxon commented 3 weeks ago

I'm liking the strategy proposed by @guzzijones of migrating service by service over to asyncio. would we want to also jump over to an asgi service handler like hypercorn as well? It appears that gunicorn also depends on eventlet, and based on past challenges that package has faced (like being abandoned for 2+ years), it may be best to remove that dependency for one that is more actively maintained

Robxxt commented 3 weeks ago

How should we proceed with the service by service migration. Should we use eventlet_asyncio hub? Or directly re-write eventlet functionallity in it's counterpart in asyncio??

Robxxt commented 3 weeks ago

Would this work?

Screenshot 2024-08-13 at 14 17 42

If not, should I use eventlet_asyncio to execute asyncio inside eventlet or execute eventlet inside asyncio?

Also, how can I test that my changes have not broke anything??

AndroxxTraxxon commented 3 weeks ago

oh, goodness. Eventlet's direct and indirect use runs fairly deep across the platform, including in st2common -- that's where most of the work will be.

I wish it was as simple as you proposed, but that won't work.

There are fairly large architectural differences between the two libraries, and asyncio has a lot of the kinds of structures already provided like queues and pools. The way I see it, this isn't as simple as running a find-and-replace on every instance of import eventlet.

AndroxxTraxxon commented 3 weeks ago

The only places I see that we might want to keep references to Eventlet around are in the SensorContainer and maybe the action runner, just for backwards compatibility across the exchange. Someone more tenured on this project than I may have a few more thoughts.

Given the state of this project right now, the easiest way to test this kind of thing may be to just rebuild one of the container images with the code inside, exercise it, and check the logs for when it inevitably crashes.

nzlosh commented 3 weeks ago

Switching to asyncio directly is too disruptive in my opinion. It would be a smoother and simpler transition to switch to gevent. That way we get onto an actively maintained project and the have the luxury of time to slowly migrate each microservice over to a pure asyncio solution.