marten-de-vries / Flask-WebSub

A Flask extension implementation of the WebSub specification
ISC License
21 stars 4 forks source link

Depend less on Celery #4

Open marten-de-vries opened 5 years ago

marten-de-vries commented 5 years ago

The hub currently schedules a separate celery job for each HTTP request that it sends out during content delivery. That does not scale, especially with a large message body. A better approach would be to store the content in the database, and keep track of the last delivery for each subscription. Then a single background task can handle the content delivery.

marten-de-vries commented 5 years ago

The current way of handling multiple hubs works, but it's a bit of a hack. It would be better to define the tasks only once. The overhead of passing in hub information with each job is probably worth it (especially after the changes above reduce the amount of required jobs).

dannmartens commented 3 years ago

I've created a proof-of-concept which removes Celery as a run-time dependency.

Instead, I've created a stub 'Celeriac' which is API-compatible with Celery, but simply uses an asyncio event loop in a separate thread.

As the main code depends heavily on the Celery API, it was a bit of an undertaking to stub out unnecessary calls and replace needed calls with an asyncio equivalent.

Of course, by doing it this way, I haven't introduced a proper abstraction facade yet: it's all still a bit "celery" (pun intended).

marten-de-vries commented 3 years ago

Sounds interesting! Being able to swap out Celery for something else altogether is a bit wider in scope than the issue discussed here, but I would love for Flask-WebSub to support that.

dannmartens commented 3 years ago

All the tests pass, but I already have butchered the pytest fixtures and simplified the tests, mainly because the TestApp code and the dependency on app _state and Celery workers are not necessary for Celeriac to function. Fortunately, I still have an old baseline with the test stubbing, from the time when I was still using the original testing code. Regarding that test stubbing, it suffices to say "pretty it ain't."

I will go through the WebSub.rocks tests, next week, to find any remaining issues. It would be lovely to see this code end up in your package, and ultimately in PyPI.

Also worth noting is that Werkzeug has changed its API, and the current baseline is not compatible with those changes.

image

dannmartens commented 3 years ago

Okay,

I have completed the WebSub.rocks Hub tests, successfully.

image

However, I had to patch the hub/tasks.py code, as it doesn't seem to build up the request headers for a notification in a clean way.

The following statement copies in a lot of headers, which do not belong in the notification request:

27:   headers = updated_content['headers']

As a result, the WebSub.rocks server refuses the notification, on account of those stray headers:

 ERR 2020-10-07 10:08:40,059 [DEBUG] urllib3.connectionpool - Starting new HTTPS connection (1): websub.rocks:443
 ERR 2020-10-07 10:08:40,606 [DEBUG] urllib3.connectionpool - https://websub.rocks:443 "POST /hub/102/sub/Rhzj6gUYqw9mcRDdTVaK HTTP/1.1" 400 173
 OUT Debug! Server replied with: code=400 response.text=
 OUT <html>
 OUT <head><title>400 Bad Request</title></head>
 OUT <body bgcolor="white">
 OUT <center><h1>400 Bad Request</h1></center>
 OUT <hr><center>nginx/1.14.2</center>
 OUT </body>
 OUT </html>
 ERR 2020-10-07 10:08:40,608 [WARNING] websub - Notification failed

I still need to stub out the Celery retry code, which I do not support yet (this became visible with the above tests).

marten-de-vries commented 3 years ago

Weird, that's never been a problem before.