czue / celery-progress

Drop in, configurable, dependency-free progress bars for your Django/Celery applications.
MIT License
472 stars 90 forks source link

Websocket Optimization #51

Closed EJH2 closed 4 years ago

EJH2 commented 4 years ago

This pull request includes several changes:

Any reviews would be greatly appreciated.

OmarWKH commented 4 years ago

I also like this change but I have two concerns:

  1. Previously the message sent to the client was generated by Progress.get_info. Now it's spread across Progress (for HTTP) and WebSocketProgressRecorder + task_postrun_handler (for WS). This makes it harder to understand and maintain. Maybe Progress.get_info could have more parameters? Or instead subclass AsyncResult and pass Progress.get_info a real or fabricated AsyncResult? Or..?
  2. Less important, but getting AsyncResult each time is a more defensive approach in case update_progress does not work as intended. Although I'm not sure being so defensive is warranted.
EJH2 commented 4 years ago

Omar, your concerns are certainly warranted, but the reason for this change is this:

  1. From what I can see in the source for AsyncResult, in order to fetch even just the state of the task, it has to hit the result backend at least once, if not more. This is perfectly fine for the HTTP progress bar, as the intervals between requests make this negligent (unless you set your request interval to an unreasonably small number, in which case you're just DDoSing yourself). However, one of the marketed benefits of the WS bar is that you get updates as soon as they happen. The need for this change stemmed from my own experience, where I improved my task so much that I would trigger a race condition for trying to fetch the data and send it before I needed to fetch it again. All of the data provided by getting the Progress info is supplied by both meta and the post-run handler kwargs.
  2. Not sure what you're referring to with update_progress, but see above.
OmarWKH commented 4 years ago

That makes sense.

  1. [..] All of the data provided by getting the Progress info is supplied by both meta and the post-run handler kwargs.

My proposal is to separate data collection from message generation.

Data collection: In the case of HTTP the data would be collected by AsyncResult. And in WS, by meta and kwargs.

Message generation: Both HTTP and WS would pass the data to a common function that generates the message to send to the client (like Progress.get_info does now).

  1. Not sure what you're referring to with update_progress, but see above.

Sorry I meant task.update_state.

EJH2 commented 4 years ago

Message generation: Both HTTP and WS would pass the data to a common function that generates the message to send to the client (like Progress.get_info does now).

I would be unopposed to a potential implementation if it makes developing and maintaining this easier for us.

Less important, but getting AsyncResult each time is a more defensive approach in case [task.update_state] does not work as intended. Although I'm not sure being so defensive is warranted.

From what I can tell at this point, if task.update_state is not working as intended, then there's a bigger problem afoot. task.update_state's sole purpose is to store the updated progress data so that Progress(task_id).get_info() can fetch it later, where AsyncResult is responsible for doing the fetching. One cannot reliably work without the other, so I don't see how we could possibly redesign that, unless there's a better way to store and later fetch the metadata.

OmarWKH commented 4 years ago

I would be unopposed to a potential implementation if it makes developing and maintaining this easier for us.

I'll try to come up with something tomorrow.

EJH2 commented 4 years ago

I've gone ahead and changed the behavior of the error down to a logger warning inside of WebSocketProgressRecorder, and waiting for @czue's comments on the other topics.

As an aside, would it be possible to add hacktoberfest to the repo topic?

czue commented 4 years ago

As an aside, would it be possible to add hacktoberfest to the repo topic?

Sure. How do I do this, exactly? Or are you able to?

EJH2 commented 4 years ago

Sure. How do I do this, exactly? Or are you able to?

This can be done by going to the repo's home page and clicking on the gear next to the About section here: image and adding the hacktoberfest label to the topics box here: image I believe only the owner can edit this, as I don't seem to have the ability myself. These screenshots were taken from my own repo.

czue commented 4 years ago

thanks - done!

EJH2 commented 4 years ago

@OmarWKH If you're ok with this, I'll go ahead and merge it in which would allow you to continue development on your PR.