Question on usecase - Githubissues

sehmaschine commented 3 years ago

We have different kinds of background tasks in our applications:

1. Background tasks which are independent of a user action, e.g. sending daily mails or doing backups. These tasks are usually part of a distinct application/microservice.
2. Long running jobs which have to run in the background, e.g. generating PDFs. These tasks are part of the main app and are triggered by a user action (e.g. clicking a button). Then the user either stays on the page and needs updates on the process or the user leaves the page and we will inform her about the process later on. With this usecase, we need to update a single job with the current state (e.g. 23 of 80 PDFs created) regularly.

My question is whether Repeater also covers usecase 2 or if it is mainly developed to solve usecase 1. When looking at the docs, it seems that both usecases are currently covered. I'm just curious about the general direction/philosophy of Repeater, because we would like to stick with the chosen solution for a while ...

cannikin commented 3 years ago

Hello! All Repeater does is make an HTTP(S) call to a URL at a given point in time and then record the response. It can make that call once, or repeatedly (ha!) at a set interval.

It's also tailored for the Jamstack where you'll generally expose your app's functionality as API calls somewhere. In Jamstack land this has generally been through AWS Lambda functions. When you deploy to Lambda you get a URL that you can access to invoke your function.

So in usecase (1) you might have the code that sends an email, or starts a backup, exposed at some endpoint. Repeater would then call that endpoint at the specified time. You can then later query Repeater and ask for the response of that call (or ignore it completely for non-critical jobs).

Now, depending on your email service and its own API, you may be able to have Repeater call the third party service directly and skip your app entirely—SendGrid will send an email by just POSTing to an endpoint. But personally, I'd want to keep the code for sending the email within my app and just expose the "send an email now" functionality as an API call, and just have Repeater call that endpoint for me. That call can contain security information to make sure that bad actors can't find the endpoint and send billions of emails a second (you can set query params and headers in the Repeater call).
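
As a rough illustration of the "security information" idea above, the endpoint that Repeater calls could check a shared secret sent as a header before doing any work. This is only a sketch: the header name `x-job-secret` and both helper functions are hypothetical, not part of Repeater, and the way the headers reach your function depends on your hosting platform.

```typescript
// Hypothetical header name; pick whatever you configure on the Repeater job.
const SECRET_HEADER = "x-job-secret";

// Compares two strings. Lengths are checked first; after that every
// character is compared, so the loop does not stop at the first mismatch.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// Returns true only when the request carries the expected shared secret.
// `headers` is assumed to be a plain object with lower-cased header names.
function isAuthorizedJobCall(
  headers: Record<string, string | undefined>,
  expectedSecret: string
): boolean {
  const provided = headers[SECRET_HEADER];
  if (provided === undefined || expectedSecret.length === 0) return false;
  return constantTimeEqual(provided, expectedSecret);
}
```

The "send an email now" handler would call `isAuthorizedJobCall` first and return a 401 or 403 response when it yields `false`.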

Usecase (2) is totally doable, but you may need some creative architecture. Here are two ways I can think of to do what you're looking for:

1. Have the browser itself make a call to Repeater to start the job, right now. The call to start the job will return right away, and the name you gave the job will be the unique ID you can use to query the state of the job later. So you can start a periodic AJAX request that asks Repeater "are you done yet?" over and over until it's done. You can then update the UI for the user.
2. If you'd rather rely on your own code to know when the job is done, have Repeater start the job immediately, but the process that is running (the API endpoint that Repeater called) will set a flag in your own database indicating that it's done. You then have the browser either periodically ask another endpoint if that database flag is set, or do something with pub/sub and websockets...however complex you want to make that check.
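
Both options end with the browser asking "are you done yet?" on a timer. A minimal sketch of that polling loop is below. The `checkStatus` callback is a hypothetical placeholder: it could wrap a query to Repeater (option 1) or a request to your own endpoint that reads the database flag (option 2). The `JobState` values are assumptions for illustration, not Repeater's actual status names.

```typescript
// Assumed states for illustration only.
type JobState = "pending" | "done" | "failed";

// Calls checkStatus repeatedly until the job is no longer pending or
// maxAttempts is reached. Waits intervalMs between attempts.
async function pollUntilSettled(
  checkStatus: () => Promise<JobState>,
  intervalMs: number,
  maxAttempts: number
): Promise<JobState> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const state = await checkStatus();
    if (state !== "pending") return state;
    if (attempt < maxAttempts) {
      // Pause before asking again so the endpoint is not hammered.
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  // Still pending after the last attempt; the caller decides what to do.
  return "pending";
}
```

In the browser, `checkStatus` would typically be a small `fetch` wrapper, and the UI is updated once the returned promise resolves. Capping the number of attempts keeps an abandoned tab from polling forever.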

But remember: Repeater itself doesn't perform any logic or execute any packaged up code for you—all it does is call a URL and record the result. Whatever URL it calls will be where the magic actually happens.

Does that help?

sehmaschine commented 3 years ago

Thanks for the explanation. I'm coming from Celery/RabbitMQ and would like to find a more lightweight solution for usecase 2.

What I had in mind is this:

1. User clicks a button "Create PDFs".
2. This will call an API endpoint within my app.
3. The backend script behind the endpoint will create a job with Repeater.
4. The job URL (another API endpoint) will start generating the PDFs.
5. With every PDF which has been processed, I'm updating the job.
6. With the browser/frontend, I'm calling a 3rd endpoint within my app which just gets the job result.
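
The progress bookkeeping in the last two steps could look roughly like the sketch below. Everything here is hypothetical: the in-memory `jobs` object is only a stand-in for wherever the state actually lives (a Repeater job, a database row, a cache), and the function names are made up for illustration.

```typescript
interface JobProgress {
  completed: number;
  total: number;
  finished: boolean;
}

// In-memory stand-in for the real state store (assumption for this sketch).
const jobs: Record<string, JobProgress | undefined> = {};

// Called when the job is created.
function startJob(jobId: string, total: number): void {
  jobs[jobId] = { completed: 0, total, finished: total === 0 };
}

// Called by the PDF endpoint after each PDF has been generated.
function recordPdfCreated(jobId: string): void {
  const job = jobs[jobId];
  if (!job) throw new Error(`Unknown job: ${jobId}`);
  if (job.finished) return; // ignore updates after completion
  job.completed += 1;
  job.finished = job.completed >= job.total;
}

// Called by the third endpoint that the frontend polls.
function describeJob(jobId: string): string {
  const job = jobs[jobId];
  if (!job) return "unknown job";
  return job.finished
    ? `done (${job.total} PDFs created)`
    : `${job.completed} of ${job.total} PDFs created`;
}
```

For a job started with `startJob("pdf-batch", 80)`, `describeJob("pdf-batch")` returns "23 of 80 PDFs created" after 23 calls to `recordPdfCreated("pdf-batch")`.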

Does that sound reasonable?

Additional question: I'm not exactly sure about the database flag to indicate that the job has finished. I would like to avoid adding flags for the state of every background process to my DB. Do you really think that's necessary given the workflow described above? Isn't checking for finished jobs exactly the point of using something like Repeater? Or am I missing something here?

cannikin commented 3 years ago

Yep that flow makes sense to me!

The benefit of the database flag is that you can (probably) query your own servers and get a response faster than the query to Repeater. And any server-side code can access that flag directly, instead of through a GraphQL call out to Repeater. But we're talking like a 50-100ms difference, which may not matter at all for what you're doing.

There's another solution out there if you find that Repeater isn't exactly what you're looking for: https://quirrel.dev/ It sort of abstracts away the fact that the processing is happening on another system altogether, and it looks like it'll actually process arbitrary JS code, not just call a URL. I'm not sure from the code samples if you can check on the status of jobs at all, or rerun/change them in any way, but you may not need that functionality. It also doesn't provide any kind of UI that I can see, whereas Repeater has a whole dashboard for monitoring jobs/results.

sehmaschine commented 3 years ago

Thanks. I'm fine with just calling a URL, so Repeater will probably work for our usecase. Feel free to close this issue.

redwoodjs / repeaterdev-js

Question on usecase #6