grammyjs / runner

Scale bots that use long polling to receive updates.
https://grammy.dev/plugins/runner
MIT License

Can anybody please explain to me why the runner is incompatible with running the bot with a webhook? #16

Closed · Evertt closed this issue 1 year ago

Evertt commented 1 year ago

Like, I've been reading through the code and I can clearly see that it is written with only the long polling method in mind.

However, I don't understand why it couldn't also be made compatible with webhooks.

The reason I would really appreciate this is that my Telegram bot (which is written with Telegraf, but that doesn't matter) currently has a problem: it sometimes takes a long time to fully process a new update, and as long as it's still processing that one update, all new updates are on hold.

Before I looked at grammY's runner, I tried fixing it myself. I used something like const queues = new Map<ChatId, Queue>() to create a queue per chat conversation, hoping that this would at least parallelize updates from different chats. (A queue simply holds an array of promises that it executes sequentially.) However, that doesn't work: when an update in one chat takes a long time, updates in other chats still can't be processed concurrently.
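The per-chat queue idea described above can be sketched in a few lines (illustrative names only; this is not the queue package actually used in this thread):

```typescript
type ChatId = number;
type Task = () => Promise<void>;

// Minimal illustrative queue: chains each task onto the previous one,
// so tasks for the same chat run strictly in order.
class ChatQueue {
  private tail: Promise<void> = Promise.resolve();
  push(task: Task): void {
    this.tail = this.tail.then(task).catch(() => {});
  }
}

const queues = new Map<ChatId, ChatQueue>();

// Route each update's handler into its chat's queue; different chats
// use different queues, so they should not block each other.
export function enqueue(chatId: ChatId, task: Task): void {
  let queue = queues.get(chatId);
  if (queue === undefined) {
    queue = new ChatQueue();
    queues.set(chatId, queue);
  }
  queue.push(task);
}
```

Note this only parallelizes chats as long as every task is genuinely asynchronous; a task that blocks the thread still stalls every queue, which is exactly the problem discussed below.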

So I've been looking at grammY's runner, and it's clear to me that that is the solution: create separate workers to process the updates from separate chats.

I just really don't understand why you didn't make it work for webhooks too, because then it would've been easy to use the runner even with Telegraf; I would just need to write a small adapter. Now I basically need to write my own runner, which sucks a bit. 😅

KnorpelSenf commented 1 year ago
  1. By default, using grammY, webhooks are processed sequentially per chat and concurrently across chats. I believe that telegraf can do the same, but if that's not the behaviour you're observing, then there's a bug in telegraf or some of the middleware you're installing.
  2. If you run your bot on webhooks, you must make sure that all your handlers complete fast. If you have long-running tasks, you must delegate them to a worker queue. You are free to use the underlying queue implementation of the runner (no matter the deployment type); simply import DecayingDeque from the runner. However, most likely, if you want a task queue, you have further requirements such as retries or persistence, and such things are out of scope for the runner package. That's why we don't advertise the runner's queue.

You may be interested in reading https://grammy.dev/guide/deployment-types.html#ending-webhook-requests-in-time
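The pattern that guide describes (acknowledge the webhook immediately, hand the heavy work off instead of awaiting it) can be sketched roughly like this; all names here are hypothetical, not grammY or Telegraf APIs:

```typescript
// Hypothetical sketch of "ending webhook requests in time": the handler
// schedules the long-running work but does not await it, so the HTTP
// response to Telegram can be sent right away.
type Update = { update_id: number };

const pending: Promise<void>[] = []; // stand-in for a real task queue

async function longRunningWork(_update: Update): Promise<void> {
  // e.g. sending an email, calling a slow API, ...
  await new Promise<void>((r) => setTimeout(r, 50));
}

function handleWebhook(update: Update): { status: number } {
  // Fire and forget: catch errors so an unhandled rejection can't crash us.
  pending.push(longRunningWork(update).catch(console.error));
  return { status: 200 }; // acknowledged immediately
}
```

A real deployment would replace the `pending` array with a proper task queue (retries, persistence), which is exactly the requirement the comment above says is out of scope for the runner.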

Evertt commented 1 year ago

Yeah, so in my first attempt I had created a queueMiddleware that would add any new update to the appropriate queue and then return immediately, basically immediately ending the webhook request.

But then still, the promises are not run in parallel. And I just asked ChatGPT and it told me that that's because:

Even though you are using Promises to offload work from the main thread, all of the Promises are still executed within the same event loop, which means that they run sequentially.

So to me that sounds like it's not a bug in Telegraf, but just a consequence of how the event loop that both Deno and Node use works. Do you agree? So that's why I believe I need to create workers. What do you think?

KnorpelSenf commented 1 year ago

Do you agree?

No, this is complete nonsense, it's not at all related to this topic.

I just asked ChatGPT

This is a bigger problem than struggling with concurrency.

KnorpelSenf commented 1 year ago

What number of connections do you permit from Telegram?

I had created a queueMiddleware that would add any new update to the appropriate queue

Which task queue package did you use?

Evertt commented 1 year ago

No, this is complete nonsense, it's not at all related to this topic.

Really? It made sense to me... I must not be understanding the event loop...

Which task queue package did you use?

https://deno.land/x/queue@1.2.0/mod.ts

KnorpelSenf commented 1 year ago

Really? It made sense to me... I must not be understanding the event loop...

It's not related to the event loop.

Evertt commented 1 year ago

So, my original queue middleware looked a little something like this:

```typescript
// Assumed imports (from elsewhere in this thread):
// import { Queue } from "https://deno.land/x/queue@1.2.0/mod.ts"
// Context and Chat come from Telegraf's type definitions.
type ChatId = Chat["id"] // number
const queues = new Map<ChatId, Queue>()

export const queueMiddleware = <C extends Context = Context>(update: C, next: (ctx?: C) => Promise<any>) => {
  // Updates without a chat (e.g. inline queries) bypass the queues entirely
  if (!update.chat) return next()

  // Lazily create one queue per chat id
  if (!queues.has(update.chat.id)) {
    queues.set(update.chat.id, new Queue())
  }

  const chatQueue = queues.get(update.chat.id)!
  // push(task, ...args) runs tasks sequentially within this queue
  chatQueue.push(next, update)
  log("return immediately")
}
```

As you can see, I return immediately from the middleware, so the webhook request is immediately finished. And the push() method just creates a new promise that is added to an array; those promises are handled sequentially. But since I'm creating a new queue for every chat.id, I thought that separate chats would not block each other. But they do...

What would your explanation for that be?

Evertt commented 1 year ago

@KnorpelSenf could we maybe talk a bit on Telegram? Would you be willing to help me figure out why one long-running promise seems to be blocking the other promises too?

KnorpelSenf commented 1 year ago

My explanation for this would be that either you're not measuring correctly which parts happen concurrently or sequentially, or that there are some quirks in your real code which you didn't include in your pseudocode.

I'd be happy to welcome you in https://grammyjs.t.me and have a chat about how exactly webhooks work, why your code isn't working optimally, and where to take this package.

Evertt commented 1 year ago

@KnorpelSenf okay, I just ran a little experiment to better understand how promises work. And from what I can see, if there is synchronous code inside a promise, it will block other promises from running.

I wrote the code in typescript playground: https://tsplay.dev/w8Xl9w

And I ran the output javascript in the console of the developer tools on an about:blank page.

And what I saw was that all promises were started in parallel, but then the first promise that contained synchronous code blocked all the other promises from running until it was finished.

Sooooo, what I gather from that is that promises do not guarantee that the code inside them will not block other promises. I need to be absolutely sure that there is no long-running synchronous code inside a promise to ensure that other promises won't be blocked.
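The experiment boils down to something like this sketch, using nothing but timers (function names are illustrative):

```typescript
// Promises are not threads: a synchronous busy loop inside an async
// function blocks everything else on the single JS thread.
function busyWait(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spins, blocking the event loop */ }
}

async function friendly(log: string[]): Promise<void> {
  log.push("friendly start");
  await new Promise<void>((r) => setTimeout(r, 10)); // yields to the event loop
  log.push("friendly end");
}

async function blocking(log: string[]): Promise<void> {
  log.push("blocking start");
  busyWait(60); // synchronous: friendly's 10 ms timer cannot fire meanwhile
  log.push("blocking end");
}
```

Running `await Promise.all([friendly(log), blocking(log)])` logs `friendly end` last, even though its timer was only 10 ms: the busy loop starves the timer until it completes.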

KnorpelSenf commented 1 year ago

JavaScript has no threads. It doesn't matter if you use promises or not. If you block the CPU, everything else won't execute. You must never perform CPU-bound work in JS.

This is in no way related to queues.

Evertt commented 1 year ago

@KnorpelSenf okay, but could this be mitigated by replacing these promises with Workers? Those run on separate threads, right?

KnorpelSenf commented 1 year ago

This is an XY problem (https://xyproblem.info). What are you even trying to do?

Evertt commented 1 year ago

I have one command handler, which uses an SMTP client to send an email to the user.

For some reason that I don't fully understand yet, the send method on the SMTP client sometimes just takes forever, even though it's a method that returns a promise, which I thought would not be a problem.

But it turns out it is a problem, because while that send method is hanging, my Telegram bot won't process any other updates on any of my queues.

So, I don't know, I'm guessing the SMTP client runs synchronous code, and I need to find a new SMTP client package that doesn't, maybe? Or I need to use a different way to send an email?

KnorpelSenf commented 1 year ago

It would be extremely strange for an SMTP library to do busy waiting. In that case, you shouldn't fix it with workers; you should fix your client. Try running setInterval so you can check the health of the event loop.
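That setInterval suggestion could look roughly like this (a sketch; the interval, threshold, and function name are arbitrary choices, not a prescribed API):

```typescript
// Heartbeat sketch: if a tick fires much later than scheduled, something
// was blocking the event loop in between.
function watchEventLoop(
  intervalMs: number,
  onLag: (lagMs: number) => void,
): () => void {
  let expected = Date.now() + intervalMs;
  const id = setInterval(() => {
    const lag = Date.now() - expected; // how late this tick fired
    if (lag > 0) onLag(lag);
    expected = Date.now() + intervalMs;
  }, intervalMs);
  return () => clearInterval(id); // call the returned function to stop watching
}
```

If the reported lag spikes exactly while the SMTP send is in flight, that would confirm the client is blocking the thread rather than merely being slow.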

KnorpelSenf commented 1 year ago

Feel free to reopen if you have further questions.