eilvelia / tdl

Node.js bindings to TDLib 🥒
MIT License
411 stars 53 forks source link

non blocking event loop iterUpdates #172

Closed Jhon-Mosk closed 1 month ago

Jhon-Mosk commented 2 months ago

Your implementation is blocking the event loop. For example, if you run several clients in the same process and use iterUpdates in one of them, then the second client will not be able to perform any operations.

I have prepared code examples to explain what I mean.

Blocking

'use strict';

const range = {
  start: 1,
  end: 1000,
  [Symbol.asyncIterator]() {
    let value = this.start;
    return {
      next: () => Promise.resolve({
        value,
        done: value++ === this.end + 1
      })
    };
  }
};

// it will be output only after the end of the iterator
setTimeout(() => {
  console.log('setTimeout 0');
}, 0);

(async () => {
  for await (const number of range) {
    console.log(number);
  }
})();

Non-blocking

'use strict';

const range = {
  start: 1,
  end: 1000,
  [Symbol.asyncIterator]() {
    let value = this.start;
    return {
      next: () => new Promise(resolve => {
        setTimeout(() => {
          resolve({
            value,
            done: value++ === this.end + 1
          });
        }, 0);
      })
    };
  }
};

let k = 0;

// it will be output because the asynchronous iterator will return the tick to the event loop
const timer = setInterval(() => {
  console.log('next ', k++);
}, 10);

(async () => {
  for await (const number of range) {
    console.log(number);
    if (number === range.end) {
      clearInterval(timer);
    }
  }
})();

My solution allows you to give a tick after scrolling the iterator. I also use one common promise, which saves resources and increases productivity.

eilvelia commented 2 months ago

What's the issue with processing updates immediately if they're available? setTimeout will be executed when there are no updates available anymore. Your implementation adds a significant artificial delay after every update. I.e., what's the real-world issue you're trying to solve?

If there's a strong need, it's still possible to execute something in the microtask queue without waiting for the next macrotask, e.g. by using queueMicrotask. For example, replacing the setTimeout/setInterval block with the following code prints queueMicrotask between every number.

let i = 0
queueMicrotask(function f () {
  console.log('queueMicrotask')
  if (++i > 1000) return
  queueMicrotask(f)
});

And since you can control how next is called, it is also possible to process updates only in macrotasks without making any modifications to the tdl's code. The following does basically the same as setTimeout inside next that you added:

const sleep = delay => new Promise(resolve => setTimeout(resolve, delay));
(async () => {
  for await (const number of range) {
    console.log(number);
    await sleep(0);
  }
})();

But perhaps this chain of the update microtasks can indeed get quite long and block everything else for a long time. If that is the case, I think the possible solution might be to postpone update processing until the next macrotask after a certain number of microtasks. E.g., process 100 updates in the microtask queue (counting them), and then add the next update to the macrotask queue (probably using setImmediate), repeat. The solution proposed in the PR certainly isn't suitable since it will add a 1–4ms delay between the updates, making writing high-throughput bots impossible. I also think that client.on('update', ...) should have the same issue, so next probably isn't the right place for the changes.

Are you sure the blocking is because of the next calls themselves, and not because of the code you execute inside the for..await blocks? Since those are executed in microtasks, you shouldn't do any long actions there (you can use setImmediate/setTimeout inside though).

edit: By the way, you really should use setImmediate(...) instead of setTimeout(..., 0) if you want to do that. This is a lot faster, with setTimeout the runtime will just be sleeping for nothing.

Jhon-Mosk commented 2 months ago

I have such a case. There is an application in which you can run n tdl clients. iterUpdate can be launched in each client and there is a chance that it will happen at the same time. iterUpdate is used to track the delivery of the message and pin it to the group.

eilvelia commented 2 months ago

What do you run inside the for..await block?

edit: For the record, setImmediate works correctly in node and bun, but not in deno, where it is much slower than queueMicrotask

Jhon-Mosk commented 2 months ago

I'm sending a message to the group. Using iterUpdat, in the for await section, I catch a hook with a successful or unsuccessful sending by the message ID. And I make a request to pin this message in the group.

eilvelia commented 2 months ago

Do you have the same issue if you let the for.await blocks be near-empty (e.g., containing only a console log)?

Jhon-Mosk commented 2 months ago

I figured out why there were delays. Enabled verbosityLevel: 2. I saw errors: [ 2][t 4][1725900220.395737409][NetQueryDelayer.cpp:83][#1][!NetQueryDelayer] Delay: [Query:[id:15204352][tl:0xe470bcfd]] [timeout:27][total_timeout:27] because of [Error : 420 : FLOOD_WAIT_27] from Session:2:main::Connect::TCP::[145.154.137.41:443] to DcId{2} from [152.148.0.154:35412].

Thank you for your time. Could you tell me what these errors might be related to, given that I don't make any requests? Is this due to the fact that there are more than 300 groups in the account and a lot of messages are coming to each one?

eilvelia commented 1 month ago

300 groups should be fine. Perhaps that is caused by too frequent connects. IIRC I've received something similar if I created too many clients in a short amount of time.