tdlib / td

Cross-platform library for building Telegram clients
https://core.telegram.org/tdlib
Boost Software License 1.0

`GetChatHistory` in a few threads #2582

Closed (akulik512 closed this issue 1 year ago)

akulik512 commented 1 year ago

Hello guys, I use the getChatHistory function to get messages from a channel. It returns no more than 100 messages per request, so reading the whole history of a big channel (1,000 or 10,000 messages) takes a lot of time. In a test I received 1448 messages in 8 minutes.

I thought about how to speed up the process and figured that we could probably do it in a few threads, using the from_message_id argument to define the message each request starts from. For example, say we have a chat with 1000 messages:

- Thread no. 1 will consume messages 0-100, 300-400, 500-600, and 800-900.
- Thread no. 2 will consume messages 200-300, 400-500, 600-700, and 900-1000.

In theory, this could make consumption twice as fast, but I'm afraid there is a chance of getting banned.

levlam commented 1 year ago

What do you mean?

akulik512 commented 1 year ago

Hello @levlam, I accidentally posted the question without a description. Fixed.

levlam commented 1 year ago

No, you can't do this. You can't send the next request before receiving the response to the previous one, because you don't know the next from_message_id.
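To illustrate the sequential dependency described above, here is a minimal Python sketch, not an official client. The request fields (`chat_id`, `from_message_id`, `offset`, `limit`, `only_local`) follow TDLib's JSON `getChatHistory` method, but `send` is a hypothetical callable standing in for whatever sends a request and returns the parsed response. `make_fake_send` is an in-memory stand-in that assumes each request returns messages older than `from_message_id`, newest first; the loop skips already-seen IDs so it also terminates if the boundary message is returned again.

```python
from typing import Any, Callable, Dict, Iterable, List

Request = Dict[str, Any]
Response = Dict[str, Any]


def fetch_history(send: Callable[[Request], Response], chat_id: int,
                  page_size: int = 100) -> List[Dict[str, Any]]:
    """Page backwards through a chat, strictly one request at a time."""
    collected: List[Dict[str, Any]] = []
    seen_ids = set()
    from_message_id = 0  # 0 means "start from the last message"
    while True:
        response = send({
            "@type": "getChatHistory",
            "chat_id": chat_id,
            "from_message_id": from_message_id,
            "offset": 0,
            "limit": page_size,
            "only_local": False,
        })
        # Keep only messages we have not stored yet.
        fresh = [m for m in response.get("messages", [])
                 if m["id"] not in seen_ids]
        if not fresh:
            break  # nothing new came back: history is exhausted
        seen_ids.update(m["id"] for m in fresh)
        collected.extend(fresh)
        # The next starting point is only known after this response arrives.
        from_message_id = min(m["id"] for m in fresh)
    return collected


def make_fake_send(ids: Iterable[int]) -> Callable[[Request], Response]:
    """Hypothetical in-memory stand-in for a TDLib client (assumption)."""
    ordered = sorted(ids, reverse=True)

    def send(request: Request) -> Response:
        start = request["from_message_id"]
        older = [i for i in ordered if start == 0 or i < start]
        page = older[:request["limit"]]
        return {"@type": "messages", "messages": [{"id": i} for i in page]}

    return send


if __name__ == "__main__":
    fake_ids = [7, 19, 20, 58, 301, 302, 950]  # made-up, non-contiguous IDs
    history = fetch_history(make_fake_send(fake_ids), chat_id=1, page_size=3)
    print([m["id"] for m in history])  # [950, 302, 301, 58, 20, 19, 7]
```

Each iteration derives `from_message_id` from the response it just received, which is why the requests cannot be issued ahead of time.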

akulik512 commented 1 year ago

Before GetChatHistory I send SearchPublicChat, whose response includes the last message published to the channel, and that message contains a messageId (something like this). This id will be equal to the number of all messages in the channel. So if the first batch of messages has 100 items, then the next messageId will be 200. I think we can calculate the next from_message_id.

But now I see another problem: it seems I can't stop consuming after getting 100 messages, or get only 100 messages. I can only point out where to start (from_message_id), but it will return all the other channel messages and can't be stopped.

akulik512 commented 1 year ago

Oh no, I can stop the process, I see it, sorry :smile: Ok, can we move on to the initial question? Will we get a ban if we send a few GetChatHistory requests at the same time using one session?

levlam commented 1 year ago

> This id will be equal to the number of all messages in the channel.

It is not, it is just an identifier. There is no way to "calculate" the next from_message_id in advance.

And yes, the account can be banned for Telegram API abuse.
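The point that a message ID is an identifier and not a count can be illustrated with a small Python sketch. The IDs below are made up for illustration and say nothing about how Telegram actually assigns them; `guess_batch_starts` and `actual_batch_starts` are hypothetical helper names. The sketch only shows that, once IDs have gaps, boundaries computed by arithmetic on the last ID do not match the real batch boundaries.

```python
from typing import Iterable, List


def guess_batch_starts(last_id: int, batch_size: int) -> List[int]:
    """Boundaries computed by arithmetic, assuming IDs run 1..last_id with no gaps."""
    return list(range(last_id, 0, -batch_size))


def actual_batch_starts(ids: Iterable[int], batch_size: int) -> List[int]:
    """First (newest) message ID of each batch, derived from the real ID list."""
    ordered = sorted(ids, reverse=True)
    return ordered[::batch_size]


if __name__ == "__main__":
    ids = [7, 19, 20, 58, 301, 302, 950]  # hypothetical, non-contiguous IDs
    print(len(ids), max(ids))                    # 7 950
    print(len(guess_batch_starts(max(ids), 3)))  # 317
    print(actual_batch_starts(ids, 3))           # [950, 58, 7]
```

With these sample IDs the chat holds 7 messages while the last ID is 950, so the arithmetic approach would plan 317 requests where 3 batches exist, and the real boundaries (950, 58, 7) are only discoverable by reading the history.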