eternnoir / pyTelegramBotAPI

Python Telegram bot api.
GNU General Public License v2.0

Problem when using `bot.retrieve_data` #2245

Closed: YouKnow-sys closed this issue 5 months ago

YouKnow-sys commented 6 months ago

Please answer these questions before submitting your issue. Thanks!

  1. What version of pyTelegramBotAPI are you using? 4.16.1

  2. What OS are you using? Linux and Windows

  3. What version of python are you using? 3.12

There is a problem with bot.retrieve_data that can sometimes lead to data loss. It appears when we do other async work inside an async with bot.retrieve_data block (awaiting bot.send_message, for example). While that await is pending, the handler can receive another update and enter bot.retrieve_data again, so in the end we can lose one of the modifications made to the data. Here is a fully working example that reproduces the problem:

import asyncio
from telebot import async_telebot
import logging
from telebot import types

async_telebot.logger.setLevel(logging.INFO)

bot = async_telebot.AsyncTeleBot(token="TOKEN")

class Single:
    def __init__(self, message: types.Message) -> None:
        self.message = message

class Group:
    def __init__(self, message: types.Message) -> None:
        self.messages = [message]
        self.media_group_id = message.media_group_id

    def add_message(self, message: types.Message):
        self.messages.append(message)

@bot.message_handler(commands=['start'])
async def on_start(message: types.Message):
    await bot.reply_to(
        message,
        "Hi",
    )
    await bot.set_state(message.from_user.id, 1, message.chat.id)
    await bot.add_data(message.from_user.id, message.chat.id, posts=[])

@bot.message_handler(chat_types=['private'], content_types=["audio", "photo", "voice", "video", "document", "text"],)
async def on_posts(message: types.Message):
    async with bot.retrieve_data(message.from_user.id, message.chat.id) as data:
        posts: list[Single | Group] = data['posts']

        if message.media_group_id:
            if posts:
                for idx in reversed(range(len(posts))):
                    if isinstance(posts[idx], Group) and posts[idx].media_group_id == message.media_group_id: # type: ignore
                        posts[idx].add_message(message) # type: ignore
                        print(f"Added as part of {idx} group")
                        return
            posts.append(Group(message))
            print(f"Added new group at {len(posts)}")
        else:
            posts.append(Single(message))
            print(f"Added new single at {len(posts)}")

        await bot.reply_to(
            message,
            "Msg added to posts."
        )

asyncio.run(bot.infinity_polling())

If you run this example and send a few media galleries to the bot (remember to send /start first), you can see that some of them get lost entirely in the process, and that is all because of the await bot.reply_to call inside the block.
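
The lost update described above can be reproduced without Telegram at all. The sketch below is a simplified model, not the library's code: it assumes retrieve_data hands out a copy of the stored data on enter and writes that copy back on exit. The names store, retrieve_data_model and handler are hypothetical, and asyncio.sleep(0) stands in for await bot.reply_to.

```python
import asyncio
import copy
from contextlib import asynccontextmanager

# Hypothetical in-memory store standing in for the bot's state storage.
store = {"posts": []}


@asynccontextmanager
async def retrieve_data_model():
    """Assumed behaviour: copy the data on enter, write the copy back on exit."""
    data = copy.deepcopy(store)
    try:
        yield data
    finally:
        store.clear()
        store.update(data)


async def handler(item):
    async with retrieve_data_model() as data:
        data["posts"].append(item)
        # Stands in for `await bot.reply_to(...)`: control returns to the
        # event loop here, so the other handlers enter with a stale copy.
        await asyncio.sleep(0)


async def main():
    # Three updates arriving together, like three items of one media group.
    await asyncio.gather(*(handler(i) for i in range(3)))
    return store["posts"]


result = asyncio.run(main())
print(result)  # only one of the three appended items survives
```

Every handler copies the empty list before any of them writes back, so each write overwrites the previous one and the last writer wins.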

possible way to fix

coder2020official commented 6 months ago

NEVER send messages inside that function

YouKnow-sys commented 6 months ago

> NEVER send messages inside that function

I know I shouldn't do it, but it was just an example. We may still need to call another async function in order to modify the data or do something else, which is why I think this function should be usable in this scenario as well.
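
One way to make such a block safe for awaits is to serialize access per user. The sketch below is not part of pyTelegramBotAPI: locked_data, locks and store are hypothetical names, and the copy-on-enter, write-on-exit behaviour is an assumption about how retrieve_data works. It wraps the block in an asyncio.Lock keyed by (user_id, chat_id).

```python
import asyncio
import copy
from collections import defaultdict
from contextlib import asynccontextmanager

# Hypothetical in-memory store standing in for the bot's state storage.
store = {"posts": []}
# Hypothetical registry: one lock per (user_id, chat_id) pair, created lazily.
locks = defaultdict(asyncio.Lock)


@asynccontextmanager
async def locked_data(user_id, chat_id):
    """Assumed copy-on-enter / write-on-exit model, guarded by a per-user lock."""
    async with locks[(user_id, chat_id)]:
        data = copy.deepcopy(store)
        try:
            yield data
        finally:
            store.clear()
            store.update(data)


async def handler(item):
    async with locked_data(user_id=1, chat_id=1) as data:
        data["posts"].append(item)
        # Stands in for another awaited call made while the data is open.
        await asyncio.sleep(0)


async def main():
    await asyncio.gather(*(handler(i) for i in range(3)))
    return store["posts"]


result = asyncio.run(main())
print(result)  # all three items are kept
```

The trade-off is that updates from the same user are processed one at a time while the lock is held, including the time spent waiting on the awaited call.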

BlocksDevPro commented 6 months ago

I found the issue.

The problem with the current bot.retrieve_data is that when you get multiple updates, it won't push the data after each processed update; it waits until all the updates are processed and then saves only the data from the last one.

Explanation in code.

@bot.message_handler(content_types=['photo', 'video'])
async def on_post(message: types.Message):
    async with bot.retrieve_data(message.from_user.id, message.chat.id) as data:
        # For a media group, the data is only saved by the last update (the last media item of the group),
        # so whatever you change while handling the group only persists if it was changed in that last update.

        # If we append 1 to the list on every update, only the append from the last update is kept.
        # If we get 3 media items in a group, we expect data['posts'] to be [1, 1, 1],
        data['posts'].append(1)
        # but we get data['posts'] as [1].

Solution

@bot.message_handler(content_types=['photo', 'video'])
async def on_post(message: types.Message):
    async with bot.retrieve_data(message.from_user.id, message.chat.id) as data:

        data['posts'].append(1)

        # The solution is very simple: update the data manually.
        await bot.add_data(message.from_user.id, message.chat.id, posts=data['posts'])

        # Now when the user submits 3 media items in a group, we get data['posts'] as [1, 1, 1].

coder2020official commented 6 months ago

Use retrieve_data to quickly get data, alter it, or even add to it. But never make any API requests inside it; that will solve the issue. Calling a function to add data inside retrieve_data is ridiculous.
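
The advice above amounts to keeping the block free of awaits: change the data, leave the block, then talk to the API. The sketch below illustrates that pattern on a simplified model rather than on the real library. store, retrieve_data_model and handler are hypothetical names, the copy-on-enter, write-on-exit behaviour is an assumption, and asyncio.sleep(0) stands in for await bot.reply_to.

```python
import asyncio
import copy
from contextlib import asynccontextmanager

# Hypothetical in-memory store standing in for the bot's state storage.
store = {"posts": []}


@asynccontextmanager
async def retrieve_data_model():
    """Assumed behaviour: copy the data on enter, write the copy back on exit."""
    data = copy.deepcopy(store)
    try:
        yield data
    finally:
        store.clear()
        store.update(data)


async def handler(item):
    async with retrieve_data_model() as data:
        # Only synchronous work inside the block, so no other handler
        # can run between the copy and the write-back.
        data["posts"].append(item)
        count = len(data["posts"])
    # The awaited call (a stand-in for bot.reply_to) happens after the
    # data has been written back.
    await asyncio.sleep(0)
    return count


async def main():
    await asyncio.gather(*(handler(i) for i in range(3)))
    return store["posts"]


result = asyncio.run(main())
print(result)  # all three items are kept
```

This holds for the model because entering and leaving the block never suspends. A storage backend that performs real I/O on read or write could still interleave handlers, which this sketch does not cover.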