umputun / tg-spam

Anti-Spam bot for Telegram and anti-spam library
https://tg-spam.umputun.dev
MIT License

Add ability to handle multiple chats with a single instance of tg-spam #100

Open umputun opened 4 months ago

umputun commented 4 months ago

This was requested by @winhex during our conversation about https://github.com/umputun/tg-spam/discussions/98

The main issue with the current approach of running multiple instances of tg-spam is the need to create separate bots, set up and potentially synchronize configurations, samples, and all other data. However, if you manage multiple chats, it could be advantageous to use a single instance of tg-spam to protect all of them with shared rules, data, and history.

Implementing this change is not straightforward because it involves dynamically managing all information related to the "source chat." Instead of having a predefined group ID in the configuration, the bot will have to fetch messages from multiple groups and correctly handle the group ID for all actions. Most likely, the new "source group/chat id" should be incorporated across data storage, API, and UI levels.
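To illustrate what carrying the source chat ID through the storage level might look like, here is a minimal in-memory sketch. The names `chatKey`, `seenStore`, `Add` and `Count` are hypothetical and are not taken from tg-spam; the only point is that each record is keyed by the chat it came from as well as by the user.

```go
package main

import "fmt"

// chatKey is a hypothetical composite key: records are scoped to the chat
// they came from instead of relying on one predefined group ID.
type chatKey struct {
	ChatID int64
	UserID int64
}

// seenStore counts messages per (chat, user) pair. Illustrative only.
type seenStore struct {
	counts map[chatKey]int
}

func newSeenStore() *seenStore {
	return &seenStore{counts: make(map[chatKey]int)}
}

// Add records one message sent by userID in chatID.
func (s *seenStore) Add(chatID, userID int64) {
	s.counts[chatKey{ChatID: chatID, UserID: userID}]++
}

// Count returns how many messages userID has sent in chatID.
func (s *seenStore) Count(chatID, userID int64) int {
	return s.counts[chatKey{ChatID: chatID, UserID: userID}]
}

func main() {
	s := newSeenStore()
	s.Add(-1001, 42)
	s.Add(-1001, 42)
	s.Add(-1002, 42)
	// The same user has separate counters in separate chats.
	fmt.Println(s.Count(-1001, 42), s.Count(-1002, 42))
}
```

A persistent store would presumably need the same extra column or key component, but that is an assumption about the design, not a description of the existing schema.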

By the way, it appears that a single bot can monitor multiple chats as long as the bot is a member in those chats.

coperius commented 3 months ago

Which type of separation is preferable?

  1. A separate instance is allocated for working with each chat. The bot handles receiving/sending messages. It listens to incoming messages and redirects them to the appropriate instances for processing. Settings can be set globally and overridden for individual instances. This includes super-users and admin groups.

  2. The structure remains as it is now. Processing logic based on the message chatID is added. Super-users can be set in the format {"globalSuper", "chat1:localSuper1"}. Accordingly, global super-users can moderate all chats, while local super-users and chat admins can moderate only their own.

There are also a number of open questions about behavior.

In principle, I've already implemented the second option, apart from these open questions. The tests still need to be expanded.

umputun commented 3 months ago

1 is like running multiple instances, but with the same tokens, which is impossible with multiple instances. To me, 2 sounds preferable, i.e. a single "event loop" on multiple chats/groups.
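A rough sketch of what a single "event loop" over several chats could look like. Everything here is an assumption made for illustration: the `update` and `action` types, the `run` function, the allowed-chat set and the stand-in detector are hypothetical and do not reflect the actual tg-spam code.

```go
package main

import "fmt"

// update is a hypothetical, trimmed-down incoming message.
type update struct {
	ChatID int64
	UserID int64
	Text   string
}

// action is the decision for one message. ChatID is copied from the update,
// so every action targets the chat the message came from.
type action struct {
	ChatID int64
	UserID int64
	Ban    bool
}

// run consumes updates from all monitored chats in one loop, skips chats that
// are not in the allowed set, and produces one action per processed message.
func run(updates <-chan update, allowed map[int64]bool, isSpam func(string) bool) []action {
	var out []action
	for u := range updates {
		if !allowed[u.ChatID] {
			continue // not one of the protected chats
		}
		out = append(out, action{ChatID: u.ChatID, UserID: u.UserID, Ban: isSpam(u.Text)})
	}
	return out
}

func main() {
	updates := make(chan update, 3)
	updates <- update{ChatID: -1001, UserID: 1, Text: "hello"}
	updates <- update{ChatID: -1002, UserID: 2, Text: "buy now"}
	updates <- update{ChatID: -1003, UserID: 3, Text: "buy now"}
	close(updates)

	allowed := map[int64]bool{-1001: true, -1002: true}
	isSpam := func(text string) bool { return text == "buy now" } // stand-in detector

	for _, a := range run(updates, allowed, isSpam) {
		fmt.Printf("chat=%d user=%d ban=%v\n", a.ChatID, a.UserID, a.Ban)
	}
}
```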

> When banning a user in one of the chats, should they be banned in the others? Similarly for unbanning

I don't think it should be that smart, i.e. I would expect it to ban the user only on the chat it sent spam to.

> Can super-users from some chats moderate others? (division into global and local solves this problem)

I'm not even sure if we actually care about a separate set of super users. We can keep a single list at least for now. However, if you're going to implement separate lists, I don't think allowing cross-chat bans between super users should be permitted.

> How to determine the original chat when processing forwarded messages? After fixing error https://github.com/umputun/tg-spam/issues/107, it will be possible to find all messages with the specified content in all chats, but which one should be deleted?

As far as I recall, TG API doesn't provide this info, but I can be wrong on this one. If such info can be retrieved, this is the solution, but if not, we can consider disabling forwarding functionality in this mode (probably should be a message in admin chat explaining it if a message was forwarded).

coperius commented 3 months ago

> 1 is like running multiple instances, but with the same tokens, which is impossible with multiple instances. To me, 2 sounds preferable, i.e. a single "event loop" on multiple chats/groups.

OK, the second option suits me completely as well

> I don't think it should be that smart, i.e. I would expect it to ban the user only on the chat it sent spam to.

OK

> I'm not even sure if we actually care about a separate set of super users. We can keep a single list at least for now. However, if you're going to implement separate lists, I don't think allowing cross-chat bans between super users should be permitted.

If we set super-users in the old format (just the name), then the current behavior will not change and they will be able to moderate all chats. Additionally, there will be an option to restrict a super-user to a specific chat (chat:username instead of just username). And accordingly, chat admins will be able to moderate only their own chats.
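A minimal sketch of how the two super-user formats described here could be parsed and checked. The type `superUsers` and the functions `parseSuperUsers` and `CanModerate` are hypothetical names chosen for illustration. Chat admins are not covered, since they would have to be looked up at runtime.

```go
package main

import (
	"fmt"
	"strings"
)

// superUsers holds global names and per-chat names parsed from entries
// such as "globalSuper" (old format) or "chat1:localSuper1" (chat-scoped).
type superUsers struct {
	global map[string]bool
	local  map[string]map[string]bool // chat -> user names
}

// parseSuperUsers splits each entry on the first ':'; entries without a
// chat prefix keep the old meaning and apply to all chats.
func parseSuperUsers(entries []string) superUsers {
	su := superUsers{global: map[string]bool{}, local: map[string]map[string]bool{}}
	for _, e := range entries {
		chat, user, scoped := strings.Cut(e, ":")
		if !scoped {
			su.global[e] = true
			continue
		}
		if su.local[chat] == nil {
			su.local[chat] = map[string]bool{}
		}
		su.local[chat][user] = true
	}
	return su
}

// CanModerate reports whether user may moderate chat.
func (su superUsers) CanModerate(chat, user string) bool {
	return su.global[user] || su.local[chat][user]
}

func main() {
	su := parseSuperUsers([]string{"globalSuper", "chat1:localSuper1"})
	fmt.Println(su.CanModerate("chat2", "globalSuper"))  // global: any chat
	fmt.Println(su.CanModerate("chat1", "localSuper1")) // local: own chat
	fmt.Println(su.CanModerate("chat2", "localSuper1")) // local: other chat
}
```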

> As far as I recall, TG API doesn't provide this info, but I can be wrong on this one. If such info can be retrieved, this is the solution, but if not, we can consider disabling forwarding functionality in this mode (probably should be a message in admin chat explaining it if a message was forwarded).

A forwarded message gives us forward_date (timestamp of the original message), forward_from.id (original userID), from.id (super-user/admin userID) and text (message content). If we replace time with the time from the processed message in the "messages" table, then it's possible to limit the set of chats by super-user, and with the other parameters, we can find the message itself with a high degree of probability.
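The lookup could be sketched roughly as follows. This assumes that the stored time equals forward_date exactly and that a permission check is available. The `storedMsg` row layout, the `forwarded` struct and `findOriginal` are hypothetical and not the actual "messages" table schema.

```go
package main

import "fmt"

// storedMsg is a hypothetical row of the "messages" table.
type storedMsg struct {
	ChatID int64
	MsgID  int
	UserID int64
	Time   int64 // unix time of the original message (assumed to be stored)
	Text   string
}

// forwarded carries the fields of a forwarded report named above.
type forwarded struct {
	ForwardDate   int64 // forward_date
	ForwardFromID int64 // forward_from.id
	ReporterID    int64 // from.id
	Text          string
}

// findOriginal returns stored messages that match the forwarded one by
// author, time and text, limited to chats the reporter may moderate.
func findOriginal(rows []storedMsg, f forwarded, canModerate func(reporterID, chatID int64) bool) []storedMsg {
	var res []storedMsg
	for _, r := range rows {
		if r.UserID != f.ForwardFromID || r.Time != f.ForwardDate || r.Text != f.Text {
			continue
		}
		if !canModerate(f.ReporterID, r.ChatID) {
			continue
		}
		res = append(res, r)
	}
	return res
}

func main() {
	rows := []storedMsg{
		{ChatID: -1001, MsgID: 10, UserID: 7, Time: 100, Text: "message"},
		{ChatID: -1002, MsgID: 20, UserID: 7, Time: 100, Text: "message"},
		{ChatID: -1001, MsgID: 11, UserID: 8, Time: 100, Text: "message"},
	}
	f := forwarded{ForwardDate: 100, ForwardFromID: 7, ReporterID: 1, Text: "message"}
	// Stand-in permission check: reporter 1 moderates only chat -1001.
	canModerate := func(reporterID, chatID int64) bool { return reporterID == 1 && chatID == -1001 }
	fmt.Println(findOriginal(rows, f, canModerate))
}
```

More than one row can still match, for example when the same text is posted to two chats in the same second, so the caller would need a policy for ambiguous results.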

umputun commented 3 months ago

> A forwarded message gives us forward_date (timestamp of the original message), forward_from.id (original userID), from.id (super-user/admin userID)

I don't think this is the case. From what I recall, all forwarded messages carry the same user ID, that of some TG forwarding bot or something like that.

coperius commented 3 months ago

Now I see

{
    "ok": true,
    "result": [
        {
            "message": {
                "chat": {
                    "id": -1002242702397,
                    "title": "TestSpamAdminGroup",
                    "type": "supergroup",
                    "username": "TestSpamAdminGroup"
                },
                "date": 1722888044,
                "forward_date": 1722873154,
                "forward_from": {
                    "first_name": "Chan",
                    "id": 5748393497,
                    "is_bot": false,
                    "last_name": "Lee"
                },
                "forward_origin": {
                    "date": 1722873154,
                    "sender_user": {
                        "first_name": "Chan",
                        "id": 5748393497,
                        "is_bot": false,
                        "last_name": "Lee"
                    },
                    "type": "user"
                },
                "from": {
                    "first_name": "No",
                    "id": 182570762,
                    "is_bot": false,
                    "is_premium": true,
                    "language_code": "en",
                    "last_name": "Man",
                    "username": "copermann"
                },
                "message_id": 59,
                "text": "message"
            },
            "update_id": 674833881
        }
    ]
}

Chan Lee is the spammer and copermann is the super-user who reported this.
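For reference, a small sketch that decodes a payload of this shape with the standard library and pulls out the two IDs. The struct and function names (`tgUser`, `tgMessage`, `updatesResponse`, `reportInfo`) are hypothetical, and only the fields used here are mapped. As far as I know, `forward_from` can be missing when the original sender hides their account in forwards, so the code treats it as optional.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

type tgUser struct {
	ID       int64  `json:"id"`
	Username string `json:"username"`
}

type tgMessage struct {
	MessageID   int     `json:"message_id"`
	ForwardDate int64   `json:"forward_date"`
	ForwardFrom *tgUser `json:"forward_from"` // original author, may be absent
	From        *tgUser `json:"from"`         // who forwarded the message
	Text        string  `json:"text"`
}

type updatesResponse struct {
	OK     bool `json:"ok"`
	Result []struct {
		UpdateID int64      `json:"update_id"`
		Message  *tgMessage `json:"message"`
	} `json:"result"`
}

// reportInfo returns the spammer and reporter IDs from the first forwarded
// message in the payload that exposes its original sender.
func reportInfo(payload []byte) (spammerID, reporterID int64, err error) {
	var resp updatesResponse
	if err := json.Unmarshal(payload, &resp); err != nil {
		return 0, 0, err
	}
	for _, r := range resp.Result {
		m := r.Message
		if m == nil || m.ForwardFrom == nil || m.From == nil {
			continue
		}
		return m.ForwardFrom.ID, m.From.ID, nil
	}
	return 0, 0, errors.New("no forwarded message with a visible sender")
}

// samplePayload is a trimmed copy of the update shown above.
const samplePayload = `{"ok": true, "result": [{"message": {
	"forward_date": 1722873154,
	"forward_from": {"first_name": "Chan", "id": 5748393497, "is_bot": false},
	"from": {"id": 182570762, "username": "copermann"},
	"message_id": 59, "text": "message"}, "update_id": 674833881}]}`

func main() {
	spammer, reporter, err := reportInfo([]byte(samplePayload))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("spammer=%d reporter=%d\n", spammer, reporter)
}
```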

freeseacher commented 2 months ago

Thank you for a great tool! Let me add my point of view on this issue.

> When banning a user in one of the chats, should they be banned in the others? Similarly for unbanning

Nope. Actions should be taken only in response to an event. Spam in one chat should, however, increase the probability of a ban in other chats, similar to the CAS ban idea.
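One way to picture this, purely as an illustration: keep a per-user record of the chats where spam was detected and raise the spam probability in other chats by a fixed step. The `crossChatScore` type, the additive step and the cap at 1 are all assumptions made for this sketch, not an existing tg-spam feature.

```go
package main

import "fmt"

// crossChatScore remembers, per user, the chats where spam was detected.
type crossChatScore struct {
	spamChats map[int64]map[int64]bool // userID -> set of chatIDs
}

func newCrossChatScore() *crossChatScore {
	return &crossChatScore{spamChats: make(map[int64]map[int64]bool)}
}

// ReportSpam records that userID was flagged as a spammer in chatID.
func (c *crossChatScore) ReportSpam(userID, chatID int64) {
	if c.spamChats[userID] == nil {
		c.spamChats[userID] = make(map[int64]bool)
	}
	c.spamChats[userID][chatID] = true
}

// Adjusted raises the base spam probability by step for every other chat
// where the user was already flagged, capped at 1. No ban is issued here;
// the result only feeds the decision for a new message in chatID.
func (c *crossChatScore) Adjusted(base float64, userID, chatID int64, step float64) float64 {
	n := 0
	for id := range c.spamChats[userID] {
		if id != chatID {
			n++
		}
	}
	p := base + float64(n)*step
	if p > 1 {
		p = 1
	}
	return p
}

func main() {
	c := newCrossChatScore()
	c.ReportSpam(42, -1001)
	fmt.Println(c.Adjusted(0.5, 42, -1002, 0.25)) // flagged in one other chat
	fmt.Println(c.Adjusted(0.5, 42, -1001, 0.25)) // same chat: no boost
}
```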

> Can super-users from some chats moderate others? (division into global and local solves this problem)

Nope. That would be unexpected behavior.