Together-Java / TJ-Bot

TJ-Bot is a Discord Bot used on the Together Java server. It is maintained by the community; anyone can contribute.
https://togetherjava.org
GNU General Public License v3.0

Block scam links #390

Closed marko-radosavljevic closed 2 years ago

marko-radosavljevic commented 2 years ago

Is your feature request related to a problem? Please describe.

There are too many scam links (nitro, steam, etc.) posted on the Discord, and they take up a lot of moderation time while annoying and harassing the whole server. Sadly, people still fall for these phishing attempts, so it's also a security issue we should address.

Describe the solution you'd like

Block the scam links with our bot, and possibly temporarily remove offending members. (if false-positives are rare enough)

Describe alternatives you've considered

To waste time indefinitely until discord does something about it.

Additional context

A phishing link can look like this (list will be edited):

https://disceord.gift/EF2Zx27xxxx

On the same IP, these domains can be found:

How it looks in chat: discord-nitro-scam-example

Text phrases usually found in link previews:

You've been gifted a subscription! Discord has gifted you Nitro for 1 month

I can host a full website example, or just html page if needed.

Zabuzard commented 2 years ago

Nitro scam from the last 6 months.

DO NOT CLICK ANY OF THESE LINKS!

(they will steal your discord account)

🤩bro steam gived nitro - https://nitro-ds.online/LfgUfMzqYw

@​everyone, Free subscription for 3 months DISCORD NITRO - https://e-giftpremium.com/

@​everyone
Discord Nitro distribution from STEAM.
Get 3 month of Discord Nitro. Offer ends January 28, 2022 at 11am EDT. Customize your profile, share your screen in HD, update your emoji and more!
https://dlscrod-game.ru/promotion

@​everyone
Gifts for the new year, nitro for 3 months: https://discofdapp.com/newyears

@​everyone yo , I got some nitro left over here https://steelsseriesnitros.com/billing/promotions/vh98rpaEJZnha5x37agpmOz2

@​everyone
:video_game: • Get Discord Nitro for Free from Steam Store
Free 3 months Discord Nitro
:clock630: • Personalize your profile, screen share in HD, upgrade your emojis, and more.
:gem: • Click to get Nitro: https://discoord-nittro.com/welcome
:Works only with prime go or rust or pubg

@​everyone, Check this lol, there nitro is handed out for free, take it until everything is sorted out https://dicsord-present.ru/airdrop

@​everyone
• Get Discord Nitro for Free from Steam Store
Free 3 months Discord Nitro
• The offer is valid until at 6:00PM on November 30, 2021. Personalize your profile, screen share in HD, upgrade your emojis, and more.
• Click to get Nitro: https://dliscord.shop/welcome

airdrop discord nitro by steam, take it https://bit.ly/30RzoKw

Steam is giving away free discord nitro, have time to pick up at my link https://bit.ly/3nlzmUi before the action is over.

@​everyone, take nitro faster, it's already running out
https://discordu.gift/u1CHEX2sjpDuR3T6

Zabuzard commented 2 years ago

Patterns I see are:

Tais993 commented 2 years ago

Owh my gosh, thanks for the Nitro @Zabuzard! I can't login into my account anymore though?

One idea for the implementation:

Our bot deletes the message, mutes the author, and sends the content to a mod-log channel. That mod-log message contains ban and unmute buttons, so we can never really ban the wrong person.

After banning/muting we can just disable the buttons so everyone knows it's already been handled.

This makes it nearly impossible, even when our bot fails, to ban the wrong people.
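The flow described above can be modelled without any Discord API calls. The following is a minimal sketch, not TJ-Bot code: the class `ModLogCase` and its methods are hypothetical names, and the actual deleting, muting and button rendering are left out. It only shows the part that matters for safety: a case starts as pending, a moderator resolves it exactly once, and the buttons are disabled afterwards.

```java
// Hypothetical model of one mod-log entry; names are illustrative, not from TJ-Bot.
final class ModLogCase {
    enum Status { PENDING, BANNED, UNMUTED }

    private final String authorId;
    private final String deletedContent;
    private Status status = Status.PENDING;

    ModLogCase(String authorId, String deletedContent) {
        this.authorId = authorId;
        this.deletedContent = deletedContent;
    }

    // Buttons stay clickable only until a moderator has made a decision.
    boolean buttonsEnabled() {
        return status == Status.PENDING;
    }

    // Applies a moderator decision; returns false if the case was already handled
    // or if the decision is not a final state.
    boolean resolve(Status decision) {
        if (decision == Status.PENDING || status != Status.PENDING) {
            return false;
        }
        status = decision;
        return true;
    }

    Status status() {
        return status;
    }

    String authorId() {
        return authorId;
    }

    String deletedContent() {
        return deletedContent;
    }
}
```

Because the bot itself only mutes, a false positive costs the author a temporary mute until a moderator presses the unmute button.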

Zabuzard commented 2 years ago

Good idea.

I am still hoping to fix this properly with either Discord's help or an actual proper scam-blocker-bot.

But a low hanging fruit seems to be to build some simple detection out of the above pattern and call it a day.

In particular, a combination of an @everyone, nitro and a URL is a dead giveaway and already covers 72% of the scams. And nitro with a URL that is bit.ly, or that contains text similar to discord, nitro or premium (but is not discord.com), covers the rest.
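A minimal sketch of how those two rules could be derived from raw message text. The class name `ScamSignals`, the URL regex and the keyword list are assumptions for illustration, and plain `contains` stands in for "text similar to", so misspelled hosts such as dlscrod-game.ru would need real fuzzy matching to be caught by the second rule.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of TJ-Bot.
final class ScamSignals {
    // Simplified URL pattern (assumption): http(s):// followed by non-whitespace.
    private static final Pattern URL = Pattern.compile("https?://\\S+");
    // Host keywords named in the comment above; a real list would live in config.
    private static final List<String> HOST_KEYWORDS = List.of("discord", "nitro", "premium");

    // Extracts the lower-cased host of every URL found in the text.
    static List<String> hosts(String text) {
        List<String> hosts = new ArrayList<>();
        Matcher matcher = URL.matcher(text);
        while (matcher.find()) {
            try {
                String host = URI.create(matcher.group()).getHost();
                if (host != null) {
                    hosts.add(host.toLowerCase(Locale.ROOT));
                }
            } catch (IllegalArgumentException e) {
                // Malformed URL: ignore it in this sketch.
            }
        }
        return hosts;
    }

    // Suspicious: bit.ly, or a host containing a keyword that is not discord.com itself.
    static boolean isSuspiciousHost(String host) {
        if (host.equals("discord.com") || host.endsWith(".discord.com")) {
            return false;
        }
        if (host.equals("bit.ly")) {
            return true;
        }
        return HOST_KEYWORDS.stream().anyMatch(host::contains);
    }

    static boolean isScam(String text) {
        boolean hasUrl = URL.matcher(text).find();
        boolean mentionsNitro = text.toLowerCase(Locale.ROOT).contains("nitro");
        boolean pingsEveryone = text.contains("@everyone");
        // Rule 1: @everyone + nitro + any URL.
        if (pingsEveryone && mentionsNitro && hasUrl) {
            return true;
        }
        // Rule 2: nitro + a suspicious URL.
        return mentionsNitro && hosts(text).stream().anyMatch(ScamSignals::isSuspiciousHost);
    }
}
```

With this sketch, a message that mentions nitro and links to discord.com is left alone, while the same text with a bit.ly link is flagged.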

marko-radosavljevic commented 2 years ago

Yup, I agree on the flow Tais993 discussed. The bot should handle it, post in audit log, and then we can monitor it and revert the action quickly, if needed.

Nitro scam from the last 6 months.

All the links I clicked are no longer valid. You can't even get scammed properly these days. :(

Another probable pattern that is very consistent and predictable is phishing HTML they use. It's basically the same discord page with login, and there are characteristic elements like logo, login fields, text about nitro gifts, styling… etc.

If we analysed a few of them, I'm sure we would find at least 5 elements that are identical, that we can use as a fingerprint. Curl the domain, check the raw HTML of the page, grep/jsoup fingerprint and base evaluation on that.

Zabuzard commented 2 years ago

Another probable pattern that is very consistent and predictable is phishing HTML they use.

Do you actually have to enter data first? I thought it steals the session directly, i.e. just clicking it is already too much.

If thats the case, it will be hard for us to analyze the websites 😆

marko-radosavljevic commented 2 years ago

Another probable pattern that is very consistent and predictable is phishing HTML they use.

Do you actually have to enter data first? I thought it steals the session directly, i.e. just clicking it is already too much.

If thats the case, it will be hard for us to analyze the websites 😆

I can do it safely, just find me a few recent links that are alive. ^^

I have the last one saved, can share the HTML, or host the whole website for analysis.

marko-radosavljevic commented 2 years ago

Full html of the last scam, unsanitized. (no js or css)

discord-nitro-scam-phishing-website-default-theme-2 (copy).txt

Images that go with it: 439112b388adcac969dc066d30767b76 ![zas](https://user-images.githubusercontent.com/30927961/154945893-5ceac0cd-efcf-446e-bdea-98813efc0e76.jpg)

How it looks with css: ![image](https://user-images.githubusercontent.com/30927961/154949759-75dbde67-c08f-47d9-893d-61b9b8d26908.png)

Tais993 commented 2 years ago

Another probable pattern that is very consistent and predictable is phishing HTML they use.

Do you actually have to enter data first? I thought it steals the session directly, i.e. just clicking it is already too much.

If thats the case, it will be hard for us to analyze the websites 😆

It can't, session is stored in the app / your browser. And a website can't access your PC like that, that'd be an extreme danger, and would mean that every site can hack you.

There are 2 types of scams:

Nitro scams: you have to log in on a website, and they gain FULL access to your account; can't change account details.
App scams: you need to "try" their "game", which gains them access to the token / session; can't change account details.

marko-radosavljevic commented 2 years ago

I can write an HTML analyser/fingerprinter later on. I just need a few more scam links to establish a pattern, since this is the only data I have for now. :relaxed:

Heatmanofurioso commented 2 years ago

Why not try pattern matching on domains similar to Discord's? I know they're adding new domains all the time, but some sort of fuzzy search on domains similar to the real one (excluding the real one itself) would probably catch most of them easily, e.g. domains containing 'https://disc', 'https://nitro' or 'https://gg'.

Edit: I meant this, plus the keywords Zabu mentioned. Anything matching this would get immediately removed. Anything matching those keywords would go to moderation review.

Tais993 commented 2 years ago

And what's the issue with basing it off the message? The keywords Zabuzard mentioned, plus a link that's almost the same, would be enough?

Do we really need an HTML analyzer?

marko-radosavljevic commented 2 years ago

Yup, that's the idea for now.

Use the link/text for detection, and later on we can improve it. HTML fingerprinting is a relatively easy next step we can take, if we are not satisfied with the results.

The first step we wanted to take is some basic heuristic based on the link (Levenshtein distance, or similar) and the text that goes with it, since the pattern is very similar every time. :relaxed:
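For reference, Levenshtein distance counts the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. A minimal sketch of the standard dynamic-programming version follows; the class name is illustrative, and how the distance would be thresholded for scam detection is left open here.

```java
// Standard two-row dynamic-programming Levenshtein distance.
final class Levenshtein {
    static int distance(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            current[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int substitutionCost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = current[j - 1] + 1;
                int deletion = previous[j] + 1;
                int substitution = previous[j - 1] + substitutionCost;
                current[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }
}
```

Applied to host labels from the scam links listed earlier in the thread, `distance("discord", "disceord")` is 1 and `distance("discord", "dicsord")` is 2, so a small threshold would catch both while still requiring an explicit exception for the real discord.com (distance 0).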

We also have a pretty decent amount of scam urls we can use to 'train' our bot, and test against.

Some of the known discord scam URLs:

And when the Discord auto-mod feature gets released, hopefully it will be good enough that we don't need to do any extra work. ^^

Zabuzard commented 2 years ago

While ur discussing, I already created a matcher that I think works fine enough.

It just does what I mentioned earlier. Simple but good enough, I suppose:

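```java
private static boolean isScam(@NotNull AnalyseResults results) {
    // Rule 1: an @everyone ping plus a nitro keyword plus any URL
    if (results.pingsEveryone && results.containsNitroKeyword && results.hasUrl) {
        return true;
    }
    // Rule 2: a nitro keyword plus a URL flagged as suspicious
    return results.containsNitroKeyword && results.hasSuspiciousUrl;
}
```

Zabuzard commented 2 years ago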

That said, I would consider checking the phishing website superior. It would also be quite simple with Java's HttpClient and a few simple contains checks. But we lack test data.
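A minimal sketch of that idea, using the `java.net.http.HttpClient` that ships with Java 11+. The class name and the marker strings are assumptions; real markers would have to come from analysing saved scam pages, and the check only makes sense for hosts that are not discord.com, since the genuine login page contains the same words.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of TJ-Bot.
final class PhishingPageCheck {
    // Illustrative markers only (assumption): brand name, nitro text and a password field.
    private static final List<String> MARKERS = List.of("discord", "nitro", "type=\"password\"");

    // True if the page contains every marker, ignoring case.
    static boolean looksLikePhishing(String html) {
        String lower = html.toLowerCase(Locale.ROOT);
        return MARKERS.stream().allMatch(lower::contains);
    }

    // Downloads the raw HTML with short timeouts so a slow scam host cannot stall the bot.
    static String fetchHtml(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

Only the raw HTML is downloaded; no scripts are executed, which fits the point made earlier in the thread that a page cannot take over a session merely by being fetched.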

marko-radosavljevic commented 2 years ago

Yup, I believe an HTML analyser would be the most accurate method, and a valuable addition. It can also be made very extensible, where you can easily add new phishing sites. It can also be self-improving, expanding its fingerprint base and accuracy.

There are lots of papers with techniques on how to do this very effectively and efficiently:

There are services that do this, and provide an API we can use:

But like you said, with HttpClient and basic contains or jsoup for parsing and finding elements we want, we can write a simple but effective tool. Create the fingerprint, and store it for future comparisons. Very fast, accurate and not resource intensive.
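One way to read "create the fingerprint, and store it for future comparisons" is sketched below. Everything here is an assumption made for illustration: the fingerprint is the set of `class` and `id` attribute values found with a regex (jsoup would be the more robust choice for parsing), and the comparison is Jaccard similarity, which the thread does not prescribe.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical fingerprinting sketch, not part of TJ-Bot.
final class HtmlFingerprint {
    // Assumption: only double-quoted class="..." and id="..." attributes are considered.
    private static final Pattern ATTRIBUTE = Pattern.compile("\\b(class|id)=\"([^\"]*)\"");

    // Builds the fingerprint: one feature per class name or id, e.g. "class:login".
    static Set<String> of(String html) {
        Set<String> features = new HashSet<>();
        Matcher matcher = ATTRIBUTE.matcher(html.toLowerCase(Locale.ROOT));
        while (matcher.find()) {
            for (String value : matcher.group(2).trim().split("\\s+")) {
                if (!value.isEmpty()) {
                    features.add(matcher.group(1) + ":" + value);
                }
            }
        }
        return features;
    }

    // Jaccard similarity: shared features divided by all features, in the range 0.0 to 1.0.
    static double similarity(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0; // Two featureless pages are not treated as a match.
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }
}
```

A stored fingerprint from a known scam page could then be compared against each newly fetched page, with a similarity threshold chosen from test data once more samples are available.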

I would like to give this a try when I get my hands on a few more websites, and possibly the original Discord page everyone uses as a base. ^^

I assume just url/text checking should be enough for 95% of the cases, tho. So this is something I would like to add later on, after zabu implements url/message detection. ^^


While ur discussing, I already created a matcher that I think works fine enough.

How does it perform against these @Zabuzard?

Zabuzard commented 2 years ago

I really do not want to overcomplicate this. At least not for the first iteration. I believe that what I created now is good enough for our purpose.

Config-wise, it supports a host blacklist, a host whitelist and a list of suspicious words. If suspicious words appear in the host name (fuzzy matching), it is marked suspicious.

For example, check-out-disc0rd-nice.com would be considered suspicious since it's similar to discord. That, together with @everyone or a mention of nitro, would result in an action by the bot.

If the message mentions nitro and @everyone together, the mere presence of a URL already leads to an action.

Those are the two rules I set up now.
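The host classification described above could look roughly like this. It is a sketch under assumptions, not the actual implementation: the class and enum names are made up, the whitelist is checked before the blacklist by choice, and the fuzzy match is a deliberately simple stand-in (same length, at most one differing character) where an edit-distance threshold would be a more realistic choice.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical classifier, not part of TJ-Bot.
final class HostClassifier {
    enum Verdict { WHITELISTED, BLACKLISTED, SUSPICIOUS, NEUTRAL }

    private final Set<String> whitelist;
    private final Set<String> blacklist;
    private final Set<String> suspiciousWords;

    HostClassifier(Set<String> whitelist, Set<String> blacklist, Set<String> suspiciousWords) {
        this.whitelist = whitelist;
        this.blacklist = blacklist;
        this.suspiciousWords = suspiciousWords;
    }

    Verdict classify(String host) {
        String normalized = host.toLowerCase(Locale.ROOT);
        // The whitelist wins first, so discord.com is never flagged for containing "discord".
        if (whitelist.contains(normalized)) {
            return Verdict.WHITELISTED;
        }
        if (blacklist.contains(normalized)) {
            return Verdict.BLACKLISTED;
        }
        // Split the host into tokens at dots, dashes and other separators.
        for (String token : normalized.split("[^a-z0-9]+")) {
            for (String word : suspiciousWords) {
                if (differsByAtMostOneChar(token, word)) {
                    return Verdict.SUSPICIOUS;
                }
            }
        }
        return Verdict.NEUTRAL;
    }

    // Stand-in for fuzzy matching: equal length and at most one mismatching position.
    static boolean differsByAtMostOneChar(String a, String b) {
        if (a.length() != b.length()) {
            return false;
        }
        int mismatches = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                mismatches++;
            }
        }
        return mismatches <= 1;
    }
}
```

With "discord" as a suspicious word and discord.com on the whitelist, check-out-disc0rd-nice.com comes back as suspicious because its token disc0rd differs from discord in a single character.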

marko-radosavljevic commented 2 years ago

I agree, it should be more than enough for 95% of cases, and it's pretty simple.

We can then wait and see what discord has to offer with their auto-mod, and improve the system further if there is need for that. :heart:

Tais993 commented 2 years ago

You mean that nitro @everyone is enough to already get removed? I'd disagree, it's nothing harmful, can easily be a joke. And that means we can't do a nitro giveaway :p

marko-radosavljevic commented 2 years ago

Eh, I would argue that the benefit is much greater than the potential harm in removing those.

We already block all discord invite links, for example. And nitro in combination with everyone ping is pretty sussy. But like with discord invite links, mods should be immune, so we can have giveaways and things like that. ^^


Zabuzard commented 2 years ago

You mean that nitro @everyone is enough to already get removed? I'd disagree, it's nothing harmful, can easily be a joke. And that means we can't do a nitro giveaway :p

I said nitro + everyone + a url in the message

If thats a joke, its ur own fault. That said, im not voting for "ban" but for "put into quarantine" (a role like muted) instead.

Also, we would by default probably run in a semi automatic mode anyways. We will start with a more manual mode and see how well it works and then increase to a more automatic mode if it works well.

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label, comment or add the valid label or this will be closed in 5 days.