MailScanner / v5

MailScanner v5
GNU General Public License v2.0
182 stars 58 forks source link

False positive Link Fraud with quoted .com url #621

Open Skywalker-11 opened 1 year ago

Skywalker-11 commented 1 year ago

Describe the bug Quoted links eg. "http://example.com" are falsely identified as fraud url. This seems to only affect specific TLDs (here .com). Other TLDs eg. .de or fictional 3 letter TLDs like .bla seem to not be affected. The message Found phishing fraud from http://www.example.com claiming to be www."http: in <msg-id> is logged.

EDIT: Even when the url does not contain http://www. the log message looks like this Found phishing fraud from http://bla.example.com claiming to be www."http: in <msg-id>

To Reproduce Send a mail with HTML only or HTML/plain-text mixed content which contains a link with quoted .com url eg.

Expected behavior

Examples

<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <a class="moz-txt-link-rfc2396E" href="http://www.example.com">"http://www.example.com"</a><br>
  </body>
</html>

Server (please complete the following information):

Additional context Notice: Thunderbird automatically creates the link in the example when someone uses "http://xyz.tld" in a normal text (without explicitly defining at as a link).

shawniverson commented 1 year ago

Still looking at this one, trying to get caught up on these and others.

Skywalker-11 commented 1 year ago

Adding $squashedtext =~ s/["'](.*)["']/$1/; here https://github.com/MailScanner/v5/blob/668cea1204851b77ae63aec46a07145b67f447dd/common/usr/share/MailScanner/perl/MailScanner/Message.pm#L7861 fixes the false positive for quotation marks at start/end of the link text.

These lines seem to be responsible for the different handling of .com tld and others. Do you know what the intention was for checking only these? https://github.com/MailScanner/v5/blob/668cea1204851b77ae63aec46a07145b67f447dd/common/usr/share/MailScanner/perl/MailScanner/Message.pm#L7857-L7858

shawniverson commented 1 year ago

@Skywalker-11 Those look like common tlds, albeit a limited list of them. I'm not sure what the intent was here, but they seem to be there to catch urls containing them, perhaps if the other conditions don't catch?

shawniverson commented 1 year ago

Feel free to create a PR. I think your suggested substitution is good.

Skywalker-11 commented 1 year ago

The PR fixes the quoting issue but the different actions for those specific TLDs should probably still be looked at again.