Suspicious database record matching rule "SEO spam"

wordfence / wordfence-cli

Wordfence malware and vulnerability scanner command line utility.

https://www.wordfence.com/products/wordfence-cli/

GNU General Public License v3.0


Suspicious database record matching rule "SEO spam" #308

Open fabiomb opened 2 days ago

fabiomb commented 2 days ago

I don't understand the criteria for this rule. I tried the new CLI database scan and a lot of my articles give "Suspicious database record found in table "wp_posts" matching rule "SEO spam":..." But there's no problem with the articles cited, just a "suspicious" flag. WTF, too many links? I don't understand.

Then I got a lot of "JavaScript/charcode checks" matches where I have some Twitter posts embedded in the post_content 🤷

Any ideas? I got 500+ results and can't find a single real problem.

akenion commented 1 day ago

The "SEO spam" rule (id 13) looks for a few suspicious CSS rules and also checks post content for phrases and keywords that are commonly found in SEO spam.

The "JavaScript/charcode checks" rule (id 12) looks for embedded JavaScript or calls to fromCharCode in post content. Such content is usually malicious.
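As a rough illustration of why a check like this can fire on legitimate posts, here is a minimal Python sketch. It is not the actual Wordfence signature, which is not shown in this thread; the pattern, the function name `looks_like_embedded_js`, and the sample strings are all assumptions made up for the example.

```python
import re

# Hypothetical, simplified stand-in for a "JavaScript/charcode" style check.
# The real rule is not shown in this thread and is likely more involved.
SCRIPT_OR_CHARCODE = re.compile(r"<script\b|fromCharCode", re.IGNORECASE)


def looks_like_embedded_js(post_content: str) -> bool:
    """Return True if the content contains a script tag or a fromCharCode call."""
    return SCRIPT_OR_CHARCODE.search(post_content) is not None


# Illustrative post bodies (made up for this example).
embed = (
    '<blockquote class="twitter-tweet">...</blockquote>'
    '<script async src="https://platform.twitter.com/widgets.js"></script>'
)
obfuscated = "eval(String.fromCharCode(97, 108, 101, 114, 116))"
plain = "<p>Just a regular paragraph with a <a href='https://example.com'>link</a>.</p>"

print(looks_like_embedded_js(embed))       # legitimate embed that loads a script
print(looks_like_embedded_js(obfuscated))  # typical obfuscation pattern
print(looks_like_embedded_js(plain))       # ordinary content
```

A pattern this broad cannot tell a third-party embed script from injected code, which is the kind of trade-off that produces false positives on posts containing script-based embeds.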

I will mention these rules to our team and see if they should be adjusted. Would you be willing to share any examples of matching posts that you believe to be false positives?

If rules are consistently returning false positives, you can also exclude them from the scan using the -e / --exclude-rules option. For example, to exclude the two rules you mentioned from the scan and just use the remaining rules, you can add -e 12 -e 13 to the db-scan command. I don't generally recommend excluding rules as it can lead to missing actual results, but it is an option.
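If the scan is launched from a script, the exclusions can be assembled programmatically. The following is only a sketch: it assumes the executable is available on the PATH as `wordfence`, the helper `build_db_scan_command` is a hypothetical name, and the database connection options your setup needs are left to `extra_args` rather than guessed here.

```python
import shlex


def build_db_scan_command(excluded_rule_ids, extra_args=()):
    """Build an argv list for db-scan with one -e option per excluded rule."""
    # Assumption: the CLI is installed and invoked as `wordfence`.
    argv = ["wordfence", "db-scan"]
    for rule_id in excluded_rule_ids:
        argv += ["-e", str(rule_id)]
    # Connection options and anything else you normally pass go here.
    argv += list(extra_args)
    return argv


argv = build_db_scan_command([12, 13])
print(shlex.join(argv))  # prints: wordfence db-scan -e 12 -e 13
```

The resulting list can be passed to `subprocess.run(argv)` once the connection options have been added.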

fabiomb commented 1 day ago

Thanks @akenion. Of the two false positives, the SEO one is with SEO plugins (Rank Math SEO in my case), and the JavaScript one triggers when you have some JS like the Twitter embed (old method) or the YouTube embed (old method), so yes, it's clear that could be the case.

akenion commented 1 day ago

That makes sense regarding the Twitter and YouTube embeddings. In your case, I'd advise simply excluding rule 12 ("JavaScript/charcode checks") for the time being or finding an alternate way to embed that content.

Can you provide a specific match for the Rank Math SEO case (an example output row from running db-scan)? Reviewing the rule, I'm not immediately following how that plugin would generate a false positive.

fabiomb commented 1 day ago

results.csv Here's the results.csv with the full scan export. There are a lot of positives in really old content, but it's strange to find them in the newer content.

akenion commented 12 hours ago

Thanks for sharing the results. It does look like you have legitimate content that includes keywords that are often found in SEO spam and hence the signature matches. Our team is reviewing this internally to see if any improvements can be made around this, but for now, I do recommend excluding the two problematic rules (using -e 12 -e 13 as mentioned earlier) since they are generating a high number of false positives on your site. I will post here if that recommendation changes after further review and discussion.
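Before excluding rules, it may help to see how the results split across rules. The sketch below tallies matches from the human-readable message quoted at the top of this thread; it assumes each result appears on one line in that shape, and the sample lines and the name `tally_by_rule` are made up for illustration. The column layout of the CSV export is not shown here, so it is deliberately not parsed.

```python
import re
from collections import Counter

# Message shape taken from the output quoted in the first comment.
MATCH_LINE = re.compile(
    r'Suspicious database record found in table "(?P<table>[^"]+)" '
    r'matching rule "(?P<rule>[^"]+)"'
)


def tally_by_rule(lines):
    """Count matches per (table, rule) pair, ignoring lines that do not match."""
    counts = Counter()
    for line in lines:
        m = MATCH_LINE.search(line)
        if m:
            counts[(m.group("table"), m.group("rule"))] += 1
    return counts


# Illustrative input only; real scan output would be read from a file or pipe.
sample = [
    'Suspicious database record found in table "wp_posts" matching rule "SEO spam": ...',
    'Suspicious database record found in table "wp_posts" matching rule "JavaScript/charcode checks": ...',
    'Suspicious database record found in table "wp_posts" matching rule "SEO spam": ...',
    "unrelated log line",
]

counts = tally_by_rule(sample)
for (table, rule), n in counts.most_common():
    print(f"{n:4d}  {table}  {rule}")
```

If one rule accounts for nearly all of the results, excluding only that rule keeps the other one active.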