Charcoal-SE / SmokeDetector

Headless chatbot that detects spam and posts links to it in chatrooms for quick deletion.
https://metasmoke.erwaysoftware.com
Apache License 2.0

Phone number normalization should cope with embedded HTML tags #13774

Closed: tripleee closed this issue 4 weeks ago

tripleee commented 4 weeks ago

This has been discussed repeatedly, so I thought it would make sense to create an issue.

Some recent phone number spam embeds HTML formatting tags inside the phone numbers themselves. This prevents the phone number normalization from recognizing a number that is already listed (watched or blacklisted).

There is an ad-hoc watch for this particular pattern, which helps catch those posts anyway, but it would arguably be better if the actual phone numbers were properly recognized.
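To illustrate the idea, here is a minimal sketch of tag-tolerant normalization. It is not SmokeDetector's actual code and not necessarily what the eventual fix does; the names `strip_tags`, `normalize_phone`, and `is_listed` are hypothetical, and the sample number is made up. It assumes that removing anything shaped like `<...>` before extracting digits is sufficient.

```python
import re

# Hypothetical helpers for illustration only; these are NOT SmokeDetector's
# real function names, and this is not the project's actual implementation.
TAG_RE = re.compile(r"<[^>]+>")        # anything shaped like an HTML tag
NON_DIGIT_RE = re.compile(r"[^0-9]")   # everything except ASCII digits


def strip_tags(text):
    """Remove substrings that look like HTML tags, e.g. <b> or </em>."""
    return TAG_RE.sub("", text)


def normalize_phone(candidate):
    """Reduce a phone number candidate to its digits, ignoring embedded tags."""
    return NON_DIGIT_RE.sub("", strip_tags(candidate))


def is_listed(candidate, listed_numbers):
    """Check a candidate against a set of already-normalized listed numbers."""
    return normalize_phone(candidate) in listed_numbers


# A made-up number with formatting tags embedded between the digit groups.
listed = {"18005550199"}
print(is_listed("+1 <b>800</b>-<i>555</i>-0199", listed))  # True
```

Stripping tags before dropping non-digits matters when a tag itself contains digits: without the first step, something like `<h1>` would contribute a stray `1` to the normalized number.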

Recent example: https://m.erwaysoftware.com/posts/uid/salesforce/427484

Recent chat: https://chat.stackexchange.com/transcript/message/66534701#66534701

makyen commented 4 weeks ago

Resolved by https://github.com/Charcoal-SE/SmokeDetector/commit/5c8a2aaf2d7849194fe970757fc6aea969931846