dausruddin closed this issue 9 years ago
local function get_google_data(text)
  local url = "http://ajax.googleapis.com/ajax/services/search/images?"
  url = url.."v=1.0&rsz=5"
  url = url.."&q="..URL.escape(text)
  url = url.."&imgsz=small|medium|large"
  if google_config.api_keys then
    local i = math.random(#google_config.api_keys)
    local api_key = google_config.api_keys[i]
    if api_key then
      url = url.."&key="..api_key
    end
  end
//
&safe=active <- Use it to remove such content.
Example:
url = url.."v=1.0&rsz=5&safe=active"
What do people think about making this an option in the img_google plugin? For example, a command like !img options:safesearch or something similar?
@soend I think an option defeats the purpose of this issue, which is to keep NSFW content from being searched at all. !safeimg would be nice, though.
I think the best way to do a filter is to write a pre_process plugin that returns nil if any word in a command is banned.
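The pre_process idea could be sketched roughly as follows. This is only a minimal sketch: the word list, the helper name has_banned_word, and the exact layout of the plugin table are assumptions made for illustration, modeled loosely on how plugins such as stats are described in this thread, not the bot's confirmed API.

```lua
-- Hypothetical banned-word list; a real deployment would maintain its own.
local banned_words = { "porn", "nude" }

-- Returns true if any whitespace-separated word in `text` contains a banned word.
local function has_banned_word(text)
  local lowered = string.lower(text)
  for word in string.gmatch(lowered, "%S+") do
    for _, banned in ipairs(banned_words) do
      -- Plain substring search (4th argument `true` disables pattern matching)
      if string.find(word, banned, 1, true) then
        return true
      end
    end
  end
  return false
end

-- pre_process hook: return nil to drop the message, or msg to let it through.
local function pre_process(msg)
  if msg.text and has_banned_word(msg.text) then
    return nil
  end
  return msg
end

-- Assumed plugin table layout; a plugin file would end with `return plugin`.
local plugin = {
  description = "Drops messages that contain banned words",
  patterns = {},
  pre_process = pre_process
}
```

Messages without text (photos, service messages) pass through untouched, so only typed commands and chat text are filtered.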
I think here it would be best to rely on Google's SafeSearch, because it covers more than we can do effectively in a plugin. The bot-wide filter might work, but you'd have to cover a huge number of misspellings that get autocorrected in the search into banned words. Unfortunately, Giphy and Imgur don't offer safe searches.
I wrote this code and it is working great.
function on_msg_receive (msg)
  if not started then
    return
  end
  local receiver = get_receiver(msg)
  local blockWords = {
    "^!(.+)anal",
    "^!(.+)anus",
    "^!(.+)ass",
    "^!(.+)beastiality",
    "^!(.+)bisexual",
    "^!(.+)blowjob",
    "^!(.+)bondage",
    "^!(.+)boner",
    "^!(.+)breast",
    "^!(.+)clit",
    "^!(.+)clitoris",
    "^!(.+)cock",
    "^!(.+)cum",
    "^!(.+)cunt",
    "^!(.+)dick",
    "^!(.+)dong",
    "^!(.+)erotic",
    "^!(.+)fuck",
    "^!(.+)gay",
    "^!(.+)hardon",
    "^!(.+)hard on",
    "^!(.+)incest",
    "^!(.+)lick",
    "^!(.+)lust",
    "^!(.+)nude",
    "^!(.+)oral",
    "^!(.+)penis",
    "^!(.+)piss",
    "^!(.+)playboy",
    "^!(.+)porn",
    "^!(.+)puss",
    "^!(.+)queer",
    "^!(.+)rectum",
    "^!(.+)sex",
    "^!(.+)shit",
    "^!(.+)sleazy",
    "^!(.+)slut",
    "^!(.+)smut",
    "^!(.+)softcore",
    "^!(.+)sperm",
    "^!(.+)suck",
    "^!(.+)swingers",
    "^!(.+)tit",
    "^!(.+)tits",
    "^!(.+)virgin",
    "^!(.+)whore",
    "^!(.+)x rated",
    -- "%-" escapes the hyphen, which is otherwise a Lua pattern quantifier
    "^!(.+)x%-rated",
    "^!(.+)fellatio",
    "^!(.+)hardcore",
    "^!(.+)hooker",
    "^!(.+)hustler",
    "^!(.+)intercourse",
    "^!(.+)kama sutra",
    "^!(.+)kinky",
    "^!(.+)lesbian",
    "^!(.+)lesbo",
    "^!(.+)masturbat",
    "^!(.+)nudist",
    "^!(.+)orgasm",
    "^!(.+)panties",
    "^!(.+)penthouse",
    "^!(.+)prostitut",
    "^!(.+)xxx",
    "^!(.+)sodom",
    "^!(.+)gomorrah",
    "^!(.+)puki",
    "^!(.+)pantat",
    "^!(.+)kemaluan",
    "^!(.+)nenen",
    "^!(.+)burit",
    "^!(.+)tetek",
    "^!(.+)bogel",
    "^!(.+)bohsia",
    "^!(.+)rogol",
    "^!(.+)vagina",
    "^!(.+)semen",
    "^!(.+)hymen",
    "^!(.+)lucah",
    "^!(.+)puting",
    "^!(.+)buah dada",
    "^!(.+)sangap",
    "^!(.+)bugil",
    "^!(.+)jilbab",
    "^!(.+)jubo",
    "^!(.+)jubur",
    "^!(.+)jubor",
    "^!(.+)kelentit",
    "^!(.+)kelentik",
    "^!(.+)telanjang",
    "^!(.+)horny",
    "^!(.+)pepek",
    "^!(.+)cipap"
  }
  -- vardump(msg)
  msg = pre_process_service_msg(msg)
  if msg_valid(msg) then
    vardump(msg)
    msg = pre_process_msg(msg)
    -- Messages without text (photos, service messages) skip the filter
    if msg and msg.text then
      -- Lowercase once so the lowercase patterns also catch "!IMG ..." variants
      msg.text = string.lower(msg.text)
      for _, pattern in ipairs(blockWords) do
        if match_pattern(pattern, msg.text) then
          send_msg(receiver, "English: Please, no pornographic contents.", ok_cb, false)
          -- Stop here so no plugin ever sees the blocked command
          return
        end
      end
    end
    if msg then
      match_plugins(msg)
      mark_read(receiver, ok_cb, false)
    end
  end
end
I have tried a word blacklist, but people are not stupid. They started putting spaces between letters and using bad words in other languages. That's why I also think we should rely on Google.
@bb010g, what I meant is that only a privileged user could turn the safe search on and off. The problem I have at the moment is that every time there is an update and I pull it from git, I get conflicts, because I have edited the img_google plugin and added url = url.."&safe=active".
The other option I have is to just make a new plugin, safe_img_google, that uses safe search.
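An on/off switch for safe search might look something like this. It is a sketch under assumptions: build_image_search_url and its safe_search argument are hypothetical names, and url_escape is a small stand-in for the URL.escape call the plugin uses, included only so the example runs on its own. Who is allowed to flip the switch is left out.

```lua
-- Percent-encode a query value (stand-in for URL.escape in the plugin).
local function url_escape(s)
  return (string.gsub(s, "[^%w%-_%.~]", function(c)
    return string.format("%%%02X", string.byte(c))
  end))
end

-- `safe_search` is a hypothetical setting; anything except `false` keeps it on.
local function build_image_search_url(text, safe_search)
  local url = "http://ajax.googleapis.com/ajax/services/search/images?"
  url = url .. "v=1.0&rsz=5"
  url = url .. "&q=" .. url_escape(text)
  url = url .. "&imgsz=small|medium|large"
  if safe_search ~= false then
    url = url .. "&safe=active"
  end
  return url
end

print(build_image_search_url("cute cats"))        -- safe search on by default
print(build_image_search_url("cute cats", false)) -- explicitly turned off
```

Defaulting to on means a missing or forgotten setting still gives filtered results.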
@soend is that only for the Google query? How about !gif?
For your problem, please refer to https://github.com/yagop/telegram-bot/issues/196
That's what I was saying: it's OK to add that option to img, but here we are talking about a global word filter.
Now I have many words to add, and it doesn't seem efficient to put many lines of bad words into bot.lua. @rockneurotiko, do you have an example of how to include another file in bot.lua?
@psycholyzern, yes, that is only for the Google query. There is a rating field in the Giphy response data, but no documentation of what it means. From the readme: rating - limit results to those rated (y, g, pg, pg-13 or r).
Well, then it needs to be coded in each plugin with a different method. That can be more efficient. But I prefer manual word blocking, just because I have already coded it and it can reply with a custom message when a user's query matches a bad word. And it works globally, of course.
@psycholyzern you don't need to edit bot.lua! Just write a pre_process plugin, like stats, and return nil when you want to block the msg.
I ended up removing all the changes. It is not efficient, and it is hard to keep the list of blocked words up to date. Waiting for someone to make a plugin for this.
Seriously... just rely on Google's SafeSearch. That'd cover lots of shit.
Yeah, I'm using Google safe search like the users above recommended. Writing the code manually and keeping it updated took too much effort.
Again, using safe search is the solution for the img command, but what this issue wanted (or so I interpreted it) is to block every message with words in a blacklist.
Yeah, I am thinking of building a remote database that can be queried from the bot's server. The disadvantage is that the bot will make an HTTP request every time a new message comes in. I don't know if this will affect performance, but it is still better than storing hundreds of words to be matched locally.
A request is always much slower than a local check... Maybe the plugin can download the db at startup.
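Loading the list once at startup could be sketched like this. The downloaded string stands in for whatever the remote database would return (one word per line is an assumed format), and build_word_set and is_blocked are hypothetical names. The words become keys of a Lua table, so each check is a single table lookup however long the list grows.

```lua
-- Hypothetical response body of the remote word list, one word per line.
local downloaded = "porn\nnude\nxxx\n"

-- Build a set once at startup: each word becomes a key of the table.
local function build_word_set(body)
  local set = {}
  for word in string.gmatch(body, "[^\r\n]+") do
    set[string.lower(word)] = true
  end
  return set
end

local banned = build_word_set(downloaded)

-- Check each word of a message against the set; no network request is needed.
local function is_blocked(text)
  for word in string.gmatch(string.lower(text), "%a+") do
    if banned[word] then
      return true
    end
  end
  return false
end

print(is_blocked("!img XXX pictures"))
print(is_blocked("!img kittens"))
```

Note that this matches whole words only, unlike substring patterns, so it misses variants such as plurals unless they are listed too.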
Won't it be slower (I mean, the performance of the server) to match a single word against hundreds of other words?
The remote server will have to do the matching anyway, so the total is the matching time plus the request time. Anyway, I don't think there will be that many words; with the trick explained here you can undo some of the letter tricks. I'm not a fan of this "feature", so if someone wants to block their users, it will go slower (you can't avoid that) xD
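The linked trick is not reproduced in this thread, so the following is only a guess at the general idea: normalize the text before matching, so that spaced-out letters and look-alike characters collapse back into the plain word. The lookalikes table and the normalize function are hypothetical.

```lua
-- Hypothetical substitutions for common look-alike characters.
local lookalikes = {
  ["0"] = "o", ["1"] = "i", ["3"] = "e", ["4"] = "a",
  ["5"] = "s", ["@"] = "a", ["$"] = "s"
}

local function normalize(text)
  local lowered = string.lower(text)
  -- Swap look-alike characters; a nil result leaves the character unchanged
  local replaced = string.gsub(lowered, ".", function(c)
    return lookalikes[c]
  end)
  -- Drop everything that is not a letter, which also removes spacing tricks
  return (string.gsub(replaced, "%A", ""))
end

print(normalize("P 0 r n"))
print(normalize("s.3.x"))
```

Because this also glues separate words together, it makes accidental matches more likely, so it trades fewer misses for more false positives.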
Yeah, I'm starting to hate this idea because it will slow everything down. That's why I only use safe search for Google images and disable the boob plugin. Dirty words = yes, bad images = no. It is fair enough.
If you're happy with your solution, feel free to close this issue.
I guess that a good end for this issue will be to add an option to img_search to use the safe search or not.
Sure. Thanks to all of you for helping me. And @rockneurotiko, yes, please add that option (making it the default would be better :p). Thanks again.
I really don't like it when my users use some plugins for porn purposes.
Example:
!img naked women
!gif boob
Is there any way I could block keywords that lead to porn content? A plugin that can be turned on/off, perhaps? Or just blocking those words permanently? And it would be better if a message were shown when a user types a blocked word.