Closed ghost closed 8 years ago
I want to be able to ban unicode PM spam so very much. Not knowing what the spam contained was a problem.
@Technetium1, I think what is essential when banning Unicode in particular is the raw Unicode code point. For example, just writing if '╚' in msg: self.send_ban_msg(self.user.nick, self.user.id)
might not be enough. We want to build a small catalog of code points (if you intend to ban anyone who sends a particular Unicode symbol).
So ╚ would be written as u'\u255A'
in Python.
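A minimal sketch of what such a catalog could look like, assuming Python 3. BANNED_CODE_POINTS and contains_banned are hypothetical names, not part of pinylib, and the only entry is the ╚ example from above. The ban call itself is left out, since how it is wired into the client depends on the bot.

```python
# Hypothetical catalog of banned code points (not part of pinylib).
# U+255A is the box-drawing character used as the example in this thread.
BANNED_CODE_POINTS = {0x255A}


def contains_banned(msg, banned=BANNED_CODE_POINTS):
    """Return True if any character in msg has a code point in the catalog."""
    return any(ord(ch) in banned for ch in msg)


if __name__ == '__main__':
    print(contains_banned(u'hello \u255A'))  # message containing the banned symbol
    print(contains_banned(u'hello'))         # plain ASCII message
```

Comparing code points rather than pasted symbols keeps the catalog readable even in editors or terminals that cannot render the characters.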
@GoelBiju is there a site in particular you use for finding the python equivalents?
@Technetium1, most of the time I have consulted the Unicode section of FileFormat.info to find the Python equivalents; the HTML codes and the equivalents for other languages are listed there as well. I would try the escape in a Python IDE before implementing it.
NOTE: It's worth copying the actual Unicode symbol from the site and pasting it into the client in any room to see if it renders properly. Some of the popular Unicode characters render, while others may display a placeholder.
All in all, like @Autotonic said, it is worthwhile for @nortxort to think about rendering Unicode and writing it to the log file for reference.
@Technetium1 use http://graphemica.com/. You can copy and paste the character into the search, scroll down, and there is a "Python:" entry.
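If a site is not at hand, the same lookup can be done locally. This is a rough sketch assuming Python 3; python_escape is a hypothetical helper, while ord and unicodedata.name come from the standard library.

```python
import unicodedata


def python_escape(ch):
    """Return the Python escape sequence for a single character as text."""
    cp = ord(ch)
    # Characters up to U+FFFF use \uXXXX; anything above needs \UXXXXXXXX.
    if cp <= 0xFFFF:
        return '\\u%04x' % cp
    return '\\U%08x' % cp


if __name__ == '__main__':
    symbol = u'\u255A'  # the character pasted from the spam message
    print(python_escape(symbol))
    # The second argument is a fallback for characters that have no name.
    print(unicodedata.name(symbol, 'UNKNOWN'))
```

The printed escape can then be pasted into a ban list as is.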
Thanks @GoelBiju and @Autotonic!
This has been added now. Thanks @Autotonic
https://github.com/nortxort/pinylib/blob/master/pinylib.py#L63
fh.file_writer(path, file_name, msg.encode('ascii', 'ignore'))
Currently any Unicode is just left out. Let's log that too, aye?
fh.file_writer(path, file_name, msg.encode(encoding='UTF-8', errors='ignore'))
Confirmed working under Linux, should be fine for Windows as well.
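To illustrate the difference between the two encode calls, here is a small standalone comparison, assuming Python 3. It does not call fh.file_writer; the message text is made up for the example.

```python
# Made-up message containing two non-ASCII characters (U+255A).
msg = u'spam \u255A\u255A'

# 'ascii' with errors='ignore' silently drops every non-ASCII character.
ascii_bytes = msg.encode('ascii', 'ignore')

# UTF-8 can represent every Unicode character, so nothing is lost.
utf8_bytes = msg.encode('utf-8')

if __name__ == '__main__':
    print(ascii_bytes)
    print(utf8_bytes)
    # Decoding the UTF-8 bytes gives back the original message.
    print(utf8_bytes.decode('utf-8') == msg)
```

With the ASCII version the log only shows the text around the spam, which is why the symbols could not be identified afterwards.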