cube-drone / pierc

A python bot that logs IRC channels, and a PHP/JS interface for browsing said logs.
http://classam.github.com/pierc/

Text encoding failure #36

Open AntoineTurmel opened 9 years ago

AntoineTurmel commented 9 years ago

When someone uses XChat to post with ISO8859-1(5) and types "ç", it stores "ç" in the "message" column of the "main" table, causing "Something Has Gone Spectacularly Wrong!". If the charset is set to UTF-8, it stores "ç" for "ç" and does not cause an error...

Is it possible to handle both?
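
For anyone reading along, here is a small sketch of why the two client settings behave differently. It is written for Python 3 (the bot in this thread is Python 2), and the names `char`, `latin1_bytes`, `utf8_bytes` and `is_valid_utf8` are made up for the illustration. It only shows how the same character is encoded; it does not reproduce what pierc itself does.

```python
char = "\u00e7"  # the character "ç"

latin1_bytes = char.encode("iso-8859-1")  # a single byte: b'\xe7'
utf8_bytes = char.encode("utf-8")         # two bytes: b'\xc3\xa7'


def is_valid_utf8(raw):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The ISO-8859-1 byte on its own is not valid UTF-8, so anything that
# assumes UTF-8 downstream can fail on it.
latin1_is_utf8 = is_valid_utf8(latin1_bytes)  # False
utf8_is_utf8 = is_valid_utf8(utf8_bytes)      # True

# Reading the UTF-8 bytes as ISO-8859-1 gives two characters instead of one.
mojibake = utf8_bytes.decode("iso-8859-1")    # "Ã§"
```

So a UTF-8 front-end has a reason to choke on the single-byte form, while the two-byte form passes through.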

frdmn commented 9 years ago

If I understand you right, only the front-end renders said character wrong? But the database contains the proper "ç"?

AntoineTurmel commented 9 years ago

Yeah, the front-end shows "Something Has Gone Spectacularly Wrong!" when a "ç" is stored. When "ç" is stored in the database, the front-end shows "ç"...

frdmn commented 9 years ago

What would happen if you try to convert to UTF-8 before writing into the database:

    query = "INSERT INTO main (channel, name, time, message, type, hidden) VALUES" + \
    "(\""+unicode(self.conn.escape_string(channel), 'utf-8')+ "\"," + \
    "\""+unicode(self.conn.escape_string(name), 'utf-8')+"\"," + \
    "\""+time+"\"," + \
    "\""+unicode(self.conn.escape_string(message), 'utf-8')+"\"," + \
    "\""+self.conn.escape_string(msgtype)+"\"," + \
    "\""+self.conn.escape_string(hidden)+"\")"

Keep the UTF-8 charset of the DB, though.

Edit: perhaps stringvariable.decode('utf-8', 'ignore') is a better approach since it doesn't throw errors in case of characters that it can't convert.
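
A rough sketch of the "handle both" idea, in Python 3 rather than the Python 2 used above: try UTF-8 first and fall back to ISO-8859-1 when the bytes are not valid UTF-8. The function name `decode_irc_bytes` and the `fallback` parameter are hypothetical and not part of pierc. The last lines show what the `'ignore'` variant does with the same input, for comparison.

```python
def decode_irc_bytes(raw, fallback="iso-8859-1"):
    """Decode a raw message, preferring UTF-8 (hypothetical helper)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # With the default fallback every byte value maps to a character,
        # so this second attempt cannot raise.
        return raw.decode(fallback)


utf8_message = b"fran\xc3\xa7ais"   # sent by a UTF-8 client
latin1_message = b"fran\xe7ais"     # sent by an ISO-8859-1 client

from_utf8 = decode_irc_bytes(utf8_message)      # "français"
from_latin1 = decode_irc_bytes(latin1_message)  # "français"

# 'ignore' avoids the exception but silently drops the undecodable byte.
ignored = latin1_message.decode("utf-8", "ignore")  # "franais"
```

The fallback is a guess: bytes that happen to be valid UTF-8 are always treated as UTF-8, and ISO-8859-15 input would need `fallback="iso-8859-15"`. The decoded text can then be encoded as UTF-8 before it goes into the database.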

AntoineTurmel commented 9 years ago

I'll try, thanks!

AntoineTurmel commented 9 years ago

OK, now my text is stored correctly. I still had the error, but resolved it by adding this in pierc_db.php: mysqli_set_charset($this->_conn, "utf8");

frdmn commented 9 years ago

Great. You can create a PR if you like. I think others might profit from those changes as well :)