jaraco / irc

Full-featured IRC library for Python.
MIT License

incoming lines are not filtered for malformed unicode #56

Closed jaraco closed 8 years ago

jaraco commented 8 years ago
#!text

UnicodeEncodeError: 'charmap' codec can't encode characters in position 106-112: character maps to <undefined>
Call stack:
  File "****.py", line 112, in <module>
    main()
  File "****.py", line 108, in main
    irc_bot.start()
  File "C:\Python34\lib\site-packages\irc\bot.py", line 265, in start
    super(SingleServerIRCBot, self).start()
  File "C:\Python34\lib\site-packages\irc\client.py", line 1274, in start
    self.reactor.process_forever()
  File "C:\Python34\lib\site-packages\irc\client.py", line 276, in process_forever
    self.process_once(timeout)
  File "C:\Python34\lib\site-packages\irc\client.py", line 257, in process_once
    self.process_data(i)
  File "C:\Python34\lib\site-packages\irc\client.py", line 214, in process_data
    c.process_data()
  File "C:\Python34\lib\site-packages\irc\client.py", line 581, in process_data
    log.debug("FROM SERVER: %s", line)
Message: 'FROM SERVER: %s'
Arguments: (':****!PircBot@dev-742952AB.cpe.pppoe.ca PRIVMSG #**** :\x91\x91\x91\x91\x91\x91\x91',)
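
For readers trying to reproduce this: the traceback reports a UnicodeEncodeError raised while `log.debug` writes the line out, which suggests the failure happens when the already-decoded text is encoded for the console. The following is a minimal sketch of that situation, assuming the console uses a legacy code page such as cp1252 (the traceback only says 'charmap', so the exact code page is a guess) and using a made-up stand-in for the server line.

```python
# Hypothetical stand-in for the logged line; the real one is redacted above.
line = "PRIVMSG #chan :" + "\x91" * 7

# Assumption: the console encoding is cp1252 (the traceback only says 'charmap').
error = None
try:
    line.encode("cp1252")
except UnicodeEncodeError as exc:
    # U+0091 has no mapping in cp1252, so strict encoding fails here.
    error = exc

# One possible workaround: escape unencodable characters instead of failing.
safe = line.encode("cp1252", errors="backslashreplace").decode("cp1252")
print(safe)
```

Escaping keeps the log readable without dropping data, but whether that is appropriate depends on how the logging handler is configured.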

Not sure what the proper fix is; personally, I'd probably throw in a

#!python

line = line.decode("utf-8", "ignore")

or something to that effect near line 580 of client.py.
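
To illustrate what the suggested `decode("utf-8", "ignore")` call would do, here is a small sketch comparing the standard error handlers on a byte string containing `\x91`. The sample line is invented for illustration, and it assumes `line` is still `bytes` at that point, which may not hold in the library.

```python
# Hypothetical raw line; assumes the data is still bytes at this point.
raw = b":nick!user@host PRIVMSG #chan :\x91\x91 hello"

# Strict decoding (the default) rejects 0x91, which is not valid UTF-8 on its own.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "ignore" silently drops the offending bytes.
ignored = raw.decode("utf-8", "ignore")

# "replace" substitutes U+FFFD for each offending byte, keeping a visible marker.
replaced = raw.decode("utf-8", "replace")

print(repr(ignored))
print(repr(replaced))
```

Note that "ignore" loses information without any trace, while "replace" at least shows that something was malformed.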


jaraco commented 8 years ago

Never mind, I failed to see how the buffer stuff worked.


Original comment by: thealok

jaraco commented 8 years ago

No worries. A lot of users hit this issue. I'm coming to the idea that maybe the Lenient decoding buffer should be used instead. Or maybe one that tries different encodings. Unfortunately, there's no one-size-fits-all solution, so I'm inclined to have the default behavior be to fail.


Original comment by: Jason R. Coombs
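
As a rough sketch of the "tries different encodings" idea mentioned above: the function below attempts each candidate encoding in order and falls back to latin-1, which accepts any byte sequence. The name `decode_lenient` and the candidate list are assumptions made for illustration. This is not the library's buffer implementation.

```python
def decode_lenient(raw, encodings=("utf-8", "cp1252")):
    """Decode bytes by trying each encoding in turn (hypothetical helper)."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a code point, so this never raises.
    return raw.decode("latin-1")


utf8_result = decode_lenient("héllo".encode("utf-8"))
cp1252_result = decode_lenient(b"\x91hi\x92")
fallback_result = decode_lenient(b"\x81")
print(utf8_result, cp1252_result, repr(fallback_result))
```

The trade-off is the one the comment describes: a fallback chain never fails, but it can silently guess the wrong encoding, so there is a case for failing by default.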