pombreda / feedparser

Automatically exported from code.google.com/p/feedparser

some warnings set bozo "bit" #229

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Parse a feed such as http://www.groklaw.net/backend/GrokLaw.rdf

What is the expected output? What do you see instead?

The exception is: document declared as us-ascii, but parsed as iso-8859-1

However, this "error" should only be a warning (bozo=2??), since the feed is 
completely parsed and accessible.

Original issue reported on code.google.com by chriscam...@googlemail.com on 26 Sep 2010 at 11:02

GoogleCodeExporter commented 9 years ago
Please close this bug as invalid.

bozo and bozo_exception exist to help developers debug when something goes 
wrong. Nothing went wrong in this specific case, so the information can be 
ignored.

Original comment by kurtmckee on 4 Dec 2010 at 4:18

GoogleCodeExporter commented 9 years ago

Original comment by adewale on 4 Dec 2010 at 11:04

GoogleCodeExporter commented 9 years ago
Is there then no way to detect this case?

Original comment by chriscam...@googlemail.com on 6 Dec 2010 at 10:27

GoogleCodeExporter commented 9 years ago
@chris: Actually, there is! All of the feedparser-specific errors can be 
detected based on their class:

if isinstance(result.bozo_exception, feedparser.CharacterEncodingOverride):
    # do something here
    pass

Or, if you want to ignore almost all of the feedparser-specific errors, nearly 
all of them are subclasses of feedparser.ThingsNobodyCaresAboutButMe:

if not isinstance(result.bozo_exception, feedparser.ThingsNobodyCaresAboutButMe):
    # this is a bigger deal, like maybe a SAXParserException
    pass

Again, developers need this information when something goes really wrong, for 
example because the server's HTTP headers don't agree with the XML declaration, 
or because the XML declaration doesn't agree with the feed's byte order mark. 
In this specific case (ASCII vs ISO 8859-1), it's probably no big deal.
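
For anyone who wants the "warning vs. error" distinction asked about earlier in this thread, here is a minimal sketch of one way to layer it on top of the isinstance checks. The helper name classify_bozo and the two stand-in exception classes are hypothetical and not part of feedparser; the stand-ins only exist so the snippet runs without feedparser installed. It assumes the parse result behaves like a mapping with optional "bozo" and "bozo_exception" keys.

```python
def classify_bozo(result, benign_types):
    """Return "ok", "warning", or "error" for a feedparser-style result.

    result       -- any mapping with optional "bozo" / "bozo_exception" keys
    benign_types -- tuple of exception classes to treat as mere warnings
    """
    if not result.get("bozo"):
        # The bozo bit is unset (or missing): nothing was flagged.
        return "ok"
    exc = result.get("bozo_exception")
    if isinstance(exc, benign_types):
        # Flagged, but only with an exception class we consider harmless.
        return "warning"
    # Anything else (including a missing exception) is treated as serious.
    return "error"


# Hypothetical stand-ins for the feedparser exception classes, so that
# this sketch is runnable on its own.
class StandInBenign(Exception):
    pass


class StandInEncodingOverride(StandInBenign):
    pass


demo = {
    "bozo": 1,
    "bozo_exception": StandInEncodingOverride(
        "document declared as us-ascii, but parsed as iso-8859-1"
    ),
}
print(classify_bozo(demo, (StandInBenign,)))  # prints "warning"
```

With feedparser itself, the same helper would presumably be called as classify_bozo(result, (feedparser.ThingsNobodyCaresAboutButMe,)), using the class named above in place of the stand-in.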

I hope that helps!

Original comment by kurtmckee on 6 Dec 2010 at 5:31