libo26 / feedparser

Automatically exported from code.google.com/p/feedparser

Improve performance of character set detection #235

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
Although this is not strictly a feedparser issue, feedparser relies on the 
universal detector to detect the character set when the input does not decode 
correctly on the first attempt.

The attached patch limits detection to the first 200K of the feed.  This 
greatly improves performance when someone hands you a feed that contains every 
post they've ever made (or, worse yet, gives you a URL to a movie).
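
The attached patch itself is not reproduced in this thread, so the following is only a minimal sketch of the idea, not the patch. It assumes that "200K" means 200 * 1024 bytes and that the detector is a callable shaped like chardet.detect (bytes in, a dict with an "encoding" key out). The names detect_charset and MAX_DETECT_BYTES are made up for illustration.

```python
# Assumption: "200K" in the description is read as 200 * 1024 bytes.
MAX_DETECT_BYTES = 200 * 1024


def detect_charset(data, detect=None, limit=MAX_DETECT_BYTES):
    """Guess the encoding of `data` from its first `limit` bytes only.

    `detect` is any callable that takes bytes and returns a dict with an
    "encoding" key, which is the shape chardet.detect returns. When it is
    omitted, chardet.detect is used.
    """
    if detect is None:
        # Third-party dependency, imported lazily so callers can plug in
        # their own detector without having chardet installed.
        import chardet
        detect = chardet.detect

    # Slicing caps the work the detector does, however large the input is.
    sample = data[:limit]
    return detect(sample).get("encoding")
```

With chardet installed, detect_charset(raw_bytes) would be enough. The cut may land in the middle of a multi-byte character, so the guess for a truncated sample is not guaranteed to match the guess for the full document.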

Original issue reported on code.google.com by EpsilonP...@hotmail.com on 3 Dec 2010 at 10:07

GoogleCodeExporter commented 9 years ago
I'm marking this patch as invalid since it looks like it belongs in the Chardet 
project: http://code.google.com/p/chardet/

Can you send it there instead?
Thanks

Original comment by adewale on 4 Dec 2010 at 10:27

GoogleCodeExporter commented 9 years ago
In fact, I suspect it's related to 
http://code.google.com/p/chardet/issues/detail?id=1, which I already filed there.

Original comment by adewale on 4 Dec 2010 at 10:45