masperro / httplib2

Automatically exported from code.google.com/p/httplib2

email.message_from_string slows down cache retrieval for large documents #31

Closed: GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
When a large document is retrieved from the cache, a lot of time is spent
parsing the cached response, to the point where the request is a lot faster
if the document is NOT in the cache.

I *think* this is unnecessary because, as far as I can see, the parse is only
needed to get the headers. This means that email.message_from_string should
probably be called with only the header portion of the cached response.
(Maybe this also makes the change from revision 266 unnecessary?)

The following change seems to work:

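    if cached_value:
        try:
            # Split headers from body so that only the headers get parsed.
            info, content = cached_value.split('\r\n\r\n', 1)
            info = email.message_from_string(info)
        except ValueError:
            # A missing separator makes the unpacking raise ValueError
            # (not IndexError); treat the cache entry as corrupt.
            self.cache.delete(cachekey)
            cachekey = None
            cached_value = None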
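
For anyone who wants to try the idea outside of httplib2, here is a minimal standalone sketch. The helper name `parse_cached_response` and the sample cached value are made up for illustration and are not part of httplib2. It assumes the cached value is a `str` with headers and body separated by a blank CRLF line. Note that `str.split` with a maxsplit of 1 returns a single-element list when the separator is missing, so the tuple unpacking raises `ValueError` in that case.

```python
import email


def parse_cached_response(cached_value):
    """Hypothetical helper: parse only the header block of a cached response.

    Returns (info, content), or None if the header/body separator is missing.
    """
    try:
        raw_headers, content = cached_value.split('\r\n\r\n', 1)
    except ValueError:
        # No blank line between headers and body: entry is unusable.
        return None
    # Only the (small) header block goes through the email parser.
    info = email.message_from_string(raw_headers)
    return info, content


# Made-up cached entry with a large body.
cached = "status: 200\r\ncontent-type: text/html\r\n\r\n" + "x" * 1000000
info, content = parse_cached_response(cached)

# Parsing the whole entry should yield the same headers, just more slowly.
full = email.message_from_string(cached)
```

In this sketch `info` and `full` carry the same header items, which is the point of the proposal: the body never needs to pass through the parser.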

Original issue reported on code.google.com by steven.v...@gmail.com on 23 Jul 2008 at 12:36

GoogleCodeExporter commented 8 years ago
Now doing the split before passing just the headers to email.FeedParser.

Original comment by joe.gregorio@gmail.com on 6 Sep 2008 at 5:02
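
A rough sketch of what "split first, then feed only the headers to the parser" could look like. This is not the actual httplib2 code; the function name `parse_headers` is hypothetical, and the import path assumes Python 3, where the class lives at `email.feedparser.FeedParser`.

```python
from email.feedparser import FeedParser


def parse_headers(cached_value):
    """Hypothetical helper: split first, then feed only headers to FeedParser."""
    # Raises ValueError if the blank-line separator is missing.
    raw_headers, content = cached_value.split('\r\n\r\n', 1)
    parser = FeedParser()
    parser.feed(raw_headers)   # the body is never handed to the parser
    info = parser.close()      # returns an email.message.Message
    return info, content


info, content = parse_headers('etag: "abc"\r\ncontent-length: 4\r\n\r\nbody')
```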