ericholscher / django-crawler

A crawler using the Django Test Client
http://django-crawler.readthedocs.org

New plugin Saver #1

Open satels opened 13 years ago

satels commented 13 years ago

To create a static version of the site, I propose adding a plugin:

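# coding: utf8

import os

from crawler.plugins import Plugin


class Saver(Plugin):
    """Save every successful response to disk to build a static copy of the site."""

    def post_request(self, sender, response, url=None, **kwargs):
        if response.status_code != 200 or not url:
            return
        is_html = response['Content-Type'].startswith('text/html')
        # Map the URL onto the output directory; os.path.join supplies the
        # separators, so a URL without a trailing slash still works.
        path = os.path.join(self.output_dir, url.lstrip('/'))
        if is_html:
            # Pages are stored as <path>/index.html.
            dirname = path
            basename = 'index.html'
        else:
            dirname, basename = os.path.split(path)
        # Create the target directory unless it already exists.
        if not os.path.isdir(dirname):
            os.makedirs(dirname)
        # Write in binary mode, since response.content is a byte string.
        with open(os.path.join(dirname, basename), 'wb') as f:
            f.write(response.content)


PLUGIN = Saver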
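
To make the intended URL-to-file mapping easier to check, here is a minimal standalone sketch of it. The helper name `target_path` and the `static` output directory are made up for illustration and are not part of django-crawler; it uses `posixpath` only so the printed paths look the same on every platform, whereas the plugin itself works with `os.path`.

```python
import posixpath


def target_path(output_dir, url, content_type):
    """Return the file path a response for `url` would be saved to.

    Hypothetical helper for illustration, not part of django-crawler.
    """
    # Drop the leading slash so the URL is treated as relative to output_dir.
    path = posixpath.join(output_dir, url.lstrip('/'))
    if content_type.startswith('text/html'):
        # HTML pages are stored as <path>/index.html.
        return posixpath.join(path, 'index.html')
    # Anything else keeps its own file name.
    return path


print(target_path('static', '/about/', 'text/html; charset=utf-8'))
print(target_path('static', '/about', 'text/html'))
print(target_path('static', '/media/site.css', 'text/css'))
```

Under these assumptions a page URL ends up in its own directory as `index.html`, with or without a trailing slash, while other resources such as stylesheets keep their original file name.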

Thanks!